COMP 420: Project suggestions and inspirations

Spring 2005

Introduction

The following is a sort of "menu" of possible semester projects for COMP420. Naturally, these suggestions are somewhat biased by the interests and research of the teaching staff. You are not restricted to this set of topics, but, of course, more help will be available if you choose a topic of local interest. Please view the suggested topics as rough concepts, not as finalized proposals. You should try to develop your own topic to the extent possible.

Other sources of inspiration can be found in the project suggestions for recent iterations of this course. (You will discover significant overlap with this list.) COMP520 Fall 2004; COMP520 Fall 2003; COMP413 2002/2003.

Given the size of the class, projects should be done in groups of two students (or, if the project merits, a single group of four). A title for your project and the composition of your group is due by March 31. Your project proposal is due by April 12 and the project itself, including written report, is due at the end of the semester.

Remember: The purpose of this project is not to be an enormous programming assignment; it is foremost an opportunity to define and pursue a research project. Ask an interesting question, and form an interesting hypothesis related to the content of this course. Develop the minimal amount of framework to validate or disprove your hypothesis.


Outline of suggested projects

  1. P2POG
  2. Scalable Software Update
  3. Meta-DHT and DHT Range Queries
  4. Version Control
  5. Expo: Shared Whiteboard
  6. Content-Based Publish & Subscribe
  7. Peer-to-Peer Email: ePOST Improvements
  8. Scribe Security
  9. Something Completely Different

Project details

P2POG

Large-scale multiplayer online games have moved from primitive (MUDs) to popular (modern MMOGs). However, as MMOGs gain popularity, the difficulty of providing game service increases. [Recently, the eagerly-awaited World of Warcraft MMOG has been plagued with server instability and even total downtime (graph), forcing refunds to players and very bad press.]

Clearly, these games will become even more popular in the future, and yet the centralized servers which operate them are fundamentally difficult to scale without unfortunate compromises on gameplay (e.g., partitioned "worlds" preventing friends from playing together).

Develop a scalable multiplayer game which shares game state and simulation among peers. It's certainly out of scope of this course to develop a MMOG in the World of Warcraft sense, but a text-based adventure game between peers seems reasonable. Players and objects should be able to integrate and cause meaningful effects; players should be able to communicate with one another by proximity (i.e. "same room") as well as arbitrarily. Players may be offline for periods of time, and the game should (a) tolerate this and (b) maintain state between connections of a given player. Game/object state must be carefully maintained to avoid inconsistency between peers. In-game programmability is always nice, but not required.

There are many approaches to dividing/distributing the game state; implement a design you like and defend it in your report.

Note that the problem of cheating in a p2p game environment is an open one; pay careful attention to the ability of arbitrary clients to corrupt the game state, and even if you don't address this in your implementation, you should discuss this topic in your report.

[cites: "P2P support for MM Games", Knutsson et al., INFOCOM'04; all papers from ACM Network and System Support for Games, 2002; Rooney etc al., on federated game architectures, IEEE Communications 5/04]

Scalable Software Update

In peer-to-peer distributed systems, data is replicated among several nodes to achieve high availability. Ideally, these nodes will be dissimilar in location, software configuration, and administrative control, to reduce the risk of correlated failure impacting all replicas of a given object.

However, the homogeneity of the hosts connected to the Internet allows for extremely massive correlated failures. For example, Windows viruses and worms can take the vast majority of nodes offline. One approach to dealing with this problem is to "fight fire with fire": massive redundancy and aggregation [cite:Glacier] to combat massive correlated failure.

This problem might, however, be substantially decreased if software patches and bug fixes were more immediately and efficiently distributed to vulnerable hosts. Current software update mechanisms, such as Windows Update, suffer distribution delays (because hosts do not query the update server very frequently) and availability issues (due to overload of central update servers).

Develop a distributed software-update mechanism. End hosts should be made aware of software updates within minutes, not days or weeks; software should be distributed efficiently to end hosts, with peers sharing the workload.

Note that since software updates tend to be very large files, storing them in a DHT will likely not be sufficient; look at SplitStream and BitTorrent for ideas about how to more efficiently share large files fairly among many peers.

[cite recent efforts to distribute Windows XP SP2 via BitTorrent, and related legal issues]

Your solution should take into account some of the security implications of peer-to-peer software updates; in particular, hosts must not install, or even perpetuate as authentic, forged or invalid software updates.

[NB: A "flash crowd" capable version of PAST, with chunked content (possibly erasure coded?) cached along the routes to the nodeId closest to the fileId of each chunk, might be a good project on its own...]

Meta-DHT

Distributed hash tables provide a scalable way to store and retrieve data associated with a single key; for instance, the PAST DHT offers the API:

put(fileId, object, credentials)
get(fileId) → object

Extend this API to provide efficient association of arbitrary attributes with objects in PAST.

query(attrName, attrValue) → set<fileId>
setAttr(fileId, attrName, attrValue, credentials)
getAttr(fileId, attrName) → Object
removeAttr(fileId, attrName, credentials)
listAttrs(fileId) → set<attrName>

Build a sample application of Meta-PAST. Example: a keyword-based picture-sharing application in the spirit of Flickr, in which users may store image files and also assign arbitrary "tags" to help organize these pictures. Tag queries may be restricted to a given user's photos, or may be applied to the global photo set.

Other example apps:

Related project: DHT range queries. (A possible approach is to embed a B-Tree in the DHT.)

[cites: Complex Queries in DHTs [UCB], IPTPS'02; Somo, IPTPS'03; others?]

Version Control

Design, implement and evaluate a version control system on top of a self-organizing, decentralized storage system, i.e., PAST/POST and Glacier. Conflict resolution is a key part of such a system; for example, you may wish to treat simultaneous edits and network partitions as implicit branching events.

Some VCS links of note: Subversion (classic client/server system, useful as a featureset example); GNU arch; darcs (an interesting approach to dealing with causality and patch theory); monotone (some ideas, such as versions identified purely by hash, with natural mappings to DHTs).

Expo: Shared Whiteboard

Design, implement and evaluate a shared whiteboard facility on top a self-organizing, decentralized storage system, i.e., PAST/POST and Glacier.

Content-based pub/sub

Design, implement and evaluate a content-based publish/subscribe facility on top of a self-organizing, decentralized even notification system like Scribe.

Improvements to ePOST

Background: ePOST, a current project of the FreePastry group at Rice (and originally a semester project for a previous incarnation of this course), is a sophisticated peer-to-peer email architecture. Messages between ePOST users are encrypted and secure; node failure is tolerated with replication and extreme correlated failure is recoverable through massive aggregation. ePOST includes IMAP and SMTP services, so it interoperates with existing MUAs (email clients) as well as MTAs (Internet mail handlers).

ePOST is a successful application that can be made even more successful. Suggested improvements include:

Scribe Security

Design, implement and evaluate a seucrity system for Scribe. The SCRIBE publish/subscribe system currently lacks security. The goal of this project is to design, implement, and evaluate a security model for Scribe.

Students working on this project would need some background in security, so it would be a good idea if they had taken, or are concurrently taking COMP 527.

Something Completely Different

This is a wide-open project suggestion which appeals to your creativity. Invent, design, and implement a novel peer-to-peer application. The only stipulation is that there the application is decentralized and cooperative and that there is a clear potential for the application to prove useful to more than just a few people.