
Related Work

Protocols:

  • Gopher
    • RFC:  http://tools.ietf.org/html/rfc1436
    • “The Internet Gopher protocol is designed primarily to act as a distributed document delivery system. While documents (and services) reside on many servers, Gopher client software presents users with a hierarchy of items and directories much like a file system. In fact, the Gopher interface is designed to resemble a file system since a file system is a good model for locating documents and services.”
    • FS hierarchy maintained by the server--the client requests the contents of a path, and the server replies with a menu listing. The client is not expected to recognize all types of FS entries (which can be files, directories, or other servers). The rationale behind the design is to offload as much “intelligence” as possible to the server, which was expected to be considerably more powerful than the client computer (the design predates HTTP). A minimal client sketch follows this list.
    • Servers only serve files held locally. To get files from server B while connected to server A, the client follows the link on server A that points to server B, and then fetches the files from server B directly (the FS perspective on the client *will* change).
    • Not forced to be a rooted tree hierarchy--“cyclical paths” are allowed (e.g. /foo/bar/baz could be a link to /foo)
    • Single, globally-known gopher server for a community (analogous to metadata server)
    • Caching is allowed by the client, but not required
    • A gopher system may have dedicated search servers which crawl other gopher servers and build up indexes of their own.
  • BBS
    • More than just a file-serving protocol.
    • Single user (until the 90’s)--a user would dial into a BBS computer via a telephone line to use the remote BBS system.
    • Not only files, but also games, utilities, bulletins, etc.
    • BBS servers would only display and serve files that are locally stored; no standard linking.
    • BBS servers can reference, store, and forward data available on other BBS servers via FidoNet, which is a store-and-forward protocol used primarily for e-mail (but could be used for other data).
    • With respect to file access, BBSes were popular for sharing pirated software in the 80's and early 90's
    • Closer to webservers than distributed FSs
  • NNTP/Usenet
    • Newsgroups are named hierarchically (e.g. comp.os.linux)
    • An “entry” refers to an NNTP server, which hosts available files.
    • Text-encoded binaries available in the hierarchy (e.g. via uuencode, base64, etc.)
  • FTP
    • RFC:  http://tools.ietf.org/html/rfc959
    • Serves only locally stored files; no linking to other servers
    • Streaming and chunking supported
    • Out-of-band control plane--commands travel over a separate connection from file data
    • In active mode, the server opens the data connection back to the client
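
The Gopher notes above describe the full request/response cycle: the client sends a selector string, and the server replies with a menu whose entries may live on other servers. The minimal client sketch below is only a rough illustration of RFC 1436 (not anything from actual Gopher tooling); the public host named in it is just a placeholder, and any Gopher server would do.

```python
# Minimal Gopher menu fetch (RFC 1436 sketch). Menu items of type "1"
# carry the host and port of the server that holds them, which is how a
# link from server A to server B appears to the client.
import socket

def gopher_menu(host, selector="", port=70):
    """Send a selector string and parse the tab-separated menu lines."""
    with socket.create_connection((host, port)) as sock:
        # A Gopher request is just the selector followed by CRLF.
        sock.sendall(selector.encode("ascii") + b"\r\n")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    items = []
    for line in b"".join(chunks).decode("ascii", errors="replace").split("\r\n"):
        if not line or line == ".":
            continue  # a lone "." terminates the listing
        # Each menu line: <type><display>\t<selector>\t<host>\t<port>
        item_type, fields = line[0], line[1:].split("\t")
        if len(fields) >= 4:
            items.append((item_type, fields[0], fields[1], fields[2], fields[3]))
    return items

if __name__ == "__main__":
    # gopher.floodgap.com is a long-running public server, used here only
    # as a placeholder.
    for item in gopher_menu("gopher.floodgap.com"):
        print(item)
```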



Systems:

  • CoBlitz
    • See NSDI’06 paper
    • Requests for large files are broken into byte-range requests by the download agent; these are forwarded to peers, which perform the requests, cache the results, and send the data back to the agent for re-assembly.
    • Unidirectional peer relationships are determined by a rank function (HRW, highest random weight); peer lists are culled by minimum recorded heartbeat RTT (see the HRW sketch after this list).
    • Chunk reassembly uses “sliding-window” concept similar to TCP
    • Functions as a temporary distributed data store in between origin servers and clients; serves as the back-end to Syndicate
    • Chunks identified by hash of URL?
  • FUSE
    • Filesystems in Userspace
    • A kernel driver plugs into the kernel VFS and forwards FS requests to a user-space daemon, which performs the actual FS operations (a minimal example follows this list)
  • WheelFS (from Mendeley Desktop notes)
    • CoralCDN achieves higher performance than its WheelFS-based counterpart, since WheelFS is built from more general-purpose components whereas CoralCDN is specialized
    • The mail implementation has higher latency and serves fewer requests than a static mail system
  • NFS
    • See Wikipedia page?
  • SFS (from Mendeley Desktop notes)
    • distribute signed r/o database of i-nodes and blocks
    • i-node has handles to each file block
    • client-side crypto, relieving servers and replicas
    • replica servers are not trusted
    • directory blocks have hashes of file blocks--the FS employs a hash tree within the directory tree to verify the integrity of files (see the sketch after this list)
    • Rabin/Williams? public-key encryption for fast decrypt performance
    • database updates are recorded by clients--this way, a client can be challenged for the hash of a previous root directory i-node.
    • Clients are resistant to rollback attacks, since FS info given from the root has a timestamp and timeout to which a client can synchronize itself (and reject rolled-back info)
    • Opaque directories supported
    • Database rebuilt on each i-node change
    • Clients securely determine whether or not a file handle exists, even in the event of a malicious server sending "handle not found" errors
    • Clients pull only changed bits of the database
    • clever use of caching and block signing improve performance
    • use-case: X.509 certificate FS
    • use-case: trusted software distribution
  • Shark (from Mendeley Desktop notes)
    • read-mostly FS with central server managing each file
    • nodes read data and act as proxies--they'll cache file chunks and share with authenticated peers
    • peers aren't trusted--peers must authenticate
    • file leases for write support (similar to AFS)
    • chunks large files using Rabin fingerprinting
    • uses Coral as a back-end for locating peers (DSHT)
      • see Coral paper for peer group ideas
      • Coral used to find clients caching a replica, not the data
    • tested on EmuLab and PlanetLab
    • much faster than SFS
  • WebFS (http://www.cs.washington.edu/homes/tom/pubs/webfs.html)
    • hierarchy is based on URLs
    • Directory entries are created on reference--they do not exist in advance, unlike Syndicate
    • Write support
    • Per-file caching protocols
      • append-only (scalable, but no guarantee of ordering)
      • last writer wins (admittedly not very scalable)
        • Useful since simultaneous write sharing is rare in practice (according to “Mirjana Spasojevic and M. Satyanarayanan. A Usage Profile and Evaluation of a Wide-Area Distributed File System. In Proceedings of the USENIX Winter Technical Conference, 1994.”)
      • multicast update
    • Sample apps: internet chat, stock ticker
    • Does not appear to have been completed--no experimental or performance data are given.
    • Part of WebOS (appears defunct: http://www.cs.duke.edu/ari/issg/webos/)
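
The CoBlitz entry above mentions an HRW (highest-random-weight, i.e. rendezvous hashing) rank function for deciding which peer handles which chunk. The sketch below shows that selection rule only; the peer names, the SHA-1 scoring, and the URL-plus-byte-range chunk key are illustrative assumptions, not CoBlitz's actual scheme.

```python
# Rendezvous (HRW) peer selection: every node can compute, with no
# coordination, which peer "owns" a given chunk by ranking peers per chunk.
import hashlib

def hrw_score(peer: str, chunk_key: str) -> int:
    """Score a (peer, chunk) pair; the highest score wins the chunk."""
    digest = hashlib.sha1(f"{peer}|{chunk_key}".encode()).digest()
    return int.from_bytes(digest[:8], "big")

def pick_peer(peers, chunk_key):
    return max(peers, key=lambda p: hrw_score(p, chunk_key))

if __name__ == "__main__":
    peers = ["peer-a.example", "peer-b.example", "peer-c.example"]  # hypothetical peers
    url = "http://origin.example/big.iso"                           # hypothetical URL
    chunk = 1024 * 1024
    # Split a large request into byte ranges and route each range to the
    # peer that ranks highest for it (the download-agent role above).
    for offset in range(0, 4 * chunk, chunk):
        key = f"{url}#{offset}-{offset + chunk - 1}"
        print(key, "->", pick_peer(peers, key))
```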

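The FUSE entry above describes the kernel/user-space split. Below is a minimal read-only filesystem written against the third-party fusepy bindings, as a sketch of that split: the kernel VFS forwards each getattr/readdir/read call to this user-space process. The file name and contents are invented for the example.

```python
# Minimal read-only FUSE filesystem (sketch, using fusepy: pip install fusepy).
import errno
import stat
import sys

from fuse import FUSE, FuseOSError, Operations

HELLO = b"hello from user space\n"

class HelloFS(Operations):
    def getattr(self, path, fh=None):
        if path == "/":
            return dict(st_mode=stat.S_IFDIR | 0o755, st_nlink=2)
        if path == "/hello.txt":
            return dict(st_mode=stat.S_IFREG | 0o444, st_nlink=1,
                        st_size=len(HELLO))
        raise FuseOSError(errno.ENOENT)  # anything else does not exist

    def readdir(self, path, fh):
        return [".", "..", "hello.txt"]

    def read(self, path, size, offset, fh):
        return HELLO[offset:offset + size]

if __name__ == "__main__":
    # Usage: python hellofs.py <mountpoint>
    FUSE(HelloFS(), sys.argv[1], foreground=True, ro=True)
```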

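The SFS entry above notes a hash tree embedded in the directory tree. The toy sketch below shows just the verification step: a client holding only the (signed) root hash can check a file fetched from an untrusted replica. The layout used here (a file hashes to the hash of its concatenated block hashes; a directory hashes its sorted name/child-hash pairs) is an illustrative assumption, not SFS's actual i-node format.

```python
# Hash-tree verification sketch: trust flows from one signed root hash
# down to individual file blocks.
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def file_hash(blocks):
    # Stand-in for an i-node that holds a handle (hash) per file block.
    return h(b"".join(h(b) for b in blocks))

def dir_hash(entries):
    # Stand-in for a directory block that embeds each child's hash.
    return h(b"".join(name.encode() + child
                      for name, child in sorted(entries.items())))

# "Server" side: build the tree and publish the root hash (SFS signs it).
readme_blocks = [b"block 0 of README", b"block 1 of README"]
root = dir_hash({"README": file_hash(readme_blocks)})

# "Client" side: re-hash blocks fetched from an untrusted replica and
# compare against the trusted root before accepting them.
fetched_blocks = [b"block 0 of README", b"block 1 of README"]
assert dir_hash({"README": file_hash(fetched_blocks)}) == root
print("README verified against the signed root hash")
```
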
Misc:

  • httpfs
    • FUSE module (http://httpfs.sourceforge.net/)
    • Read-only FS-like access to files on an HTTP/1.1 server (see the Range-request sketch at the end of this list)
    • “glorified downloader”
    • no caching
    • no hierarchy? only single files?
  • Performance and Extension of User Based Filesystems
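
The httpfs entry above amounts to mapping filesystem reads onto HTTP/1.1 byte-range requests. The sketch below shows that mapping with the Python standard library; the URL is a placeholder, and the technique only works if the server honors Range requests.

```python
# read(offset, size) over HTTP/1.1 byte ranges (the core of an
# httpfs-style read-only filesystem).
import urllib.request

def http_read(url: str, offset: int, size: int) -> bytes:
    req = urllib.request.Request(
        url, headers={"Range": f"bytes={offset}-{offset + size - 1}"})
    with urllib.request.urlopen(req) as resp:
        # 206 Partial Content means the range was honored; a 200 reply
        # would mean the server sent the whole file instead.
        if resp.status != 206:
            raise IOError("server does not support byte-range requests")
        return resp.read()

# e.g. read 16 bytes at offset 1024 of a remote file (placeholder URL):
# data = http_read("http://example.com/some/file.bin", 1024, 16)
```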