Architecture

There are three principal components to Syndicate: the client, the metadata server, and the CDN. For the purposes of this document, "server" refers to a logical server (as opposed to a physical server). We use the word "host" to distinguish the machine a Syndicate server runs on from the collection of programs that constitute the server itself. As such, it is entirely possible for a single host to run all three components of Syndicate.

CDN

Syndicate is CDN-agnostic, but we make a few assumptions about its capabilities:

  • All content replicas within the CDN are addressable by exactly one URL. There can be many cached replicas in the CDN, but Syndicate, like a web browser, should not be responsible for resolving a particular replica. Any redirection must occur within the CDN.
  • The CDN identifies a piece of content by its origin URL. More generally, there exists an injective function T that transforms the origin URL into a URL that identifies a replica within the CDN. For example, the file http://cdimage.debian.org/debian-cd/6.0.3/amd64/iso-cd/debian-6.0.3-amd64-CD-1.iso is addressable in CoBlitz via http://codeen.coblitz.org/cdimage.debian.org/debian-cd/6.0.3/amd64/iso-cd/debian-6.0.3-amd64-CD-1.iso.
  • The CDN evicts stale file data on its own. Syndicate is not required to tell the CDN to purge stale content, and it does not care about the eviction policy so long as fresh data eventually gets cached.
  • The CDN honors a "do not cache" hint by not caching files that Syndicate tags as such. Syndicate requires this to prevent the CDN from caching stale or mutable data.

These assumptions are quite reasonable in practice: CDNs like CoBlitz, CoralCDN, and Amazon CloudFront meet them, as do simple web proxies like Squid.
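
For illustration, the CoBlitz-style transformation T can be sketched as follows; the prefix constant and function name are ours, not part of Syndicate:

    from urllib.parse import urlparse

    # Illustrative CDN prefix, taken from the CoBlitz example above.
    CDN_PREFIX = "http://codeen.coblitz.org/"

    def cdn_url(origin_url):
        """Sketch of the injective transformation T: map an origin URL to the URL
        that addresses its replica within the CDN by prepending the CDN prefix."""
        parsed = urlparse(origin_url)
        return CDN_PREFIX + parsed.netloc + parsed.path

    # cdn_url("http://cdimage.debian.org/debian-cd/6.0.3/amd64/iso-cd/debian-6.0.3-amd64-CD-1.iso")
    # returns the CoBlitz URL shown above.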

Clients

The client is implemented as a FUSE filesystem that a user mounts locally. It subscribes to a metadata server, from which it downloads and constructs the filesystem hierarchy. When an application opens and reads a file whose data is stored remotely, the client pulls the requested data into the CDN and streams it back to the application via the read() call. When an application creates a file and writes to it, the data is written to local underlying storage; when the file is closed, metadata for that file is uploaded to the metadata server to add it to the filesystem. Future read() operations on this file will be directed to local storage. When an application opens a file whose data is hosted remotely and writes to it, the client downloads the file from the CDN to local storage, performs the write() operation, and informs the metadata server that the file is now hosted locally when the file is closed.
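
As a rough sketch of the read path (the local-URL convention, helper names, and the use of an HTTP Range request are assumptions for illustration, not the actual client code):

    import os
    import urllib.request

    CDN_PREFIX = "http://codeen.coblitz.org/"    # illustrative; see the CDN section above

    def read_file(meta, local_root, offset, length):
        """Sketch: serve locally-hosted data from underlying storage, and pull
        remotely-hosted data by requesting it through the CDN."""
        if meta["url"].startswith("file://"):
            # Assumed convention: locally-created or locally-modified files carry a
            # local URL, so reads come straight from underlying storage.
            with open(os.path.join(local_root, meta["path"].lstrip("/")), "rb") as f:
                f.seek(offset)
                return f.read(length)
        # Remote file: address the replica through the CDN rather than the origin host.
        origin = meta["url"].split("://", 1)[1]
        req = urllib.request.Request(
            CDN_PREFIX + origin,
            headers={"Range": "bytes=%d-%d" % (offset, offset + length - 1)})
        with urllib.request.urlopen(req) as resp:
            return resp.read()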

Creating a directory is very similar to creating a file: the client first creates the directory on underlying storage and then uploads the directory's metadata to the metadata server.

Any file or directory created on the client is preserved on underlying storage until it is removed by a local call to unlink() or rmdir(). In the event of a write conflict where a remote file replaces a local file or a remote directory replaces a local directory, the underlying local data is preserved, but the filesystem hierarchy the client presents will reflect the metadata server's hierarchy. This is because we do not want to introduce the possibility that a remote writer can destroy local data.

Periodically, the client polls the metadata server for metadata updates, which it then merges into the directory hierarchy. New files and directories discovered by the metadata server become visible to the client, and files and directories that have been removed disappear from the hierarchy (unless there are local, uncommitted changes), but not from underlying storage if they were locally created. If two clients both upload new metadata for the same file or directory, the metadata record with the latest last-modified time is committed. As such, we require the Syndicate client hosts and the metadata server to have loosely-synchronized clocks (NTPv4 daemons work well), and the metadata server discards new metadata whose timestamp is too far ahead of or behind its host's clock.
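
A minimal sketch of this merge rule, assuming each record carries an mtime field and a fixed skew allowance (the field name and the 300-second value are illustrative):

    import time

    MAX_SKEW = 300   # tolerated clock skew in seconds; the real limit is configurable

    def merge_entry(existing, incoming, now=None):
        """Sketch of last-writer-wins merging: commit the metadata record with the
        latest last-modified time, discarding records whose timestamps are too far
        ahead of or behind the server's clock."""
        now = time.time() if now is None else now
        if abs(incoming["mtime"] - now) > MAX_SKEW:
            return existing                   # implausible timestamp; discard the update
        if existing is None or incoming["mtime"] > existing["mtime"]:
            return incoming                   # latest last-modified time wins
        return existing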

The client runs an embedded HTTP server to serve file data to the CDN. Each time a file is locally modified and closed, the client generates a new URL for that file and uploads it (along with the rest of the file's metadata) to the metadata server. The URL is generated by creating, in the client's HTTP document root, a symbolic link to the file's data on underlying storage, with a version number appended to the basename. Once the metadata server successfully receives the new metadata for the file (including the new URL), the old symbolic links are removed. This process is called republishing a file. If the CDN requests a file that has been locally modified but not yet republished, the client replies with the data but adds a no-cache header to its response. This allows a remote reader that has not yet seen the new metadata for the file to get at the data without polluting the CDN with soon-to-be-stale data.
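
A rough sketch of the republishing step, assuming a basename.version naming convention for the symbolic links (the exact naming scheme and helper names are assumptions):

    import glob
    import os

    def republish(data_path, docroot, version):
        """Sketch of republishing: expose the file's data under the HTTP document
        root via a symbolic link whose basename carries a version number, and
        return the link name to be embedded in the new URL."""
        link_name = "%s.%d" % (os.path.basename(data_path), version)
        os.symlink(data_path, os.path.join(docroot, link_name))
        return link_name

    def retire_old_links(data_path, docroot, current_version):
        """Once the metadata server has acknowledged the new URL, remove the
        symbolic links for earlier versions of the file."""
        base = os.path.basename(data_path)
        for link in glob.glob(os.path.join(docroot, base + ".*")):
            if not link.endswith(".%d" % current_version):
                os.unlink(link)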

Metadata Servers

There are two parts to a metadata server: a daemon that maintains the filesystem hierarchy, and a tool that allows metadata server administrators to manipulate the filesystem metadata manually. The filesystem metadata is kept on local storage in the form of an actual directory hierarchy, where each file is a stub that contains the metadata to be served to clients. This hierarchy is called the master copy.

File metadata

Metadata servers maintain the metadata of files published in Syndicate. A file's metadata entry includes the following data:

  • the path in the hierarchy
  • the URL
  • the size (in bytes)
  • permission bits
  • Syndicate UID of the file's creator
  • Volume ID of the metadata server that hosts the metadata
  • modification time
  • (optional) SHA-1 hash of the file

At this time, a file can be either a regular file or a directory (links, device nodes, sockets, pipes, etc. are not supported). Each file has exactly one path that leads to it, and that path uniquely identifies the file.
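
The fields above map onto a record along the lines of the following sketch (field names and types are illustrative, not the wire format):

    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class MDEntry:
        """Illustrative in-memory shape of a metadata entry; not the wire format."""
        path: str                    # path in the hierarchy (uniquely identifies the file)
        url: str                     # URL at which the file's data is served
        size: int                    # size in bytes
        mode: int                    # permission bits
        owner: int                   # Syndicate UID of the file's creator
        volume: int                  # Volume ID of the metadata server hosting the metadata
        mtime: float                 # modification time
        sha1: Optional[str] = None   # optional SHA-1 hash of the file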

mdserverd: The metadata server daemon

The mdserverd daemon is responsible for receiving metadata updates from clients, serving metadata to clients, and preserving the integrity of the master copy. Periodically, mdserverd revalidates each file by querying its host for the data. If the data is no longer available, the file's entry is removed from the master copy. During revalidation, entries whose timestamps are newer than the UTC time observed by the metadata server (plus an allowance for clock skew), or whose timestamps are unknown, are also removed from the master copy.
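
In outline, revalidation might look like the sketch below; querying the host with an HTTP HEAD request and reading a Last-Modified header are assumptions about how the timestamp is obtained:

    import time
    import urllib.request
    from email.utils import parsedate_to_datetime

    MAX_SKEW = 300   # allowed clock skew in seconds

    def still_valid(entry, now=None):
        """Sketch: keep a master copy entry only if its data is still reachable and
        its timestamp is known and not newer than the server's clock plus the skew."""
        now = time.time() if now is None else now
        try:
            req = urllib.request.Request(entry["url"], method="HEAD")
            with urllib.request.urlopen(req, timeout=30) as resp:
                last_modified = resp.headers.get("Last-Modified")
        except OSError:
            return False                  # data no longer available; remove the entry
        if last_modified is None:
            return False                  # timestamp unknown; remove the entry
        mtime = parsedate_to_datetime(last_modified).timestamp()
        return mtime <= now + MAX_SKEW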

The mdserverd daemon also enforces read/write access for its clients. When a client pulls metadata from mdserverd, mdserverd only replies with metadata entries that the client is allowed to read. This is determined by the client's Syndicate UID and the permission bits of the files in the master copy. Also, when a client sends metadata updates to mdserverd, mdserverd ensures that files not owned by the client's UID cannot have their ownership or permission bits changed, and that files not write-accessible by the client's UID cannot be overwritten or modified.
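
A minimal sketch of the read-side check, assuming Unix-style permission bits with owner and other classes (group semantics are not modeled here):

    import stat

    def readable_by(entry, uid):
        """Sketch: may the client with this Syndicate UID read the entry?"""
        if uid == entry["owner"]:
            return bool(entry["mode"] & stat.S_IRUSR)
        return bool(entry["mode"] & stat.S_IROTH)

    def visible_entries(entries, uid):
        """Only reply with metadata entries the client is allowed to read."""
        return [e for e in entries if readable_by(e, uid)]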

The mdserverd daemon is not limited to maintaining metadata for clients alone. The user may specify in the mdserverd configuration file specific content URLs to be added as filesystem entries, as well as sitemaps of URLs and even local directories. The user specifies a series of per-host and default URL rewrite rules, in sed syntax, that describe how to transform URLs into paths when creating metadata for them.
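
For illustration, a sed-style rewrite rule and its effect could look like the sketch below; the rule itself is hypothetical, reusing the Debian URL from the CDN section:

    import re

    # Hypothetical rule in sed s/// syntax:  s|^http://cdimage\.debian\.org/|/debian/|
    RULE_PATTERN = r"^http://cdimage\.debian\.org/"
    RULE_REPLACEMENT = "/debian/"

    def url_to_path(url):
        """Apply the rewrite rule to turn a content URL into a filesystem path."""
        return re.sub(RULE_PATTERN, RULE_REPLACEMENT, url)

    # url_to_path("http://cdimage.debian.org/debian-cd/6.0.3/amd64/iso-cd/debian-6.0.3-amd64-CD-1.iso")
    # -> "/debian/debian-cd/6.0.3/amd64/iso-cd/debian-6.0.3-amd64-CD-1.iso"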

In the event that there are multiple metadata entries for a single path (e.g. two clients upload metadata for the same file, two URLs transform into the same path, or a client uploads metadata for a non-client-hosted URL), mdserverd keeps the metadata entry with the latest timestamp.

The mdserverd daemon additionally maintains a URL blacklist, to which a metadata entry's URL is added in the event that either the entry's timestamp can no longer be determined, or the timestamp changed but the URL did not. The reason for this is to mitigate the risk that multiple, conflicting versions of the file are inadvertently pulled into the CDN by clients. Blacklisted URLs are eventually un-blacklisted after a user-specified timeout, which should be chosen based on how long the CDN is likely to keep a particular piece of content cached.
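
In outline (the data structure and timeout handling are assumptions):

    import time

    class Blacklist:
        """Sketch of the URL blacklist: URLs remain blacklisted until a
        user-specified timeout (roughly the CDN's expected cache lifetime) elapses."""

        def __init__(self, timeout):
            self.timeout = timeout      # seconds to keep a URL blacklisted
            self.added = {}             # URL -> time at which it was blacklisted

        def add(self, url):
            self.added[url] = time.time()

        def contains(self, url):
            when = self.added.get(url)
            if when is None:
                return False
            if time.time() - when > self.timeout:
                del self.added[url]     # timeout elapsed; un-blacklist the URL
                return False
            return True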

mdtool: The metadata tool

The mdtool program gives users on the metadata server host the ability to manipulate the master copy and perform a limited number of administrative operations on a running mdserverd instance. With mdtool, a user can:

  • add or remove master copy entries
  • generate a metadata textfile from a master copy
  • create a master copy from an existing metadata textfile
  • lock or unlock directories in a master copy
  • add or remove sitemaps to crawl from a running mdserverd instance
  • blacklist/un-blacklist URLs in a running mdserverd instance

When a user wishes to lock a master copy directory, mdtool signals mdserverd to atomically snapshot the metadata represented by that directory in the master copy and write it to a lockfile in that directory (NOTE: locking a directory will NOT lock directories beneath it). Then, when a Syndicate client requests metadata from the server, the contents of the lockfile are used to represent the metadata in that directory instead of the master copy entries themselves. This way, users may manipulate the master copy without risking having it read by a client while it is in an inconsistent state.
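
A sketch of the mechanism, assuming a JSON lockfile (the lockfile name and format are assumptions):

    import json
    import os

    LOCKFILE = ".lock"   # illustrative name

    def lock_directory(mc_dir, entries):
        """Sketch: atomically snapshot the directory's metadata entries into a
        lockfile inside that directory (subdirectories are not affected)."""
        tmp = os.path.join(mc_dir, LOCKFILE + ".tmp")
        with open(tmp, "w") as f:
            json.dump(entries, f)
        os.rename(tmp, os.path.join(mc_dir, LOCKFILE))   # the snapshot appears atomically

    def serve_directory(mc_dir, read_master_copy):
        """When a client requests metadata, prefer the lockfile snapshot if present."""
        lockfile = os.path.join(mc_dir, LOCKFILE)
        if os.path.exists(lockfile):
            with open(lockfile) as f:
                return json.load(f)
        return read_master_copy(mc_dir)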
