tabled - distributed key/value lookup table service
A key storage service of Project Hail, tabled provides an infinitely scalable, lexicographically sorted key/value lookup table. Keys cannot exceed 1024 bytes; values can be any size, including several gigabytes or more.
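As a rough illustration of the key constraints above, the following Python sketch checks a key against the 1024-byte limit and lists keys in sorted order. It assumes keys are encoded as UTF-8 and compared as raw bytes; the text above only says "lexicographically sorted", so both points are assumptions. `validate_key` and `sorted_keys` are hypothetical helper names, not part of tabled.

```python
MAX_KEY_BYTES = 1024  # limit stated in the description above


def validate_key(key: str) -> bytes:
    """Return the encoded key, or raise ValueError if it is too long.

    Assumption: the limit applies to the UTF-8 encoded form of the key.
    """
    encoded = key.encode("utf-8")
    if len(encoded) > MAX_KEY_BYTES:
        raise ValueError(
            f"key is {len(encoded)} bytes; limit is {MAX_KEY_BYTES}"
        )
    return encoded


def sorted_keys(keys):
    """Order keys lexicographically by their encoded bytes (assumed ordering)."""
    return sorted(keys, key=lambda k: k.encode("utf-8"))


if __name__ == "__main__":
    # Byte-wise ordering puts uppercase before lowercase and "10" before "2".
    print(sorted_keys(["photos/2", "photos/10", "Photos/1"]))
```

Under byte-wise ordering, "photos/10" sorts before "photos/2", which matters to clients that expect numeric order when listing keys.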
The tabled user interface is an HTTP REST API, intended to be compatible with existing Amazon S3 clients.
tabled aims for compatibility with the Amazon S3 API, with some notable exceptions:
- We plan to add an APPEND operation, to atomically append data to an existing object. A byte offset is returned to the client upon successful completion.
- We do not limit object size to five gigabytes.
- We are open to supporting site-specific authentication such as Kerberos, in addition to the spec-dictated authentication scheme.
- Location support is not present. Eventually we would like to support custom location specifiers such as racks, buildings, GPS locations, and countries.
- BitTorrent support is not present. It would be nice, but it is a low priority.
- SOAP support is neither present nor planned. According to some forum posts, SOAP users represent a tiny subset of the user population. In our view, it is not a significant loss.
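The planned APPEND operation from the list above can be pictured with a small in-memory model. This is only a sketch of the intended semantics, not tabled's implementation: the `AppendableStore` class is hypothetical, and it assumes the returned byte offset is the position at which the appended data begins, which the description above does not specify.

```python
import threading


class AppendableStore:
    """Toy in-memory object store illustrating an atomic append."""

    def __init__(self):
        self._objects = {}             # key -> bytearray
        self._lock = threading.Lock()  # serializes all mutations

    def put(self, key: str, data: bytes) -> None:
        """Create or overwrite an object."""
        with self._lock:
            self._objects[key] = bytearray(data)

    def append(self, key: str, data: bytes) -> int:
        """Append data to an existing object and return its start offset.

        Raises KeyError if the object does not exist, because the operation
        is described as appending to an *existing* object.
        """
        with self._lock:
            obj = self._objects[key]
            offset = len(obj)  # assumed meaning of the returned offset
            obj.extend(data)
            return offset

    def get(self, key: str) -> bytes:
        """Return a copy of the object's current contents."""
        with self._lock:
            return bytes(self._objects[key])


if __name__ == "__main__":
    store = AppendableStore()
    store.put("log", b"abc")
    print(store.append("log", b"def"))  # offset where "def" was written
    print(store.get("log"))
```

Because the length check and the write happen under one lock, two concurrent appenders can never receive the same offset, which is the property a log-style client would rely on.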
Status: beta. Data and metadata are successfully replicated, and recovery occurs after a failure. Recovery is slower and less immediate than we would like.
Developers: browse the git repo, or check out from git://git.kernel.org/pub/scm/daemon/distsrv/tabled.git
A wealth of projects large and small awaits interested contributors. Programmers will need to learn git, participate on the hail-devel mailing list, check out the source code, and build and set up the project. Contributions from non-programmers are welcome as well -- writing documentation, giving feedback, and simply using the software.
Here are some suggestions for projects:
- document setup and system administration procedures
- scaling and testing
S3 API compliance
The following notes detail what tabled lacks for full S3 API compliance. Unless otherwise noted, as with the SOAP API, we would like to support these missing elements of the S3 API. As the list shows, tabled's API support is otherwise quite complete and usable.
Items are listed in priority order, with the most-needed items at the top.
- Range HTTP header (partial object retrieval)
- ACL support is limited. We support certain ACL grants (the canned access policies), but not the full suite.
- Server access logging
- Location constraints (US, EU, etc.)
- x-amz-request-id, x-amz-id-2 HTTP headers. Presumably we want to invent our own transaction ids, as x-tabled-XXX.
- SOAP support. No immediate intention to implement this at all. Forums seem to indicate the S3 SOAP API is only used by a tiny minority, compared to the well-known S3 REST API.
- BitTorrent support
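Partial object retrieval, the first item in the list above, depends on interpreting the HTTP Range header. The sketch below resolves a single byte range against an object's size, following the standard HTTP range forms (`bytes=first-last`, `bytes=first-`, `bytes=-suffix`). `resolve_range` is a hypothetical helper, not tabled code, and it deliberately rejects multi-range requests.

```python
def resolve_range(header: str, size: int):
    """Return (start, end) inclusive byte positions, or None if unsatisfiable.

    Handles a single range only: "bytes=0-99", "bytes=100-", "bytes=-50".
    Raises ValueError on syntax this sketch does not understand.
    """
    unit, sep, spec = header.partition("=")
    if unit.strip().lower() != "bytes" or not sep or "," in spec:
        raise ValueError(f"unsupported Range header: {header!r}")

    first, dash, last = spec.strip().partition("-")
    if not dash:
        raise ValueError(f"malformed range: {spec!r}")

    if first == "":
        # Suffix form: the final N bytes of the object.
        if not last.isdigit():
            raise ValueError(f"malformed suffix range: {spec!r}")
        n = int(last)
        if n == 0 or size == 0:
            return None
        return (max(size - n, 0), size - 1)

    if not first.isdigit() or (last and not last.isdigit()):
        raise ValueError(f"malformed range: {spec!r}")

    start = int(first)
    if last and int(last) < start:
        raise ValueError(f"last byte precedes first byte: {spec!r}")
    if start >= size:
        return None  # a server would answer 416 here
    # Clamp the end position to the last byte of the object.
    end = min(int(last), size - 1) if last else size - 1
    return (start, end)


if __name__ == "__main__":
    print(resolve_range("bytes=0-99", 1000))
    print(resolve_range("bytes=-50", 1000))
```

A server using something like this would send the bytes from `start` through `end` with a 206 status and a Content-Range header, and fall back to the full object when no Range header is present.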