September 28, 2010
SpiderOak DIY: A space efficient key/value store for arbitrarily large values. Now in beta.
Update: The SpiderOak DIY service has been discontinued and is being replaced by our new Nimbus.io storage service, a new work based on everything we learned from DIY and our previous internal storage projects. It is also open source, with a fancy new ZeroMQ-based architecture. Please visit nimbus.io for more information and to request an invite to use that service. The information below is provided for historical purposes only.
We alpha launched DIY a few months ago to allow SpiderOak customers to store data directly on the SpiderOak storage network via HTTPS. It’s similar to Amazon S3, but tuned for large backup and archival-class data, and thus much less expensive. It’s also open source, on both the server and client side.
DIY is now in beta, and we’ve been using it ourselves to implement new features for some time.
Basically, if you’re already using S3 for backup storage, switching to DIY will save you a great deal. You could also use the DIY code to run your own space-efficient, redundant storage clusters for large data.
One of the things we’re pleased with is how comprehensible the DIY implementation is. It turns out that focusing on space efficiency and high throughput (instead of low latency for each request) allows a number of design simplifications compared to other scalable storage systems.
This is a project you can easily jump into and make progress on quickly. It’s built with Python, gevent, RabbitMQ, and zfec for parity striping, along with a framework we created for quickly building small message-oriented processes.
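To give a feel for what parity striping means, here is a minimal sketch in plain Python. It is not the DIY code and it does not use zfec: zfec is a more general erasure-coding library that can tolerate the loss of several shares, while this illustration uses a single XOR parity segment and survives the loss of only one. The function names `stripe_with_parity` and `recover` are made up for this example.

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def stripe_with_parity(data: bytes, k: int) -> List[bytes]:
    """Split data into k equal-size segments (zero padded) plus one XOR parity segment."""
    seg_len = -(-len(data) // k)  # ceiling division
    padded = data.ljust(seg_len * k, b"\0")
    segments = [padded[i * seg_len:(i + 1) * seg_len] for i in range(k)]
    parity = reduce(xor_bytes, segments)
    return segments + [parity]


def recover(shares: List[Optional[bytes]], original_len: int) -> bytes:
    """Rebuild the original data when at most one share (marked None) is missing."""
    missing = [i for i, s in enumerate(shares) if s is None]
    if len(missing) > 1:
        raise ValueError("single XOR parity can only repair one lost share")
    shares = list(shares)
    if missing:
        # The XOR of all surviving shares equals the lost one.
        present = [s for s in shares if s is not None]
        shares[missing[0]] = reduce(xor_bytes, present)
    # Drop the parity segment and strip the padding.
    return b"".join(shares[:-1])[:original_len]


if __name__ == "__main__":
    value = b"hello spideroak"
    shares = stripe_with_parity(value, 4)  # 4 data segments + 1 parity
    shares[2] = None                       # pretend one storage node is gone
    print(recover(shares, len(value)))
```

The space cost here is one extra segment for every k data segments, which is where the efficiency over plain replication comes from. A real deployment would place each share on a different node.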
Feedback from users and developers is much appreciated.