We think of it as paying it forward.

With open hands we offer you open source libraries

In the process of building SpiderOak, we've created a collection of general tools that may be useful to other developers building other systems. We're happy to release them here as GPLv3 open source libraries.

Naturally, we request that users submit any patches, changes, bugfixes, comments, etc. back to us at code@spideroak.com so they can be included in the main distribution.

Valgrind + Python Integration

You can use this to see exactly which lines in your Python program are leaking memory. Standard Python itself usually doesn't leak memory, but it's helpful for debugging any C extensions you may be using, or developing.

Transactional Storage System

A Python-accessible filesystem API that supports most typical file operations (directories, file creation, reading, writing, seeking, renaming, etc.) with transactional fault tolerance. You can open the filesystem, perform any number of modifications, and then commit or roll back all changes atomically. This lets us build SpiderOak using simple, traditional file objects as containers. It uses SQLite internally for storage, and thus keeps the entire filesystem within a single on-disk file. It is safe to use from multiple processes and threads. We used to have a working FUSE plugin for this, but it hasn't been maintained. If you're interested in writing one, we'll help.
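To make the commit/rollback idea concrete, here is a minimal sketch of files stored in a single SQLite database and changed atomically. It uses only Python's standard `sqlite3` module; the class name `TxFileStore`, its methods, and the table layout are assumptions made for illustration and are not the API of the library described above.

```python
import sqlite3


class TxFileStore:
    """Hypothetical sketch: whole files stored as rows in one SQLite file."""

    def __init__(self, db_path):
        self._db = sqlite3.connect(db_path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS files ("
            "path TEXT PRIMARY KEY, data BLOB NOT NULL)"
        )
        self._db.commit()

    def write(self, path, data):
        # sqlite3 opens a transaction implicitly before this statement,
        # so the change stays pending until commit() or rollback().
        self._db.execute(
            "INSERT OR REPLACE INTO files (path, data) VALUES (?, ?)",
            (path, data),
        )

    def read(self, path):
        row = self._db.execute(
            "SELECT data FROM files WHERE path = ?", (path,)
        ).fetchone()
        if row is None:
            raise FileNotFoundError(path)
        return bytes(row[0])

    def rename(self, old_path, new_path):
        cursor = self._db.execute(
            "UPDATE files SET path = ? WHERE path = ?", (new_path, old_path)
        )
        if cursor.rowcount == 0:
            raise FileNotFoundError(old_path)

    def commit(self):
        # Make every pending modification durable in one step.
        self._db.commit()

    def rollback(self):
        # Discard every modification made since the last commit.
        self._db.rollback()
```

A caller would create a store, make several `write` and `rename` calls, and then call either `commit()` to keep all of them or `rollback()` to discard all of them. A real implementation would also need directories, seeking, and partial writes, which this sketch leaves out.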

Python Gimp Scripted Captcha Generation

We don't believe in security through obscurity, so we are making the code powering our own captchas freely available. It can work in one of two ways: as an FCGI web server embedded directly within GIMP, generating captchas on demand, or in batch mode (run from a cron job, for example), keeping a folder filled with a cache of several thousand captchas. Includes thoroughly commented source code, making it easy to customize the output to your needs.

Multi-Process proxy for Pydispatch / Louie

Extends the normal Louie module so that it can dispatch signals to other processes. Supports a variety of marshaling mechanisms, such as sockets or disk files.
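The general idea of marshaling a signal over a socket can be sketched with the standard library alone. This sketch does not use Louie or this proxy's API: the function names `send_signal` and `receive_signals`, the JSON line format, and the handler table are all assumptions chosen for illustration.

```python
import json
import socket


def send_signal(sock, signal, **named):
    """Serialize one signal as a JSON line and write it to the socket."""
    line = json.dumps({"signal": signal, "named": named}) + "\n"
    sock.sendall(line.encode("utf-8"))


def receive_signals(sock, handlers):
    """Read JSON lines until the peer closes, calling matching handlers.

    `handlers` maps a signal name to a list of callables; signals with
    no registered handler are silently dropped.
    """
    with sock.makefile("r", encoding="utf-8") as stream:
        for line in stream:
            message = json.loads(line)
            for handler in handlers.get(message["signal"], []):
                handler(**message["named"])
```

In practice the two ends of the socket would live in different processes, for example the two halves of `socket.socketpair()` shared across a fork, or a listening socket in one process and a connecting socket in another.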


StatGrabber

If you use Ganglia or other visual systems monitoring tools, you already know the benefit of seeing real-time graphs of your CPU use, memory, disk activity, network packets, and so on. StatGrabber gives you the ability to visualize the internal operation of your own software alongside all of your existing system stats. See real-time graphs of connected users, application transactions, revenue received, or any other quantitative values useful to you. It's often very helpful to see the transactions or events your own software is handling in real time, in relation to system CPU, memory, etc.

Includes Perl and Python client libraries, and a collection daemon that accumulates and emits graphable values once per minute. The client modules simply emit non-blocking UDP packets and get on with their business, so reporting does not slow their response time. You can graph four types of stats: counters (e.g. transactions, revenue), averages (e.g. size of transactions), accumulators (e.g. bandwidth used), and elapsed time (e.g. time per transaction).
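The fire-and-forget client side might look roughly like the sketch below. The class name `StatClient`, its method names, the default port, and the plain-text packet format are all assumptions for illustration; the real StatGrabber client libraries and wire format may differ.

```python
import socket


class StatClient:
    """Hypothetical sketch of a best-effort UDP stats emitter."""

    def __init__(self, host="127.0.0.1", port=9999):  # port is an assumption
        self._address = (host, port)
        self._sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        # Never block the caller: if the packet cannot be sent right now,
        # it is dropped rather than waited on.
        self._sock.setblocking(False)

    def _emit(self, kind, name, value):
        # Assumed wire format: "<kind> <name> <value>" as UTF-8 text.
        payload = f"{kind} {name} {value}".encode("utf-8")
        try:
            self._sock.sendto(payload, self._address)
        except OSError:
            pass  # stats are best-effort; losing one must not hurt the app

    def counter(self, name, amount=1):
        self._emit("counter", name, amount)

    def average(self, name, value):
        self._emit("average", name, value)

    def accumulator(self, name, value):
        self._emit("accumulator", name, value)

    def elapsed(self, name, seconds):
        self._emit("elapsed", name, seconds)
```

Because UDP needs no connection and no acknowledgement, a call such as `StatClient().counter("transactions")` returns immediately whether or not a collection daemon is listening.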


A simple implementation of a prefix tree (trie). Tries have many uses; we use this to store the user's backup selection, for example. Tries are also useful when implementing completion lists, such as the auto-suggest feature of Google, or tab-completion in your shell.
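For readers unfamiliar with the data structure, here is a minimal sketch of a prefix tree with prefix lookup, the operation behind completion lists. The class and method names are assumptions for illustration and do not reflect the interface of the library offered here.

```python
class Trie:
    """Minimal prefix tree: each node maps one symbol to a child node."""

    def __init__(self):
        self.children = {}
        self.is_key = False  # True if a stored key ends at this node

    def insert(self, key):
        node = self
        for symbol in key:
            node = node.children.setdefault(symbol, Trie())
        node.is_key = True

    def _find(self, prefix):
        # Follow the prefix one symbol at a time; None if the path breaks.
        node = self
        for symbol in prefix:
            node = node.children.get(symbol)
            if node is None:
                return None
        return node

    def __contains__(self, key):
        node = self._find(key)
        return node is not None and node.is_key

    def _walk(self, prefix):
        if self.is_key:
            yield prefix
        for symbol in sorted(self.children):
            yield from self.children[symbol]._walk(prefix + symbol)

    def with_prefix(self, prefix):
        """Yield every stored key that starts with `prefix`, in sorted order."""
        node = self._find(prefix)
        if node is not None:
            yield from node._walk(prefix)
```

Lookup cost depends on the length of the key rather than on the number of stored keys, which is what makes the structure attractive for tab-completion and for path-like data such as a backup selection.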

jQuery Treeview

A Treeview widget implemented using jQuery. This takes JavaScript objects arranged as a tree and makes HTML list items out of them, using CSS to show indentation level and hide/show children when clicked. We use this for our ShareRoom and Web Storage interface. See tree.js for advanced usage.


Like Python's zipfile module, except it works as a generator that provides the archive in many small chunks. We use it to implement recursive downloads of folders. We can generate and send a little of the archive at a time, so the browser does not time out waiting for a full zip archive to be created, and we don't have to allocate huge amounts of drive space for zip archives, since they are streamed directly to the browser as they are written. It may also be useful in an asynchronous context where lengthy blocking is undesirable.
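The same effect can be approximated with the standard library, which is a different technique from the one this library implements: modern Python's `zipfile` can write to an unseekable stream, so a small write-only buffer can be drained into a generator. The names `stream_zip` and `_DrainBuffer` are assumptions for illustration.

```python
import zipfile


class _DrainBuffer:
    """Write-only, unseekable sink that hands back whatever was written."""

    def __init__(self):
        self._parts = []

    def write(self, data):
        self._parts.append(bytes(data))
        return len(data)

    def flush(self):
        pass

    def drain(self):
        parts, self._parts = self._parts, []
        return b"".join(parts)


def stream_zip(files, chunk_size=64 * 1024):
    """Yield a zip archive piece by piece.

    `files` is an iterable of (archive_name, readable_binary_file) pairs.
    Nothing is written to disk and the full archive is never held in memory.
    """
    sink = _DrainBuffer()
    with zipfile.ZipFile(sink, "w", zipfile.ZIP_DEFLATED) as archive:
        for name, source in files:
            with archive.open(name, "w") as member:
                while True:
                    block = source.read(chunk_size)
                    if not block:
                        break
                    member.write(block)
                    data = sink.drain()
                    if data:
                        yield data
            # Closing the member writes its trailing data descriptor.
            data = sink.drain()
            if data:
                yield data
    # Closing the archive writes the central directory.
    data = sink.drain()
    if data:
        yield data
```

A web handler could iterate over `stream_zip(...)` and write each piece to the response as it arrives, so the first bytes reach the browser before the last file has been read.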