Conversations about life & privacy in the digital age

Hey from QA & how we run sync testing in SpiderOak

Hi people of the internet (and mom!).

My name is Rebecca and I am a quality assurance tester with SpiderOak. This means that I test EVERY aspect of EVERY release on EVERY operating system — catching functional and style issues before the product goes live. I report issues to the developers, who then write a patch or some other sort of tech wizardry. Then, they send me the new builds to test again – this loop repeats until we create a product we’re excited to push live!

Sometimes, testing compatibility across different operating systems can get tricky – especially with syncing. A user can sync any two folders connected to a SpiderOak account, from any operating systems we support, and with any filetype exclusion. Testing this can get confusing, and worse – boring. So we came up with an idea that is fun and very efficient.

Here’s a glance at sync testing in SpiderOak!

First, I create uniquely-themed folders on each operating system in my Virtual Machine. Each folder must contain a variety of image and text files, and at least one subfolder. Pinterest and food blogs are my favorite sites for this. For example, my Windows 8 OS has a folder named “Cupcakes,” with images of cupcakes and some recipes and cookbook reviews, whereas my Ubuntu OS has a folder of cheeses and cheese/wine pairing notes. Each OS has a distinct theme, so I instantly know what files are coming from which location, without even having to track it in the “view” tab in the SpiderOak desktop client!

Second, I test the syncing within one operating system. I create a sync name and description (RecipeShare / sharing recipes for allergies), select two folders (“Cupcakes” and “Gluten-Free Cookies”), select wildcards to exclude (*.jpg, *.gif), approve it, and start the sync. With this particular sync, only the text files should sync across – if I see cupcake pictures in my “Gluten-Free Cookies” folder, I’ll instantly know something is wrong. Also, folders that are already synced cannot be added to another sync, since that would create an endless sync loop. So if I were to try to sync “Vegan Cookies” and “Gluten-Free Cookies” after the previous sync, an error message should appear.

Third, I test the syncing of folders from different operating systems. Both operating systems need to be running and set for the same date – if one OS is set for yesterday, the sync will not complete (and you probably have bigger problems than a sync issue if you’re some sort of fancy time-traveller). I find this type of sync really useful for creatives – you can pull together inspirations and notes from your work, personal, and mobile devices, much more quickly than emailing attachments and texting reminders. I repeat the same steps as syncing within one operating system, and since each OS has a unique theme, I can instantly tell which files originated in which OS.

Finally, I repeat this on each OS to hunt down any anomalies. I also cancel syncs and then add files to one of the folders, to make sure the sync isn’t still active. If I cancel the above “RecipeShare” sync, and add a recipe for almond flour snickerdoodles to my “Gluten-Free Cookies” folder, it should no longer appear in the “Cupcakes” folder as well.

By creating special themes for each OS, I instantly remember where everything originates and ends up. Picking themes I personally enjoy and creating scenarios for why one would need folders synced in particular ways helps me understand the customer experience. This way I can also provide suggestions to make syncing more user-friendly and efficient! I, and the rest of SpiderOak, want to get you your data in the clearest and most secure way possible!

Themed syncs also allow for some silliness, so I’ll test your understanding of syncs with this:

What do you get when you combine a folder from your work computer about bathroom renovations, a folder from your home computer about Ancient Egypt, and a folder from your tablet of 90s hits?

Syncing your sinks with a sphynx and N*SYNC.

Happy Syncing!


Speeding up and running legacy test suites, part two

This is part two in a two-part series on Test Driven Development at SpiderOak.
In part one, I discussed ways to decrease the time it takes to run a test
suite. In part two, I discuss two ways to run a test suite that are painful if
the tests are slow, but greatly beneficial if performed often with fast tests.

Once we have tests that run in milliseconds rather than minutes, we’ll want
to run them as often as possible. As I work, I’m constantly saving the current
file and running the tests, as is necessary when practicing test-driven
development. Rather than switching to a command prompt after each change in
order to run the tests, I just map a key in vim to do it automatically.

Whenever I start a programming session, I open the module I’m working on and
its corresponding test module in a vertical split in vim. SpiderOak has a few
runtime dependencies, and because we don’t use the system-provided Python
interpreter on Mac, I have to source a script to set up the runtime
environment. When running commands from vim, the environment is inherited, so
by sourcing the script before running vim, things work just as they would if
you invoke them from the command line directly.

$ (. /opt/so2.7/bin/; PYTHONPATH=some_path vim -O package/ package/test/

Once I’m in vim, I map a key to run the tests, modifying the mapping for
whatever module I happen to be working on.

:map ,t :w<CR>:!python -m package.test.test_module<CR>

This binds ,t to first write the file, then run python -m
package.test.test_module. Of course, this will change depending on what
you’re working on and how you invoke your tests.

Running tests on a range of git commits

In my git workflow, I sometimes find myself staging changes piecemeal, or
rebasing, reordering, or squashing commits. These kinds of actions can lead to
commits with code in a state that hasn’t been tested. To make testing these
intermediate states easier, I have adapted a script from Gary Bernhardt to
check out each commit in a given range and run a command on the result. Here’s
my adapted version of the script:

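#!/bin/bash
# The REV_SPEC/shift lines, the "$@" call, "done" and "fi" were missing from
# this listing; they are reconstructed from the description that follows it.
set -e

REV_SPEC=$1
shift

ORIG_HEAD=$(git branch | grep '^*' | sed "s/^* //" | grep -v '^(no branch)' || true)

git rev-list --reverse "$REV_SPEC" | while read -r rev; do
    echo "Checking out: $(git log --oneline -1 "$rev")"
    git checkout -q "$rev"
    find . -name "*.pyc" -exec rm {} \;
    "$@"
done
if [ $? -eq 0 ]; then
    [ -n "$ORIG_HEAD" ] && git checkout -q "$ORIG_HEAD"
fi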

This keeps track of the current HEAD, checks out each revision in the
provided range, and then runs whatever command follows the range on the command
line. If all goes well, it will check out the original HEAD, to leave you back
where you started. If at any point the command exits with an error code, the
process will stop, so you can fix the problem.

For example, to run the command python test/ on
every commit between origin/master and the current HEAD, you would

$ ./ origin/master.. python test/

Using the tools and techniques from this post and part
one, I am able to run the SpiderOak tests quickly, after every change. This
enables me to use a TDD approach and not be slowed down by sluggish tests. With
the confidence that a comprehensive suite of tests provides, I can make
sweeping changes to parts of the SpiderOak code without worrying if I broke
something. Moreover, if I’m unsure of a solution, I can just try something and
see if it works. Because I’m not slowed down by the tests, trying an unproven
solution is rarely too large of an investment. Plus, there’s something
satisfying about making a large test suite pass in the blink of an eye.

Speeding up and running legacy test suites, part one

This is part one in a two part series on Test Driven Development at SpiderOak.
In part one, I discuss ways to decrease the time it takes to run a test suite.
In part two, I’ll discuss two ways to run a test suite that are painful if the
tests are slow, but greatly beneficial if performed often with fast tests.

As any experienced developer will likely say, the longer a test suite takes to
run, the less often it will be run. A test suite that is seldom run can be
worse than no test suite at all, as production code behavior diverges from that
of the tests, possibly leading to a test suite that lies to you about the
correctness of your code. A top priority, therefore, for any software
development team that believes testing is beneficial, should be to maintain
fast tests.

Over the years, SpiderOak has struggled with this. The reason, and I suspect
many test suites run slowly for similar reasons, is tests which claim to be
testing a “unit”, but actually end up running code from many parts of the
system. In the early days of SpiderOak we worked around some of the problem by
caching, saving/restoring state using test fixtures, etc. But a much better
approach, which we’re in the process of implementing, is to make unit tests
actually test small units rather than entire systems. During the
transition, we still have the existing heavy tests to fall back on, but for
day-to-day development, small unit tests profoundly increase productivity.

There are many techniques for keeping tests small and fast, and even more for
transitioning a legacy test suite. Each code base will ultimately require its
own tricks, but I will outline a few here that we’ve adopted at SpiderOak.


Mock objects are “stand-in” objects that replace parts of your code that are
expensive to set up or perform, such as encryption, network or disk access,
etc. Using mocks can greatly improve the running time of your tests. At
SpiderOak, we use Michael Foord’s excellent Mock library.
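
As a minimal sketch of the idea, the example below swaps two expensive steps
for mocks. It assumes Python 3, where the Mock library ships in the standard
library as unittest.mock. The names store_file, encrypt, and upload are
hypothetical and invented for illustration; they are not SpiderOak code.

```python
from unittest import mock


def store_file(data, encrypt, upload):
    """Hypothetical unit under test: encrypt the data, then upload it."""
    return upload(encrypt(data))


# Stand-ins for the expensive collaborators (assumed names, not real APIs).
fake_encrypt = mock.Mock(return_value=b"ciphertext")
fake_upload = mock.Mock(return_value=True)

result = store_file(b"plaintext", fake_encrypt, fake_upload)

# The test checks how the unit used its collaborators, without doing any
# real encryption or network access.
fake_encrypt.assert_called_once_with(b"plaintext")
fake_upload.assert_called_once_with(b"ciphertext")
assert result is True
```

Because the collaborators are passed in, the test never touches the slow code
paths at all; the mocks only record how they were called.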

One area where mocking has been particularly helpful in speeding up the legacy
tests in SpiderOak is by reducing startup time. In some cases, even if
individual tests run quickly, running the test suite can still take a long time
due to unnecessary startup costs, such as importing modules unrelated to the
code under test. To work around this, I often inject a fake module into
Python’s import system to avoid loading huge amounts of code orthogonal to what
I’m trying to test. As an example, at the top of a test module, you might see
the following:

import sys
from test.util import Bucket

# don't waste time importing the real things, since we're isolating anyway
sys.modules['foo'] = Bucket()
sys.modules['foo.bar'] = sys.modules['foo'].bar

import baz

How it works

When you import a module in Python, the interpreter first looks for it in
sys.modules. This speeds up subsequent imports of a module that has already
been imported. We can also take advantage of this fact to prevent importing of
bloated modules altogether, by sticking a lightweight fake object in there,
which will get imported instead of the real code.

In the example above, foo is a bloated module that takes a long time to load,
and baz is the module under test. baz imports foo, so without this
workaround, the test would take a long time to load as it imports foo. Since
we’re writing isolated unit tests, using Mocks to replace things in foo, we
can skip importing foo for the tests altogether, saving time.
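
The lookup order described here can be seen in isolation with a small,
self-contained sketch. The module name slow_dependency and its expensive_call
attribute are made up for the demonstration, and types.ModuleType is used as
the stand-in object only to keep the example free of other helpers.

```python
import sys
import types

# "slow_dependency" is a made-up module name; nothing with that name needs
# to exist on disk.
fake = types.ModuleType("slow_dependency")
fake.expensive_call = lambda: "stubbed"
sys.modules["slow_dependency"] = fake

# The import statement finds the entry in sys.modules and never searches
# the filesystem for real code.
import slow_dependency

assert slow_dependency is fake
result = slow_dependency.expensive_call()
```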

Bucket is a simple class that I use whenever I need an object on which I can
access an arbitrary path of attributes. This is perfect for fake package/module
structures, so I often use it for this purpose.

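from collections import defaultdict

class Bucket(defaultdict):
    def __init__(self, *args, **kw):
        super(Bucket, self).__init__(Bucket, *args, **kw)
        self.__dict__ = self

    def __getattr__(self, name):
        # Instance-dict lookups bypass defaultdict's __missing__, so route
        # attributes that are not found yet through item access instead.
        return self[name]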

This class allows you to access arbitrary attributes and get another Bucket
back. For example:

bucket = Bucket()
some_object =
assert type(some_object) == Bucket
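
A runnable sketch of this behaviour follows. It repeats the class so that it
is self-contained, and it includes a __getattr__ fallback, since plain
instance-dictionary lookups do not trigger defaultdict's default factory. The
attribute names network, client, and timeout are hypothetical.

```python
from collections import defaultdict


class Bucket(defaultdict):
    def __init__(self, *args, **kw):
        super(Bucket, self).__init__(Bucket, *args, **kw)
        self.__dict__ = self

    def __getattr__(self, name):
        # Assumption of this sketch: fall back to item access so that
        # defaultdict creates a nested Bucket for unknown attributes.
        return self[name]


bucket = Bucket()
# "network", "client" and "timeout" are arbitrary, hypothetical names.
some_object = bucket.network.client
assert type(some_object) == Bucket

# Values can be assigned anywhere along the path and read back later.
bucket.network.timeout = 30
timeout = bucket.network.timeout
```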

A caveat: since Python imports packages and modules recursively, you need to insert each
part of the dotted path into sys.modules for this to work. As you can see, I
have done this for foo.bar in the example from above.

sys.modules['foo'] = Bucket()
sys.modules['foo.bar'] = sys.modules['foo'].bar
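
To see why each part of the dotted path matters, here is a small sketch using
hypothetical names (fakepkg and fakepkg.sub) and types.ModuleType objects as
the fakes. With only the top-level entry in place, importing the submodule
fails; once the dotted entry is registered as well, it succeeds.

```python
import sys
import types

# "fakepkg" and "fakepkg.sub" are hypothetical names used only here.
pkg = types.ModuleType("fakepkg")
sys.modules["fakepkg"] = pkg

# With only the top-level name registered, importing the submodule fails,
# because the fake parent has no __path__ to search.
try:
    import fakepkg.sub
    failed = False
except ImportError:
    failed = True

# Register the dotted name too, and expose it as an attribute of the parent.
pkg.sub = types.ModuleType("fakepkg.sub")
sys.modules["fakepkg.sub"] = pkg.sub

import fakepkg.sub

assert failed
assert fakepkg.sub is pkg.sub
```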

Ideally, using an isolated approach to TDD with Mock objects, your project
would never evolve into a state where importing modules takes a long time, but
when working with a legacy codebase, the above approach can sometimes help your
tests run faster, which means they’ll be run more often, during the transition.

Next, part two will outline two ways to run your tests regularly. After all, a
test suite is only useful when it is actually used.

Software Testing and the Nature of Reality

It ain’t so much the things we don’t know that get us into trouble.
It’s the things we know that just ain’t so. ~~Artemus
Ward (also attributed to Will Rogers and Josh Billings)

Ideally, software development expands the consciousness of the developer,
and ultimately of the user. Good software enables you to encompass aspects of
reality that were hitherto unavailable. (Bad software forces you into the
perceptual space of a hideous insect).

So testing software comes down to exploring the new facets of reality which
the software exposes. This is a path that every serious developer must follow
diligently. It’s not enough to simply ‘throw it over the wall’ to QA.

This becomes a process of abandoning preconceptions. You must actually use
the software and accept the results, particularly when they are unexpected. So
you are testing yourself as much as the software.

Test Driven Development is good, but not sufficient, because your
assumptions are built into the tests.