Conversations about life & privacy in the digital age

SpiderOak 4.8.4 release

We are sorry for the blog update delay, dear SpiderOakians.

As of a week ago, 4.8.4 fixes a bug that caused upgrades from one version of the 4.8.x series to a newer one to fail on Windows.

See all release notes here.

Git, clients, partners, and woes.

This post comes from the “hindsight is twenty-twenty department.” A
few years ago when we started our
White Label program, we were
wondering how to manage the different client branding, GUI
customization, etc. Our first thought was “I know! We’ll use
git!”

Then we had two problems.

We used to use one branch in git per white label partner. The
intention was that we would then effortlessly merge across updates and
ship updates and have everything happy. Reality, however, quickly set
in. Every partner branch had to be tracked and manually merged with
HEAD. For a distributed company, keeping git branch discipline is hard
enough when there’s one production branch, and much worse when someone
may commit last-minute bug fixes under the gun on a completely
different production branch. Once that happens, you need to
round up the branches and start merging back, up, down, and heaven
help you if you wind up with a mutually exclusive merge
conflict. Things wound up in a state where our white label clients
would lag months and months behind our production SpiderOak client at
best.

Our first attempt consolidated our generic white labels down to two
branches based on which core GUI features were included. This did a
great job of reducing complexity, but it still left many heavily
customized clients in their own branches, and it left us needing to
make sure every branch was still lovingly merged and that fixes
accidentally committed to the wrong branch got brought everywhere
else. Something else was needed.

Our first step was to overhaul the builder, which we completed
recently. This gives us a flexible resource framework to drop in
everything from images to configuration files. The next step is to
boil down all our custom white label code into client code and builder
configuration files, which will again bring us back to one
production branch for everything we ship.

What does this mean for you, fair customer? The primary win is that
now, especially as we maintain multiple brands under SpiderOak alone
these days, we will be able to work on features much more quickly and
deploy them to everyone. Bugs found by partners get fixed not only for
the partners that report them, but for everyone. And finally, we get
one single place to aim our CI and testing tools. This results in a
far better SpiderOak experience.

And it sure results in a huge reduction in the number of grey hairs
we accumulate with every release cycle. Our takeaway here at SpiderOak
is to really examine every new process we introduce and imagine how it
will hold up even just a single year down the road. On the surface,
using git to manage different production releases of SpiderOak seemed
to be a splendid idea. After a couple of years? Worst. Idea. Ever.

HTML5 Mobile Client Open Development Project

I’m happy to announce that SpiderOak will be proceeding with development of its new mobile client as an open development project. We are eager to provide greater visibility into our progress as it proceeds, and more opportunities for interested users to contribute in various ways.

This means that we will be continuing our work on the new, HTML5-based client application in the open, including open sourcing the code base and also conducting our planning and coordination as openly and transparently as we can.

Project Process

SpiderOak will continue to lead development of the client. Members of our team, including the developer who has been working on the code to date (me), will be dedicated to the project. (I have had substantial involvement in open source development, at a few points in my career. It’s the way I prefer to work, so I’m particularly pleased with this turn of events.)

We will use the Fork & Pull repository methodology, which is organized so that many people, both inside and outside an organization, can be involved and contribute.

Last week we ported the internal code repository to a publicly accessible github repository in the SpiderOak github organization, and started staking out the milestones/issues in the repository tracker. We also started to port the documentation to the repository wiki from its former home, the docs subdirectory of the code section.

Besides basic project orientation, the wiki now also hosts the document describing the code architecture and other technical details.

(That document also includes info about running the application from local files – currently necessary for testing it. Specifically, in order to run the application, you have to clone a local copy of the repository, using git, and then use a specially conditioned browser session to visit it. We’re working on providing a proxy by which anyone can try current versions of the development code just by pointing your browser at the right address.)

Plans

By the end of the year, we aim (our first milestone) to implement a single core application, with platform variants, that has functionality equivalent to the existing iOS and Android native mobile clients. It is being implemented in HTML5 / CSS / Javascript, with hybrid (PhoneGap) native extensions to fill in functionality gaps.

That is just the beginning.

HTML5 and, particularly, Javascript are becoming increasingly capable,
along with the mobile platforms themselves. After the initial release
milestone we plan to incorporate full ‘Zero-Knowledge’ operation, with
encryption performed locally on the device – like what happens in the
desktop client. As new SpiderOak secure collaboration features emerge,
we will implement them in this mobile client.

We are also excited about the possibility of using elements of the mobile HTML5 code as a common basis for desktop and browser HTML5 clients. The possible economies of sharing components between the desktop and mobile, plus the higher-level UI framing – HTML5/CSS/Javascript versus pyQt – may make it worth our while to re-engineer the desktop, and realize the benefits of greater development agility in the whole range, mobile to desktop, going forward.

Why Open Source/Free Software?

There are many reasons to conduct development of the HTML5 mobile client based on open source.

  • In general, we want to enable maximum access to SpiderOak services, including enabling others to use our code to inform their own efforts to use our services.

  • More, though, we want you, our users, to have thorough access, and not be in the dark about what is coming. You can help us understand what you need, and what we’re overlooking. You can contribute – help each other answer questions, fill in documentation gaps, identify problems, and devise and fix code.

  • Some of the functionality we plan to implement will rely on innovations, like Javascript-based cryptography. Those innovations will be most useful to us, as well as to others, if they can be taken up and refined and strengthened by widespread use, beyond our projects. An open development process can help promote that kind of effect.

  • Ultimately, SpiderOak’s founders, and the team they have gathered in the company, have accumulated deep experience with and benefits from open source/free software. We see those benefits increasing, for us as well as for others, by applying open methodologies to development of this and other projects.

How Can You Get Involved?

Opening this project allows anyone to evaluate and contribute, not just code, but also designs, plans, and ideas that will be discussed online.

We are just starting this as an open development project, and will have some shaking out to do – as well as an end-of-the-year deadline that is first priority – but we are looking forward to shaping a good collaboration, with your help, and have started the steps to enable it.

Speeding up and running legacy test suites, part two

This is part two in a two part series on Test Driven Development at
SpiderOak. In
part one,
I discussed ways to decrease the time it takes to run a test suite. In part
two, I discuss two ways to run a test suite that are painful if the tests are
slow, but greatly beneficial if performed often with fast tests.

Once we have tests that run in milliseconds rather than minutes, we’ll want
to run them as often as possible. As I work, I’m constantly saving the current
file and running the tests, as is necessary when practicing test-driven
development. Rather than switching to a command prompt after each change in
order to run the tests, I just map a key in vim to do it automatically.

Whenever I start a programming session, I open the module I’m working on and
its corresponding test module in a vertical split in vim. SpiderOak has a few
runtime dependencies, and because we don’t use the system-provided Python
interpreter on Mac, I have to source a script to set up the runtime
environment. When running commands from vim, the environment is inherited, so
by sourcing the script before running vim, things work just as they would if
you invoke them from the command line directly.

$ (. /opt/so2.7/bin/env.sh; PYTHONPATH=some_path vim -O package/module.py package/test/test_module.py)

Once I’m in vim, I map a key to run the tests, modifying the mapping for
whatever module I happen to be working on.

:map ,t :w<CR>:!python -m package.test.test_module<CR>

This binds ,t to first write the file, then run python -m
package.test.test_module. Of course, this will change depending on
what you’re working on and how you invoke your tests.

Running tests on a range of git commits

In my git workflow, I sometimes find myself staging changes piecemeal, or
rebasing, reordering, or squashing commits. These kinds of actions can lead to
commits with code in a state that hasn’t been tested. To make testing these
intermediate states easier, I have adapted a script from Gary
Bernhardt to check out each commit in a given range and run a command
on the result. Here’s my adapted version of the script:

#!/bin/bash
set -e

ORIG_HEAD=$(git branch | grep '^*' | sed "s/^* //" | grep -v '^(no branch)' || true)
REV_SPEC=$1
shift

git rev-list --reverse "$REV_SPEC" | while read rev; do
    echo "Checking out: $(git log --oneline -1 $rev)"
    git checkout -q $rev
    find . -name "*.pyc" -exec rm {} \;
    "$@"
done
if [ $? -eq 0 ]; then
    [ -n "$ORIG_HEAD" ] && git checkout -q "$ORIG_HEAD"
fi

This keeps track of the current HEAD, checks out each revision in the
provided range, and then runs whatever command follows the range on the command
line. If all goes well, it will check out the original HEAD, to leave you back
where you started. If at any point the command exits with an error code, the
process will stop, so you can fix the problem.

For example, to run the command python test/run_all_tests.py on
every commit between origin/master and the current HEAD, you would
run:

$ ./run_command_on_git_revisions.sh origin/master.. python test/run_all_tests.py

Using the tools and techniques from this post and part
one, I am able to run the SpiderOak tests quickly, after every change. This
enables me to use a TDD approach and not be slowed down by sluggish tests. With
the confidence that a comprehensive suite of tests provides, I can make
sweeping changes to parts of the SpiderOak code without worrying if I broke
something. Moreover, if I’m unsure of a solution, I can just try something and
see if it works. Because I’m not slowed down by the tests, trying an unproven
solution is rarely too large of an investment. Plus, there’s something
satisfying about making a large test suite pass in the blink of an eye.

Speeding up and running legacy test suites, part one

This is part one in a two part series on Test Driven Development at SpiderOak.
In part one, I discuss ways to decrease the time it takes to run a test suite.
In part two, I’ll discuss two ways to run a test suite that are painful if the
tests are slow, but greatly beneficial if performed often with fast tests.

As any experienced developer will likely say, the longer a test suite takes to
run, the less often it will be run. A test suite that is seldom run can be
worse than no test suite at all, as production code behavior diverges from that
of the tests, possibly leading to a test suite that lies to you about the
correctness of your code. A top priority, therefore, for any software
development team that believes testing is beneficial, should be to maintain
fast tests.

Over the years, SpiderOak has struggled with this. The reason, and I suspect
many test suites run slowly for similar reasons, is tests which claim to be
testing a “unit”, but actually end up running code from many parts of the
system. In the early days of SpiderOak we worked around some of the problem by
caching, saving/restoring state using test fixtures, etc. But a much better
approach, which we’re in the process of implementing, is to make unit tests
actually test small units rather than entire systems. During the
transition, we still have the existing heavy tests to fall back on, but for
day-to-day development, small unit tests profoundly increase productivity.

There are many techniques for keeping tests small and fast, and even more for
transitioning a legacy test suite. Each code base will ultimately require its
own tricks, but I will outline a few here that we’ve adopted at SpiderOak.

Mocks

Mock objects are “stand-in” objects that replace parts of your code that are
expensive to set up or perform, such as encryption, network or disk access,
etc. Using mocks can greatly improve the running time of your tests. At
SpiderOak, we use Michael Foord’s excellent
Mock library.
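To make the pattern concrete, here is a minimal sketch. The BackupJob class and its collaborators are hypothetical, invented for illustration; note that Mock has since been bundled into the Python standard library as unittest.mock, though at the time it was a separate package.

```python
from unittest import mock  # the Mock library; unittest.mock since Python 3.3

# Hypothetical code under test: a job that encrypts data and uploads it.
# In production, crypto and transport would be expensive real objects.
class BackupJob:
    def run(self, data, crypto, transport):
        blob = crypto.encrypt(data)
        transport.upload(blob)
        return len(blob)

# In the test, cheap stand-ins replace the expensive collaborators:
crypto = mock.Mock()
crypto.encrypt.return_value = b"ciphertext"
transport = mock.Mock()

result = BackupJob().run(b"family photo", crypto, transport)

# The mocks record how they were called, so we can assert on behavior
# without touching real encryption or the network.
crypto.encrypt.assert_called_once_with(b"family photo")
transport.upload.assert_called_once_with(b"ciphertext")
assert result == len(b"ciphertext")
```

The test runs in microseconds because nothing is actually encrypted or sent anywhere, yet it still verifies the interactions we care about.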

One area where mocking has been particularly helpful in speeding up the legacy
tests in SpiderOak is by reducing startup time. In some cases, even if
individual tests run quickly, running the test suite can still take a long time
due to unnecessary startup costs, such as importing modules unrelated to the
code under test. To work around this, I often inject a fake module into
Python’s import system to avoid loading huge amounts of code orthogonal to what
I’m trying to test. As an example, at the top of a test module, you might see
the following:

import sys
from test.util import Bucket

# don't waste time importing the real things, since we're isolating anyway
sys.modules['foo'] = Bucket()
sys.modules['foo.bar'] = sys.modules['foo'].bar

import baz

How it works

When you import a module in Python, the interpreter first looks for it in
sys.modules. This speeds up subsequent imports of a module that has already
been imported. We can also take advantage of this fact to prevent importing of
bloated modules altogether, by sticking a lightweight fake object in there,
which will get imported instead of the real code.

In the example above, foo is a bloated module that takes a long time to load,
and baz is the module under test. baz imports foo, so without this
workaround, the test would take a long time to load as it imports foo. Since
we’re writing isolated unit tests, using Mocks to replace things in foo, we
can skip importing foo for the tests altogether, saving time.
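The mechanism is easy to verify in isolation. In this sketch, heavy_module is a hypothetical name standing in for any slow-to-import module:

```python
import sys
import types

# Register a lightweight stand-in under the name of a (hypothetical)
# slow-to-import module, before anything tries to import it.
fake = types.ModuleType("heavy_module")
fake.expensive_setup = lambda: "stubbed"
sys.modules["heavy_module"] = fake

# The import machinery checks sys.modules first, finds our fake,
# and never loads any real (slow) code at all.
import heavy_module

assert heavy_module is fake
assert heavy_module.expensive_setup() == "stubbed"
```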

Bucket is a simple class that I use whenever I need an object on which I can
access an arbitrary path of attributes. This is perfect for fake package/module
structures, so I often use it for this purpose.

from collections import defaultdict

class Bucket(defaultdict):
    def __init__(self, *args, **kw):
        super(Bucket, self).__init__(Bucket, *args, **kw)
        self.__dict__ = self

This class allows you to access arbitrary attributes and get another Bucket
back. For example:

bucket = Bucket()
some_object = bucket.some.path.to.some_object
assert type(some_object) == Bucket

A caveat: since Python imports packages and modules recursively, you need to insert each
part of the dotted path into sys.modules for this to work. As you can see, I
have done this for foo.bar in the example from above.

sys.modules['foo'] = Bucket()
sys.modules['foo.bar'] = sys.modules['foo'].bar

Ideally, using an isolated approach to TDD with Mock objects, your project
would never evolve into a state where importing modules takes a long time, but
when working with a legacy codebase, the above approach can sometimes help your
tests run faster, which means they’ll be run more often, during the transition.

Next, part two will outline two ways to run your tests regularly. After all, a
test suite is only useful when it is actually used.

Top 5 Reasons You Need SpiderOak Now

  1. That family picture you love will be safe forever. Back up the files that are important to you. Whether it is personal or professional, photos, music, movies, or documents, you’ll be glad you did. Your peace of mind is our priority.

  2. 100% Private. SpiderOak is for the privacy conscious. Only you can see your data – never our employees or the government. That is what sets us apart from other cloud providers. And that is what we mean by our “Zero-Knowledge” privacy standard. Your files are encrypted at the highest level. We do everything in our power so you feel safe with us.

  3. Cross-platform. Access your files anywhere, from any device. Windows, Mac OS X, and Linux (Ubuntu, Debian, Fedora & openSUSE) compatible.

  4. It’s easy. Once you sign up for a SpiderOak account, we will automatically sync with the files you choose. A few clicks, and we go to work for you, making sure we save the data you care about. Our friendly support team is always on standby to answer any questions you may have.

  5. Share files – safely. Even though all your files are encrypted, you can carefully and selectively share something from your account with the family, friends, colleagues, or clients of your choosing. All you have to do is create a ShareRoom and send the unique web URL to whoever you’d like.

It may sound too good to be true, but it isn’t. We have your best interests in mind when it comes to life in the cloud, and privacy is our specialty. Give us a try with 2GB free for life. Get started now, and let us know what you think.

Want to learn more? Read our Engineering Matters page for the more nitty gritty technical details, and what makes us different from the competition.

GIMP – Quick Colour Correction

Some photos just need a little helping hand to ‘pop’ and really shine. Usually it’s to do with brightness and/or colouration. Here’s a quick and simple way to give your colours a boost.

Here’s my starting photo. It’s a bit ‘flat’:

To give this a quick fix, go to the GIMP menu and choose Colors > Curves. You’ll see this window:

What you want to do is make two points on the line and drag the right one up and the left one down:

This will give us an S-curve:

Which boosts the colours somewhat giving us:

And a final tweak would be to go to the menu and click Colors > Levels, and move the white triangle on the right in to the left. This will brighten the whites up giving us:

GIMP : Quick Tilt Shift Effect

We would like to introduce Ronnie Tucker – a prolific editor for Full Circle Magazine – who graciously agreed to contribute to the SpiderOak blog. Ronnie’s true passion is graphic design; as such, today he’ll talk a little about GIMP.

In the first of several GIMP posts, which I’ll be doing for SpiderOak, I’m going to show you some beginner to intermediate techniques. Things that people may think are only possible with Photoshop. If you’d like to read more about the absolute basics of GIMP then I’ll refer you to Full Circle magazine [FCM#12-19]. While I used an older version of GIMP in those issues the layout of GIMP has changed little in the passing years.

So, what is ‘tilt shift’? It’s the process whereby you take a photo and make it look like it’s actually a miniature model. Some people can do it in-camera using lenses with a shallow depth of field, but most apply a digital blur. Here’s how I do it.

Here’s the source image I’m going to use:

To achieve a tilt-shift, we need to apply a heavy blur to the foreground and to the background. Enough to fool your brain into thinking that the photo can’t possibly be of something as large as an actual landscape.

First, click the rectangle select tool. Before doing anything else make sure you tick the box for ‘Feather Edges’ and move its slider up to about 50. If you don’t feather the edges you’ll get a sharp line where the blur ends and the unblurred image meet.

Now, left click on the far left, level with the road and then drag down to the bottom right of the picture:

Don’t worry if you over/undershoot the selection as you can move your pointer to the sides of the selection where you’ll see a bar, click and drag to resize your selection if you need to. The main thing here is to have the top of the selection just below where the road goes over the hill.

Now go to the menu and choose Filters > Blur > Gaussian Blur. In the window that shows, make the horizontal value about 10, and click OK.

If you can see a little chain link icon between the horizontal and vertical then they are linked so changing one should change the other. If not, click the broken chain link icon and it should link the two values.

Now we need to select the background. This may seem a bit of a daunting task, but we don’t need to be too specific here.

Click the ‘Free Select Tool’ and loosely draw around the outline of the treeline, then out around the edge of the image and back to where you started. If the start and end points aren’t exactly the same, press the Enter key and it’ll connect them automatically and complete the selection.

Again, back to Filters > Blur > Gaussian Blur and use a value of about 10. Click OK to apply the blur.

Click Select > None in the menu to see the resulting image.

Now that wasn’t difficult was it?

Play around with the blur values. Choose lower values and the tilt-shift effect won’t work as it doesn’t fool your brain. Higher values will work, but too much and it’ll spoil the effect.

Remember to back up your work! If you would like to read more of Ronnie’s work as well as ask him any question, he’d love to hear from you!

SpiderOak releases lightweight filesystem change notification utilities for Windows, OS X, and Linux (GPLv3)

We’ve decided to open source our “directory watcher” utilities.

These are tiny programs that ship as part of SpiderOak. They are written in C and use native OS specific APIs for obtaining file system change information and reporting it back to the main SpiderOak program in a standardized way.

They might be useful to anyone else who needs file system change notification on multiple platforms.
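The shipped utilities rely on native APIs (inotify on Linux, FSEvents on OS X, ReadDirectoryChangesW on Windows), which are efficient but platform-specific. Purely to illustrate the kind of standardized change reporting involved, here is a rough polling-based sketch in Python; it shows the idea, not how the C utilities actually work:

```python
import os

def snapshot(root):
    """Map each file under root to its modification time."""
    state = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                state[path] = os.stat(path).st_mtime
            except OSError:
                pass  # file vanished between walk and stat
    return state

def diff(old, new):
    """Compare two snapshots and report changes in a uniform format."""
    events = []
    for path in new:
        if path not in old:
            events.append(("created", path))
        elif new[path] != old[path]:
            events.append(("modified", path))
    for path in old:
        if path not in new:
            events.append(("deleted", path))
    return events
```

Polling like this wastes work rescanning unchanged trees, which is exactly why the real utilities use the OS-provided notification APIs instead.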

You can clone the git repos here:

How do YOU use SpiderOak?

At SpiderOak, we see it as our job to provide not only intense security and reliability but also innovation. To help us in our pursuit of building a more perfect product, we were wondering if you wouldn’t mind sharing how you use SpiderOak in your lives: all the smart and quirky ways you get the most out of our service.

If we like your tale of SpiderOak, we might even ask if we can use it on our website or in a promotional piece. If your story is selected, as a way of saying thank you, we will be happy to provide you with several months of free service and/or additional storage space at no cost. (NOTE: Your name and personal information will never be used publicly without your express consent.)

Just relay your story in the comments of this post or in the SpiderOak Forum.