Conversations about life & privacy in the digital age

Spotlight on Sharing

Today, we would like to turn your attention to sharing. Below is a brief introduction on sharing and how it works from our co-founder and CEO, Ethan Oberman.

As Ethan describes, you can carefully and selectively allow portions of your SpiderOak Network to be shared (or become public) to family, friends, colleagues, or clients. You can create any number of password-protected ShareRooms and share data aggregated over several machines (for example, a folder from your Mac and another from your Windows machine). Furthermore, the data within a ShareRoom is automatically updated when changes occur, eliminating the need to ever resend content. A ShareRoom may be accessed as a unique web URL or by entering a user’s ShareID and RoomKey on the SpiderOak homepage – easily allowing people you invite to view your documents, pictures, movies, and so on. In addition, each ShareRoom has a unique RSS feed to alert guests when new content is available.

This last video demonstrates just how easy it is to create a ShareRoom. Happy sharing!

The Product is YOU –> YOU are the Product

The other day I heard a question that so wonderfully placed into perspective an ongoing debate I have been having with friends and colleagues:

Are you creating a profile on Facebook? Or is Facebook creating a profile on you?

Or put a different way:

If you’re not paying for the product then you are the product.

Inherent in these questions is an understanding of how Facebook derives revenue from its hundreds of millions of non-paying users. It is a question that must not be ignored. Why? This is much deeper than being solely about Facebook; this issue touches on how we – as a society – think about our privacy and the social contract we make with social media companies who are constantly collecting data on who we are, what we do, and with whom we interact. Yes – that is a lot of information to know about someone and if a company collects, repackages, and sells this information in the open market then I think the user should at least be made aware of the process in a simple and straightforward manner.

Now – do companies like Facebook have to give users a choice? Do they have to offer two options on how revenue is generated? By way of example – offer ‘A’ states the user has to pay to keep data private while offer ‘B’ provides a free service but allows the company to repackage and sell user data. It is of course not appropriate to force any company to act in this regard; however, if a social media company dared to adopt this approach they could draw greater attention to how data is being used and provide a more meaningful way for people to understand what they are getting into while enjoying the online, trackable, measurable social world in which we live.

On a related note, we at SpiderOak engage in the ‘freemium’ model. We provide a set amount of space for free so that users can enjoy and become comfortable with the product. Once additional space requirements are necessary, the user can purchase additional space. Inherent in this model – from the SpiderOak perspective – is that we never ever monetize free user accounts by way of advertising or any other means. We see this as having a tremendously negative impact on the freemium model – especially when in the business of ensuring privacy.

As always, we are eager to hear from you so please don’t hesitate to send thoughts and comments.

New Browser-Based Signup Process & Maintaining ‘Zero-Knowledge’ Privacy

One of the things that has always made SpiderOak unique is our ‘Zero-Knowledge’ privacy policy. ‘Zero-Knowledge’ means no one at SpiderOak has the ability to access your data – ever. Even if we wanted to access your data or received a subpoena to do so we could never turn over plaintext data. This is accomplished by encrypting all data on your machine before it is sent to us, using encryption keys generated from your password.

With this new version of SpiderOak, we are changing our signup process to include password creation in the browser. But how can we do this and ensure ‘Zero-Knowledge’ privacy? Isn’t creating a password on the web (via a browser) in clear violation of how we maintain our security?

The short answer is that we hash your password before sending it to our servers. A hash is a one-way algorithm such that there is no way for us to reverse the hash and figure out your password. When you try to log in for the first time, we hash your password again in the client and compare it to the hash stored on our servers. If the two match, we know that you entered the correct password. We use a JavaScript implementation of bcrypt to do the hashing. This gives the convenience of a simplified signup process while maintaining your privacy. And if you don’t trust this process, we encourage you to disable JavaScript during signup and you will not be prompted to create a password in the browser.
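The hash-then-compare flow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not our production code: PBKDF2 from the Python standard library stands in for bcrypt so the sketch runs without extra packages, and the `Server` class, the `client_hash` helper, the salt handling, and the iteration count are all hypothetical.

```python
import hashlib
import hmac
import os

# Assumption: PBKDF2-HMAC-SHA256 stands in for bcrypt so that the sketch
# needs nothing outside the standard library.
ITERATIONS = 100_000  # illustrative work factor, not a value from the post


def client_hash(password: str, salt: bytes) -> bytes:
    """Runs on the user's machine: derive a one-way hash from the password."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)


class Server:
    """Hypothetical server that only ever sees salts and hashes."""

    def __init__(self) -> None:
        self._accounts = {}  # username -> (salt, password_hash)

    def signup(self, username: str, salt: bytes, password_hash: bytes) -> None:
        self._accounts[username] = (salt, password_hash)

    def salt_for(self, username: str) -> bytes:
        return self._accounts[username][0]

    def login(self, username: str, password_hash: bytes) -> bool:
        stored = self._accounts[username][1]
        # Constant-time comparison of the stored and submitted hashes.
        return hmac.compare_digest(stored, password_hash)


server = Server()

# Signup: the hash is computed client-side; the plaintext never leaves the machine.
salt = os.urandom(16)
server.signup("alice", salt, client_hash("correct horse", salt))

# Login: hash again with the same salt and let the server compare.
login_salt = server.salt_for("alice")
good_login = server.login("alice", client_hash("correct horse", login_salt))
bad_login = server.login("alice", client_hash("wrong guess", login_salt))
```

In this sketch `good_login` is `True` and `bad_login` is `False`, and the server never holds anything it could reverse into the password.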

Now to focus on our motivations for making this change. We used to have everyone sign up in the SpiderOak application, which was great from a security perspective; however, this process was awkward for customers who are used to signing up for services on a website instead of downloading an application first. It also didn’t work well with tracking behaviors – most notably our Refer-A-Friend program. Previously, when someone followed a Refer-A-Friend link to our website we had no way to know when they signed up in the application. We had a system that was pretty good at guessing after-the-fact but it was slow and often missed signups. It could take up to several weeks to get credit and sometimes the user wouldn’t get credit at all.

We needed a better solution so we conceived a way to move a portion of the signup process to the web. Since password creation was still handled in the application, we needed a way for the user to identify him/herself when the application launched on their computer for the first time (otherwise anyone could steal the account before a password was created). We accomplished this connection through generating activation codes. This system solved the Refer-A-Friend problem but activation codes proved to be a bit clunky. People would lose them or not understand what they were for.

That brings us to today. The goal of any signup process is to make it as easy and seamless for the user as possible. In our case, we also always have to keep in mind our users’ privacy, which adds to the complication. With this new process in place and thanks to bcrypt, we have a much simplified process while maintaining our important ‘Zero-Knowledge’ privacy.

In the end, privacy isn’t just something we seek for additional challenge but rather a philosophical approach we believe in deeply; we have never been willing to abandon it for convenience. That said, we are always looking for ways to provide our high level of security in simpler and more usable ways. I believe that this change accomplishes our goals.

SpiderOak’s new Amazon S3 alternative is half the cost and open source

As 37signals famously described, in the software business we almost always create valuable byproducts. To build a privacy-respecting backup and sync service that was affordable, we also had to build a world class long term archival storage system.

We had to do it. Most companies in the online backup space (including BackBlaze, Carbonite, Mozy, and SpiderOak to name a few) have made substantial investments in creating an internal system to cost effectively store data at massive scale. Those who haven’t, such as Dropbox and JungleDisk, are not price competitive per GB and put their efforts into competing on other factors.

Long term archival data is different from everyday data. It’s created in bulk, generally ignored for weeks or months with only small additions and accesses, and restored in bulk (and then often in a hurried panic!).

This access pattern means that a storage system for backup data ought to be designed differently than a storage system for general data. Designed for this purpose, reliable long term archival storage can be delivered at dramatically lower prices.

Unfortunately, the storage hardware industry does not offer great off-the-shelf solutions for reliable long term archival data storage. For example, if you consider NAS, SAN and RAID offerings across the spectrum of storage vendors, they are not appropriate for one or both of these reasons:

  1. Unreliable: They do not protect against whole machine failure. If you have enough data on enough RAID volumes, over time you will lose a few of them. RAID failures happen every day.
  2. Expensive: Pricy hardware and high power consumption. This is because you are paying for low-latency performance that does not matter in the archival data world.

Of course #1 is solvable by making #2 worse. This is the approach of existing general purpose redundant distributed storage systems. All offer excellent reliability and performance but require overpaying for hardware. Examples include GlusterFS, Linux DRBD, MogileFS, and more recently Riak+Luwak. All of these systems replicate data to multiple whole machines making the combined cluster tolerant of machine failure at the cost of 3x or 4x overhead. Nimbus.IO takes a different approach using parity striping instead of replication, for only 1.25x overhead.
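To make the overhead comparison concrete, here is a minimal Python sketch of parity striping. It is not the Nimbus.IO implementation, and the stripe geometry is an assumption chosen for illustration: four data segments plus one XOR parity segment, which happens to give 1.25x overhead and survives the loss of any single segment. All function names are hypothetical.

```python
from functools import reduce
from typing import List


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_stripe(segments: List[bytes]) -> List[bytes]:
    """Return the data segments followed by one XOR parity segment."""
    parity = reduce(xor_bytes, segments)
    return segments + [parity]


def recover(stripe: List[bytes], missing: int) -> bytes:
    """Rebuild the segment at index `missing` by XOR-ing all the survivors."""
    survivors = [seg for i, seg in enumerate(stripe) if i != missing]
    return reduce(xor_bytes, survivors)


def overhead(data_segments: int, parity_segments: int) -> float:
    """Raw bytes stored per byte of user data."""
    return (data_segments + parity_segments) / data_segments


data = [b"AAAA", b"BBBB", b"CCCC", b"DDDD"]  # 4 data segments (assumed geometry)
stripe = make_stripe(data)                    # 4 data + 1 parity, one per machine
rebuilt = recover(stripe, missing=2)          # pretend the machine holding segment 2 died

print(overhead(4, 1))      # 1.25
print(overhead(1, 2))      # 3.0, i.e. three full copies of everything
print(rebuilt == data[2])  # True
```

A single XOR parity segment only tolerates one failure per stripe; tolerating more requires additional parity segments and a more capable erasure code, with the same overhead formula.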

Customers purchasing long term storage don’t typically notice or care about the difference between a transfer starting in 0.006 seconds or 0.6 seconds. That’s two orders of magnitude of latency. Customers care greatly about throughput (megabytes per second of transfer speed), but latency (how long until the first byte begins moving) is not relevant the way it is if you’re serving images on a website.
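A quick back-of-the-envelope calculation shows why. The two latency figures come from the paragraph above; the 1,000 MB restore size and the 10 MB/s throughput are assumed numbers picked only for illustration.

```python
def transfer_seconds(size_mb: float, throughput_mb_s: float, latency_s: float) -> float:
    """Total time = wait for the first byte + time to move the data."""
    return latency_s + size_mb / throughput_mb_s


# Assumed workload: a 1,000 MB restore at 10 MB/s.
fast = transfer_seconds(1000, 10, latency_s=0.006)
slow = transfer_seconds(1000, 10, latency_s=0.6)

extra_seconds = slow - fast            # the whole cost of the slower start
extra_fraction = extra_seconds / fast  # relative slowdown of the restore

print(f"extra wait: {extra_seconds:.3f} s ({extra_fraction:.2%} of the transfer)")
```

Under these assumptions the slower start adds roughly 0.6 seconds to a transfer of about 100 seconds, which is well under one percent.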

Meanwhile the added cost to support those two orders of magnitude of latency performance is huge. It impacts all three of the major cost components – bandwidth, hardware, and power consumption.

A service designed specifically for bulk, long-term, high-throughput storage is easily less than half the cost to provide.

Since launching SpiderOak in 2007, we’ve rewritten the storage backend software four times and gone through five different major hardware revisions for the nodes in our storage clusters. Nimbus.IO is a new software architecture leveraging everything we’ve learned so far.

The Nimbus.IO online service is noteworthy in that the backend hardware and software is also open source, making it possible for people to either purchase storage from Nimbus.IO similar to S3, or run storage clusters locally on site.

If you are currently using or planning to adopt cloud storage, we hope you will give Nimbus.IO some consideration. Chances are we can eliminate 2/3 of your monthly bill.

An Ode to GoPro

What is a GoPro? It is a camera you can wear. It is a camera you can mount. A GoPro can be used for videos as well as pictures. It is a tiny camera with its own underwater housing that can fit into almost anything and capture any type of footage. This camera has even been able to record footage inside an animal’s mouth such as a shark. The camera’s main purpose is to capture those hard-to-get action shots and at angles a normal camera cannot be placed. For example, many surfers mount this camera on their surfboards. As they surf, the camera captures the route of the surfer as they ride inside the barrel of a wave.

I have yet to own this camera myself. For a camera that can do so much it is rather cheap, but alas, I am a kid fresh out of college with very little money. Fortunately I found out that GoPro’s Facebook page allows fans to try and win a free GoPro in addition to an extra accessory. Such accessories include a harness, wrist housing, or a helmet mount. Every day, anyone over the age of 13 is eligible for a chance to win one of these cameras. Being a pretty big fan, I enter into their sweepstakes every morning.

If I did have a GoPro in my possession I would use it to film underwater. I’ve always wanted to film some underwater footage regardless of what it is. I am moving to Austin, Texas, and I hear that they have a zipline somewhere on the outskirts of the city. I would also use a GoPro to film what I see as I fly above the trees. Maybe one day I will get lucky enough to win one of these cameras or save enough money to actually purchase one. We’ll see.

2-Factor Authentication to your SpiderOak Account

We are now offering limited support for 2-Factor Authentication into your SpiderOak account.

2-Factor Authentication provides an additional layer of security on top of password protection. In other words, if someone were to compromise your username and password, these two elements alone would not be enough to allow them to access your SpiderOak account.

As a first step, we are offering this new feature only to paid users who have phone numbers located inside either the US or Canada. Given that a high percentage of SpiderOak customers (and several SpiderOak team members) live outside North America, we will soon eliminate this restriction.

To enable 2-Factor Authentication for your account, you may either log in to or navigate to the SpiderOak application –> Account –> Credit Card / Billing Information section. You will then notice a new option labeled ‘2-Factor Authentication’.

Once enabled, any time you log in to your SpiderOak account via the web or a mobile device, you will need to provide your current username, password, AND a ‘token’. The ‘token’ will be sent to your mobile device and should be entered directly after your password with no spaces or marks between them. For example, if your password is ‘red’ and the token reads ‘1234’ then you would simply enter ‘red1234’.
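For the curious, separating the two parts again is straightforward as long as the token length is fixed. The sketch below is an assumption about how that could look, not a description of our login code; `split_login` is a hypothetical helper, and the four-digit length simply matches the ‘red1234’ example.

```python
from typing import Tuple


def split_login(combined: str, token_length: int) -> Tuple[str, str]:
    """Split 'password' + 'token' typed as one string into its two parts."""
    if token_length <= 0 or len(combined) <= token_length:
        raise ValueError("input is too short to contain a password and a token")
    password, token = combined[:-token_length], combined[-token_length:]
    if not token.isdigit():
        raise ValueError("token portion must be numeric")
    return password, token


password, token = split_login("red1234", token_length=4)
print(password, token)  # red 1234
```

Because the token is taken from the end of the input, a password that itself ends in digits is still handled correctly.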

Each 2-Factor Authentication token you receive is good for 12 hours and can be created here: Token Request. The text message you receive will look similar to the below:

SpiderOak Secure Login Token: 01234567
This code is good for 12 hours. If this login
code was unexpected, email

You can only request one token every twelve (12) hours. If you try to request tokens more often than that, subsequent attempts will silently fail. If 2-Factor Authentication is enabled for your account, any login attempt that does not include a current token will also fail (similar to entering an invalid password or a non-existent username).
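The two twelve-hour rules above (one request per window, and a token that is only current for that window) can be sketched as follows. This is a simplified in-memory illustration, not our actual service: the `TokenDesk` class and its method names are hypothetical, time is passed in explicitly so the behaviour is easy to follow, and the SMS delivery step is left out.

```python
import hmac
import secrets
from typing import Dict, Optional, Tuple

TOKEN_WINDOW_SECONDS = 12 * 60 * 60  # twelve hours


class TokenDesk:
    """Hypothetical in-memory token issuer."""

    def __init__(self) -> None:
        self._issued: Dict[str, Tuple[str, float]] = {}  # user -> (token, issued_at)

    def request(self, username: str, now: float) -> Optional[str]:
        """Issue a token, or return None (a silent failure) inside the window."""
        previous = self._issued.get(username)
        if previous is not None and now - previous[1] < TOKEN_WINDOW_SECONDS:
            return None
        token = f"{secrets.randbelow(10**8):08d}"  # 8 digits, like the sample SMS
        self._issued[username] = (token, now)
        return token

    def is_current(self, username: str, token: str, now: float) -> bool:
        """True only for the latest token while it is under twelve hours old."""
        previous = self._issued.get(username)
        if previous is None:
            return False
        issued_token, issued_at = previous
        fresh = now - issued_at < TOKEN_WINDOW_SECONDS
        return fresh and hmac.compare_digest(issued_token.encode(), token.encode())


desk = TokenDesk()
first = desk.request("alice", now=0)                      # a new 8-digit token
second = desk.request("alice", now=3600)                  # None: too soon
third = desk.request("alice", now=TOKEN_WINDOW_SECONDS)   # window elapsed, new token
```

Returning `None` rather than raising an error mirrors the silent failure described above: the caller learns nothing about why no message was sent.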

Please Note: This is an optional feature that has to be manually enabled by the user. If 2-Factor Authentication is not enabled, the login procedures will remain unchanged – continuing with a password-only based login.

For the first days of this trial-program, 2-Factor Authentication will only protect web based logins. Over the course of the next several days, we will be extending this feature globally and anywhere you have to authenticate to SpiderOak (e.g. activating new devices and/or reinstalling existing devices).

Finally and as a reminder – even with two factor authentication, the usual recommendation still applies, and accessing your data via the desktop client is more secure than the web and/or through mobile devices.

For those curious about how 2-Factor Authentication is implemented, we are working with the excellent Twilio telephony API to deliver the SMS messages. It costs SpiderOak $0.01 per SMS token which we believe to be more than reasonable and money well spent.

Depending on the interest and adoption, we may extend this to Android OATH tokens, Yubikeys, or other various secondary security factors. Please feel free to give feedback on what additional methods you’d like to see and/or the arrangement in general. We are obviously in the early phases now but excited to be adding this additional security layer for those security conscious folks among us.

Software Testing and the Nature of Reality

It ain’t so much the things we don’t know that get us into trouble.
It’s the things we know that just ain’t so. ~~Artemus Ward
(also attributed to Will Rogers and Josh Billings)

Ideally, software development expands the consciousness of the developer,
and ultimately of the user. Good software enables you to encompass aspects of
reality that were hitherto unavailable. (Bad software forces you into the
perceptual space of a hideous insect).

So testing software comes down to exploring the new facets of reality which
the software exposes. This is a path that every serious developer must follow
diligently. It’s not enough to simply ‘throw it over the wall’ to QA.

This becomes a process of abandoning preconceptions. You must actually use
the software and accept the results, particularly when they are unexpected. So
you are testing yourself as much as the software.

Test Driven Development is good, but not sufficient, because your
assumptions are built into the tests.
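
A small, made-up Python example of an assumption shared by the code and its tests (the function and its test are hypothetical, written only to illustrate the point):

```python
def days_in_february(year: int) -> int:
    # The developer "knows" that every fourth year is a leap year.
    return 29 if year % 4 == 0 else 28


def test_days_in_february() -> None:
    # The tests were written from the same belief, so they all pass.
    assert days_in_february(2012) == 29
    assert days_in_february(2011) == 28
    assert days_in_february(2000) == 29


test_days_in_february()

# Reality disagrees: century years are leap years only when divisible by 400,
# so February 1900 had 28 days. The passing tests never asked.
surprise = days_in_february(1900)
print(surprise)  # 29
```

Every test is green, and the software is still wrong about 1900. Only stepping outside the assumptions that produced both the code and the tests reveals it.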