Conversations about life & privacy in the digital age

SpiderOak: Blue for Enterprise

Imagine yourself the CIO of a major company, walking down the street and thinking (as CIOs tend to), “gosh, I love SpiderOak, but it’s just too awkward to use across my company!”

At this point, I teleport in. “But wait!” I exclaim. “We’re now working on a solution just for you and your business! SpiderOak Blue!”

More seriously, we understand centralized management and provisioning will make or break a product geared for the business market; after all, who has time (or the money to pay for the time) to go and individually administer each unique SpiderOak user account? What if Bob and his laptop both wind up under the bus? How will you get Bob’s work data back?

Here at SpiderOak Business Labs, we’ve looked at the problem from the perspective of data ownership. Our consumer-oriented product places ownership in the hands of the end user. The user is the only one with the keys to unlock and look at plaintext data. While this is the perfect scenario in the consumer world, it breaks down in a business setting where ownership of the data belongs to the company. Then again, perhaps you are a university wanting to purchase accounts for your student body in bulk. In that case you want the ability to maintain the accounts while having no visibility into the plaintext data itself. We have worked hard to cover both of these cases and more.

So how does all this work? How does SpiderOak allow companies to retain ownership of the data while SpiderOak itself never has plaintext visibility? To answer that question, we turn to our ‘Zero-Knowledge’ privacy policy and encryption methods – all of which make this an interesting system to support. We’ve developed two distinct methods – both of which keep SpiderOak ‘Zero-Knowledge’ while letting the organization retain full knowledge.

Managing both individual user accounts and companywide deployments adds yet another layer of complexity and pain. So we have created a system where administrators gain full control of SpiderOak from one central location.

With no further ado, dear and humble reader, may I introduce to you…

SpiderOak: Blue

What does this get you? Let’s take a look at the feature list.

Base Features:

  • The same capabilities and meaningful privacy guarantees as our consumer product (‘Zero-Knowledge’ privacy standard)
  • Central, easy-to-use web-based management console allowing user provisioning, group permissions, space management, and user reporting
  • Selective enable/disable of web and mobile access to SpiderOak accounts.
  • Bulk creation and management of user accounts (along with editing and downloading via CSV)
  • Detailed reports on user activity and problems across your deployment
  • You only buy space, and divide it among your users as you see fit. No silly per-user or per-device fees, and no charging you extra to back up a server. Plain and simple pricing.
  • Ability to follow policy-set permissions in the Windows Registry (on Windows), or as a text file in /etc (Linux) or /Library (Mac)
  • Easy-to-deploy MSI installers for 32-bit and 64-bit Windows

SpiderOak: Blue OpenLicense

This product is based on our current OpenLicense program. In fact, a large part of Blue came from addressing limitations in the current OL program.

  • Data Ownership Model: The end user, not the organization. A user who forgets their password needs a new account.

SpiderOak: Blue

This is our ‘standard’ tier of Blue service.

  • Data Ownership Model: The organization.
  • Password resets possible via browser-driven ‘Zero-Knowledge’ encryption in the management interface.
  • Non-’Zero-Knowledge’ user data auditing interface

SpiderOak: Blue Plus

This is the top-shelf enterprise-grade SpiderOak, for those with ultimate management needs. Everything that follows here is made possible by our Blue Virtual Appliance, which puts all management control into an open-source virtual machine running on your infrastructure. You get full control over the data flowing into and out of SpiderOak from your organization, while we stay completely ‘Zero-Knowledge’.

  • You host your organization’s private keys. Key escrow lets you have full and complete control over the data by enabling you to hold onto the master private keys (which are normally generated via a key derivation scheme based on the user’s password).
  • User account integration with Microsoft Active Directory, OpenLDAP, and RedHat Directory Server. Define LDAP groups, point the appliance at them, and those users automatically show up on SpiderOak.
  • Integrated password management via LDAP or RADIUS. Due to use of key escrow technology, passwords for Blue Plus are only for authenticating users. Via the magic of the virtual appliance we can authenticate against your organization’s existing authentication infrastructure. SecurID? No problem!
  • ‘Zero-Knowledge’ (to us) web and mobile access. This VM can also host a local copy of the web access portal – providing on-the-go access to your users while we remain ‘Zero-Knowledge’.
  • Through the magic of the above web access, the user auditing / administrative data restoration console is also Zero-Knowledge from our perspective.
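
To make the key escrow bullet above a little more concrete, here is a minimal Python sketch of the two models: a key derived from the user’s password versus a random key that the organization holds itself. The choice of PBKDF2-HMAC-SHA256, the salt size, the iteration count, and both function names are assumptions made purely for illustration; they are not a description of SpiderOak’s actual scheme.

```python
import hashlib
import secrets


def derive_key_from_password(password: str, salt: bytes,
                             iterations: int = 200_000) -> bytes:
    """Password-derived model (assumed): whoever knows the password can
    regenerate the key, and nobody else can."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"),
                               salt, iterations, dklen=32)


def generate_escrowed_key() -> bytes:
    """Escrow model (assumed): a random key, independent of any password,
    that the organization stores on its own infrastructure."""
    return secrets.token_bytes(32)


salt = secrets.token_bytes(16)
user_key = derive_key_from_password("correct horse battery staple", salt)
org_key = generate_escrowed_key()
```

In the second model the password no longer feeds the key at all, which is consistent with the bullet’s point that Blue Plus passwords are only used for authenticating users.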

Interested?

SpiderOak Blue is now available through a limited release. We have been working with several large enterprises through the beta period and will continue towards general release. If you’re curious about the product, please send an email to blueinfo@spideroak.com and we will get back to you soon.

Introducing SpiderOak Open License – Just in time for School

Over the last several months we have received requests for a version of SpiderOak that could complement the institutional setting – schools, colleges, universities, research centers, and the like. As the fall is now upon us, we thought it would be a good time to introduce our latest product that meets the specific demands of such institutions – SpiderOak Open License.

The Open License program allows an institution to purchase SpiderOak accounts in bulk and – of course – at a discount. A central admin can create one-time codes to be distributed among the associated users (e.g. students, faculty, school personnel, team members, etc.), and each member will gain access to a 100 GB incremental account. A central admin may also close an account if the user graduates or moves on for any other reason, at which point the license becomes available for use again. It is important to note that our ‘Zero-Knowledge’ privacy policy remains fully intact with each account; therefore, access to the actual data being backed up, synced, shared, stored, or accessed belongs only to the individual user and not the institution.
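
The license lifecycle described above (issue a one-time code, redeem it, and return the license to the pool when an account is closed) can be pictured with a small Python sketch. Everything in it, including the `LicensePool` class, its method names, and the code format, is hypothetical and exists only to illustrate the bookkeeping; it is not the Open License API.

```python
import secrets


class LicensePool:
    """Hypothetical model of a fixed pool of licenses handed out via one-time codes."""

    def __init__(self, total_licenses: int):
        self.available = total_licenses
        self.pending_codes = set()   # issued but not yet redeemed
        self.accounts = {}           # username -> code that was redeemed

    def issue_code(self) -> str:
        """Reserve one license and return a fresh one-time code for it."""
        if self.available == 0:
            raise RuntimeError("no licenses left in the pool")
        self.available -= 1
        code = secrets.token_urlsafe(12)
        self.pending_codes.add(code)
        return code

    def redeem(self, code: str, username: str) -> None:
        """Turn a code into an account; a reused or unknown code raises KeyError."""
        self.pending_codes.remove(code)
        self.accounts[username] = code

    def close_account(self, username: str) -> None:
        """Close an account and make its license available again."""
        del self.accounts[username]
        self.available += 1


pool = LicensePool(total_licenses=2)
code = pool.issue_code()
pool.redeem(code, "alice")
pool.close_account("alice")   # the license goes back into the pool
```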

Further, for larger organizations that are managing hundreds or thousands of accounts, we have created APIs into the SpiderOak Open License program to make it easier to manage the overall process – tying more directly in with your current systems and protocols.

For more information, please feel free to visit the SpiderOak Open License website and don’t hesitate to contact us anytime at SpiderOak Open License Support.

Android and… THE FUTURE!

In the mobile world here at SpiderOak HQ, there are two things that have some code being laid down already, and they’re interrelated.

The first is that an Android app is now being actively worked on. Yes, I know we’re late to the party. If it’s “fashionably late”, or “better late than never”, or “about darn time”, I don’t know, but it’s being worked on. When it’s complete, it’s going to be open-source from the get-go so that those with open handsets can play with the app as they please. This is cool, because it’s going to also act as a demo implementation of our Next Big Thing. This app will provide at least the same amount of functionality as the 1.1 version of the iPhone app.

The Next Big Thing

There’s a lot of things you can do with a bunch of cloud storage. They’re all very, very cool. We have very cool web-based APIs, but I know there are difficulties in using our storage for cool uses (like Documents to Go) straightaway because of our ‘Zero-Knowledge’ encryption system.

The solution that I’m working on is a connection library in Objective-C and Java that will use our public APIs to provide high-level operations for working with files in SpiderOak storage, both personal storage and ShareRooms. This library will also be open-sourced to make it easier to drop into your own projects, or to see how we use our own APIs so that you can adapt them to your own ends.

Single File Sharing

A few months ago we posted a blog entry listing some new features that we planned on implementing. Among them was the ability to share just a single file – a feature we had not previously included with the sharing functionality. A good many responses came in suggesting that the ability to share one file at a time would be a welcome feature and certainly be easier than having to create a folder with a single file inside (as our ShareRooms only operate at the folder level).

I am very happy to announce that we have launched our single file sharing capability. Well – to be fair – we pushed this feature a release or two ago but I wanted to make sure I mentioned it in the blog now that it is fully functional and all of the kinks are ironed out. As it so happens, the planning for single file sharing emerged from our recent iPhone development as it integrated the ability to send a link to any file stored in your SpiderOak Network from your phone.

To access this feature in the application, you will find a new menu button on the View tab labeled ‘Link’. When you highlight a file on the View tab, the ‘Link’ button will become active and pressing it will generate a URL to that particular file. You can then copy & paste that URL into an email or other message and send it to friends, family, colleagues, or clients. And similar to our ShareRooms, only that one file will be exposed as the rest of your data remains secure in our ‘Zero-Knowledge’ privacy environment. Further details are available here: release notes for 3.6.9643.

As a last mention, we are now working on OS integration so that SpiderOak will be embedded inside the Finder window on Mac OS X and the Explorer window on Windows. You will be able to see which files are in the backup set (and, eventually, a status indication), and through the contextual menu you will be able to both select additional folders/files for backup and use the single file sharing feature without ever needing to interact with the SpiderOak application.

Please don’t hesitate to send further thoughts and/or ideas on features as we do greatly enjoy hearing your feedback and it is crucial in our efforts to best serve you.

Building a Server

Most companies pay a lot of money to have third parties build
and maintain their storage infrastructure, often at an enormous markup
beyond the cost of the hardware. At SpiderOak, they do things a little
bit… differently. They have me.

A big part of my job is following the industry and researching and
testing new ways to build storage servers, and if I do say so myself,
I’m no slouch at building systems. I’ve designed and administered a
small computational cluster and built my fair share of desktops and
servers. So if everything worked perfectly, building SpiderOak machines
would be a doddle.

But they don’t, and it’s not. A few recent examples:

Since our servers are locked away in a data center, we have an array
of remote monitoring and access controls for our machines, core among
which is the BMC, or Baseboard Management Controller. Its job is to
allow all the things you could do with a computer physically, like turn
it off and on again or look at the console display, over a network. With
one of these in your computer, you can install, configure, run, and
destroy your computer from anywhere in the world. It’s a fantastically
convenient piece of kit when it works properly.

The particular BMC we have is made by a company who shall remain
anonymous, but we’ll call them MuperSicro. Now, MuperSicro’s BMC is
designed to share an Ethernet port so that it doesn’t need a dedicated
port. It does still have its own MAC address, though. Or it
should. This particular unit came to me with a MAC address of
00:00:00:00:00:00. Their solution? “Take the MAC address of LAN2 and add
1.” That works, but I would like it if parts came properly configured. I
ordered a BMC, not a Heathkit for one.
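
For the curious, the "take the MAC address of LAN2 and add 1" workaround
is just integer arithmetic on the 48-bit address. Here is a minimal
Python sketch; the `increment_mac` helper and the sample address are
made up for illustration and have nothing to do with the vendor's own
tooling.

```python
def increment_mac(mac: str, step: int = 1) -> str:
    """Return the MAC address `step` above `mac`, wrapping at 48 bits."""
    value = int(mac.replace(":", ""), 16)
    value = (value + step) % (1 << 48)
    raw = f"{value:012x}"
    # Re-insert the colons between each pair of hex digits
    return ":".join(raw[i:i + 2] for i in range(0, 12, 2))


lan2 = "02:00:00:ab:cd:ff"   # made-up example address, not a real NIC
print(increment_mac(lan2))   # 02:00:00:ab:ce:00
```

Note that the carry propagates across octets, which is why doing this
as a single integer is safer than bumping only the last byte by hand.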

More fun comes from our LSI 8888ELP controller. This is a fantastic
SAS RAID controller — internal and external ports, 512MB of cache,
and excellent OS compatibility. The configuration, though, is a bit
daffy. For their BIOS configuration, you have a choice. You can use
WebBIOS, which poorly imitates a webpage, uses a mouse, and is just
about the worst choice of interface for RAID configuration imaginable.
Alternatively, you can use Preboot CLI, which is MegaCLI in standalone
firmware form. The deficiencies of MegaCLI have been adequately
discussed by others
(http://www.kaltenbrunner.cc/blog/index.php?/archives/4-LSIlogic-MegaRAID-SAS-and-the-self-explaining-CLI.html),
and I can say as a man who uses both mplayer
and ffmpeg frequently, it is bar none the most hideous and inconsistent,
poorly-documented piece of crap command-line tool I have ever used.

Without proper documentation, I willy-nilly decided to enable
DirectPdMapping, figuring that it would allow me to get direct access to
the drives. That particular option is, I might add, not documented in
the MegaRaid SAS User’s Guide available on LSI’s website. It said I
needed to reboot, so I did, and I was greeted with this:

Attached Enclosure doesn't support in controller's Direct mapping mode
Please contact your system support.
System has halted due to unsupported configuration.

The controller decides that it can’t use the enclosure in the way I
asked, so it halts the system. It doesn’t offer to turn direct mapping
off (which would be nice) or offer to load the configuration tool (which
would be expected), it just halts. The solution is to open the case and
unplug the SAS cable to the enclosure, which then allows the machine to
boot so that you can change it, then plug it back in and continue on
your way. If this machine was in a rack in the datacenter when this
happened, I’d have to bother the techs to go open it up and fiddle with
it. It allowed me to get into a situation that was unrecoverable without
physical intervention. That is completely unacceptable. Oh, and neither
WebBIOS nor Preboot CLI can help you turn that feature off, either. I had
to boot into Linux to switch it back.

Invariably you will find foibles like these when building a new
system, which is why we spend time poking and prodding newly built
systems before putting them in our data centers and entrusting them with
your data.

What does i_ m__n __ __v_r _____ ___ ____ ____ ___c_?

We have been getting a lot of questions lately about our block level
de-duplication, how it works, and how it is applied through the SpiderOak
process. As I consider myself to be a layman, please allow me to explain this in
simpler terms – such that even I will be able to understand.

For the sake of this example, let us say you have created a document
entitled ‘Why peanut butter and jelly sandwiches are better when you place
salt & vinegar chips in the middle’. The size of this document is 10k.
After saving the initial version, you go back and make 9 additional edits.
Each time you make an edit, you save the document as a new version thus giving
you 10 complete versions. And with each version being exactly 10k, the
complete document takes up a total of 100k on disk (or 10 versions multiplied
by 10k).

SpiderOak, on the other hand, works much more efficiently when storing data
– creating many wonderful benefits for the user. As you can imagine, from the
first version of ‘Why peanut butter and jelly sandwiches are better when you
place salt & vinegar chips in the middle’ to the last, only small pieces
of the document have changed. One simple example is replacing the word
‘excitable’ with the word ‘volatile’ in the third paragraph. Instead of
storing (and uploading) a whole new version of the document each time a small
change is made, SpiderOak breaks each document into blocks of data and then
only backs up (or uploads) the change or delta between the new version and the
old. Using this process, the same 10 versions of the aforementioned document
on SpiderOak amount to only 15k on disk (as opposed to 100k above).

Although the below visual example only uses two versions of a document, it
does further explain how the SpiderOak de-duplication process occurs.

This process saves our users a considerable amount of space as a user is
only billed for the de-duplicated amount. Furthermore, the upload can occur
with much greater speed because only the changed blocks of data are sent from
one version to the next. In the end, SpiderOak works extraordinarily hard to
never upload and/or store the same block of data twice – saving our users
money and time.
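
For readers who prefer code to sandwiches, the idea of never storing the
same block twice can be sketched in a few lines of Python. This is a toy
model under stated assumptions, not SpiderOak's actual algorithm: the
`BlockStore` class, the fixed 1024-byte block size, and the use of
SHA-256 digests as block identifiers are all choices made for this
example. Fixed-size blocks only de-duplicate well when an edit does not
shift the bytes after it, so the edits below overwrite bytes in place.

```python
import hashlib

BLOCK_SIZE = 1024  # assumed block size, chosen only for the example


class BlockStore:
    """Toy content-addressed store: each unique block is kept exactly once."""

    def __init__(self):
        self.blocks = {}  # SHA-256 hex digest -> block bytes

    def put_version(self, data: bytes) -> list:
        """Store one version; return the list of digests that rebuilds it."""
        recipe = []
        for start in range(0, len(data), BLOCK_SIZE):
            block = data[start:start + BLOCK_SIZE]
            digest = hashlib.sha256(block).hexdigest()
            self.blocks.setdefault(digest, block)  # no-op if already stored
            recipe.append(digest)
        return recipe

    def get_version(self, recipe: list) -> bytes:
        """Reassemble a version from its list of block digests."""
        return b"".join(self.blocks[d] for d in recipe)

    def stored_bytes(self) -> int:
        return sum(len(b) for b in self.blocks.values())


store = BlockStore()
# A 10k "document" made of ten blocks with distinct contents
document = bytearray(b"".join(bytes([65 + i]) * BLOCK_SIZE for i in range(10)))
recipes = [store.put_version(bytes(document))]
for edit in range(1, 10):
    # A small in-place edit that touches only the fourth block
    document[3 * BLOCK_SIZE + edit] = ord("0") + edit
    recipes.append(store.put_version(bytes(document)))

naive_bytes = 10 * len(document)
print(naive_bytes, store.stored_bytes())  # 102400 19456
```

Each of the nine edits adds exactly one new block, so ten versions cost
19 blocks instead of 100. The exact savings depend on the block size and
on where the edits land, which is why this toy does not reproduce the
15k figure quoted above.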

Question: So perhaps now you may better understand the title and how it
relates to de-duplication?

Answer: What does it mean to never store the same data twice?