Still have questions? We have answers! Check out our Frequently Asked Questions for everything you need on SpiderOak.
Deduplication is a process that records and stores only the differences between an initial version of a file and each subsequent version. This makes it possible to keep several versions of a single file without storing each version as a completely new file every time a change is made.
If you save multiple copies of the same file, only the original copy takes up the full amount of space; every other copy is much smaller because SpiderOak saves only the data that differs from your original file. For example, if you add more text or a graphic to a document, SpiderOak saves only the new data instead of resaving the entire file. Likewise, if you back up a file on one computer that has already been backed up on another computer, the file occupies no additional space in your account. SpiderOak uses deduplication to save our users space and, therefore, money.
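To make the idea concrete, here is a minimal sketch of block-level deduplication in Python. It is an illustration only and not SpiderOak's actual implementation: the `DedupStore` class, the tiny block size, and the file names are all assumptions chosen to keep the example easy to follow.

```python
import hashlib

BLOCK_SIZE = 4  # hypothetical, tiny on purpose so the example is easy to follow


class DedupStore:
    """Toy block store (hypothetical): each unique block is kept once,
    keyed by its SHA-256 digest."""

    def __init__(self):
        self.blocks = {}  # digest -> block bytes
        self.files = {}   # file name -> ordered list of digests

    def add(self, name, data):
        """Store `data` under `name` and return how many new bytes were stored."""
        new_bytes = 0
        digests = []
        for i in range(0, len(data), BLOCK_SIZE):
            block = data[i:i + BLOCK_SIZE]
            digest = hashlib.sha256(block).hexdigest()
            if digest not in self.blocks:
                # Only blocks that have never been seen take up space.
                self.blocks[digest] = block
                new_bytes += len(block)
            digests.append(digest)
        self.files[name] = digests
        return new_bytes

    def read(self, name):
        """Rebuild a file from its list of block digests."""
        return b"".join(self.blocks[d] for d in self.files[name])


store = DedupStore()
first = store.add("laptop/report.txt", b"AAAABBBBCCCC")          # original copy
second = store.add("desktop/report.txt", b"AAAABBBBCCCC")        # same file, other computer
third = store.add("laptop/report-v2.txt", b"AAAABBBBCCCCDDDD")   # text appended

print(first, second, third)
```

In this sketch the original file costs its full size, the identical copy from a second computer costs nothing, and the edited version costs only the appended block, while every file can still be read back in full.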
When you upload a copy of a file that is already saved on our servers, SpiderOak performs deduplication before the upload begins, comparing the file to the information you have already saved. It then uploads only the information that differs between the two files, such as their locations, in the form of “journal entries”. Although it may appear that SpiderOak is re-uploading the entire file, the upload goes much faster and takes up very little space because only these journal entries are actually uploaded.
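The check-before-upload step can be sketched roughly as follows. This is a simplified illustration, not SpiderOak's real client or journal format: the `plan_upload` function, the dictionary layout of a journal entry, and the example paths are assumptions made for this example.

```python
import hashlib


def plan_upload(path, data, known_digests):
    """Decide what to send for one file (hypothetical helper).

    `known_digests` stands in for the set of content hashes already
    stored in the user's own account.
    """
    digest = hashlib.sha256(data).hexdigest()
    # The journal entry records where the file lives and which content it has.
    entry = {"path": path, "digest": digest}
    if digest in known_digests:
        # Content is already stored: send the journal entry only.
        return {"journal": entry, "payload": b""}
    # New content: send the journal entry together with the data itself.
    known_digests.add(digest)
    return {"journal": entry, "payload": data}


known = set()
original = plan_upload("Documents/photo.jpg", b"image-bytes", known)
copy = plan_upload("Backup/photo-copy.jpg", b"image-bytes", known)

print(len(original["payload"]), len(copy["payload"]))
```

Here the second upload carries an empty payload: only the small journal entry describing the new location is sent, which is why such uploads finish quickly and add almost nothing to the space used.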
SpiderOak performs deduplication only on files stored in your own account, not across users. We explain why in more detail here: https://blog.spideroak.com/20100827150530-why-spideroak-doesnt-de-duplicate-data-across-users-and-why-it-should-worry-you-if-we-did