I ran into this article, How I Reverse-Engineered Google Docs. It’s an interesting piece, and the rationale behind it has great teaching value in and of itself, but I wanted to take some time to reflect on its findings from another perspective.


This is not a statement on what SpiderOak as a company thinks about Google. Rather, it is just one of countless examples of how privacy might be in jeopardy without the user’s knowledge.


If you don’t want to read the article above, the gist of it is this: Google, through its Google Docs application, is tracking and saving every keystroke a user makes.

Somebody might argue that this is obviously needed to support Undo and Redo, but those features could just as easily be supported by saving the history of changes in the following way:

  • ‘a’ added in position (10,24).
  • ‘m’ deleted in position (11,24).
  • ‘ ‘ added in position (12,24).
  • ‘h’ added in position (13,24).

But in the case of Google, it looks more like this:

  • George added ‘a’ in position (10,24) at 12:43:02 pm, Friday, September 15.
  • George deleted ‘m’ in position (11,24) at 12:43:04 pm, Friday, September 15.
  • Jenny added ‘ ‘ in position (12,24) at 12:55:34 pm, Friday, September 15.
  • George added ‘h’ in position (13,24) at 12:56:09 pm, Friday, September 15.
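To make the difference between the two lists concrete, here is a minimal sketch in Python. The `Edit` record and the `invert` and `strip_identity` helpers are hypothetical names of my own, not Google's actual storage format. The point is only that undoing a change needs the operation, the character, and the position, and nothing else.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Edit:
    """One change to the document (hypothetical record, for illustration only)."""
    op: str                       # "add" or "delete"
    char: str                     # the character that was added or deleted
    position: Tuple[int, int]     # coordinates as written in the lists above
    author: Optional[str] = None  # only present in the second style of log
    clock: Optional[str] = None   # time of day, only in the second style of log


def invert(edit: Edit) -> Edit:
    """Return the edit that undoes `edit`. Author and time are never consulted."""
    opposite = "delete" if edit.op == "add" else "add"
    return Edit(opposite, edit.char, edit.position)


def strip_identity(edit: Edit) -> Edit:
    """Drop who and when, keeping only what Undo and Redo need."""
    return Edit(edit.op, edit.char, edit.position)


anonymous = Edit("add", "a", (10, 24))
attributed = Edit("add", "a", (10, 24), author="George", clock="12:43:02")

# Both styles of record produce exactly the same undo operation.
print(invert(anonymous) == invert(attributed))
```

Collaboration features may well have legitimate uses for the extra fields. The sketch only shows that Undo and Redo by themselves do not require them.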

This might look harmless, since it supports collaboration, but here’s how I see it: Google has all the data and the resources to understand and detect how you write, regardless of what you write.
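As a rough illustration of what "how you write" can mean, the sketch below takes the four attributed entries from the example log and computes the pauses between consecutive keystrokes for each author. This is a toy feature of my own choosing, not something I know Google to compute, and real stylometric analysis would rely on far more data and far richer signals. The tuple layout and the `gaps_by_author` name are assumptions made for the example.

```python
from collections import defaultdict
from datetime import datetime

# The attributed log from the example above: (author, operation, char, time of day).
LOG = [
    ("George", "add", "a", "12:43:02"),
    ("George", "delete", "m", "12:43:04"),
    ("Jenny", "add", " ", "12:55:34"),
    ("George", "add", "h", "12:56:09"),
]


def gaps_by_author(log):
    """Seconds between consecutive keystrokes made by the same author."""
    last_seen = {}
    gaps = defaultdict(list)
    for author, _op, _char, clock in log:
        # Only the time of day is parsed; the date part is irrelevant here.
        moment = datetime.strptime(clock, "%H:%M:%S")
        if author in last_seen:
            gaps[author].append((moment - last_seen[author]).total_seconds())
        last_seen[author] = moment
    return dict(gaps)


print(gaps_by_author(LOG))
```

None of this looks at the content of the document. It only needs the author and the timestamp, which is exactly the metadata the anonymous style of history would not contain.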

Did you know that? Have you thought about it like that?

Are they actually doing it? I have no idea, but the point is that they can. And even if Google is the nicest company on earth, it changes over time, like every company. Who knows who will be running it in 10 years? Can we be sure they will not abuse the power that all this data and metadata gives them?

Stylometry is not a new thing, and we’ve already seen what metadata can get us. I wonder what the combination of the two has in store for us in the future.