Tag Archives: encryption

No Knowing November

No matter where you consume the news, there is no escaping the revelations continually coming out about PRISM and MUSCULAR and their impact around the globe. At their root, these revelations uncovered a dangerous problem – privacy online is indeed threatened at every level.

Since its inception in 2007, SpiderOak has been focused on preserving our users’ privacy through the implementation of ‘Zero-Knowledge’ technologies – the privacy-first orientation that ensures the server never knows what data it is storing. How is this accomplished? By never storing the encryption keys and therefore never having plaintext access to the data. Ultimately, this is the only way to give ownership and control back to the user and – thus – ensure privacy throughout the process.

Back in January – when everyone was talking about the importance of security – we had the foresight to call 2013 the Year of Privacy. As we have seen, security only solves half of the problem. When a company retains the keys to the data, it also retains the ability to access it. That access can then be used in a number of damaging ways, as was exposed back in June.


Help us make this month NO KNOWING NOVEMBER by sharing this critical message on privacy through ‘No Knowing!’


  • Promote privacy through #NoKnowing
  • Use any of our ‘No Knowing’ images

Securing Your Mail From Site to Site

Many of you know how to secure your email between your mail client and your mail server. But if you run your own mail server, did you know you can also secure email between servers? Many servers support TLS encryption for outgoing connections, which will protect your mail between your server and the next one. For my favorite mail server, Postfix, add this to your main.cf:

smtp_tls_security_level = may

This will enable “opportunistic” TLS for outbound connections, meaning Postfix will use encryption if the remote server supports it and otherwise transmit the mail unencrypted. If you’re really paranoid and don’t want to talk to servers that don’t support encryption, you can change may to encrypt to require TLS, or to verify or secure to additionally check the remote server’s certificate.

To ensure that your server listens for TLS requests, add this:

smtpd_tls_security_level = may
smtpd_tls_cert_file = ...
smtpd_tls_key_file = ...

Note the small difference between smtp_... and smtpd_...: the smtp_ parameters control outgoing connections, while the smtpd_ parameters control incoming ones. The cert and key parameters point to your SSL certificate and its private key. You can also use encrypt here instead of may to force encryption for clients, but this isn’t recommended for a public Internet server.

By default, if Exim is compiled with TLS support, it will attempt TLS for outbound connections. If you want it to accept TLS, though, you’ll have to set:

tls_advertise_hosts = *
tls_certificate = ...
tls_privatekey = ...
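
Once a server is configured to accept TLS, it is worth checking that it really advertises STARTTLS to connecting clients. The following is only a minimal sketch in Python 3 using the standard smtplib module; the function names are my own for illustration and are not part of Postfix or Exim, and the live check assumes you can reach the server on port 25.

```python
import smtplib


def advertises_starttls(ehlo_response: str) -> bool:
    """Return True if a raw EHLO reply lists the STARTTLS extension."""
    for line in ehlo_response.splitlines():
        # Reply lines look like "250-STARTTLS" or "250 STARTTLS":
        # skip the 3-digit code and the separator character.
        keyword = line[4:].strip().split(" ")[0].upper()
        if keyword == "STARTTLS":
            return True
    return False


def server_offers_starttls(host: str, port: int = 25, timeout: float = 10.0) -> bool:
    """Connect to a mail server and report whether it offers STARTTLS."""
    with smtplib.SMTP(host, port, timeout=timeout) as client:
        client.ehlo()
        return client.has_extn("starttls")
```

Calling server_offers_starttls() with your own server's hostname should return True after the changes above. It only tells you that the extension is offered; it does not validate the certificate.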

It’s important to note that even with these configurations, you can’t guarantee that your mail is completely encrypted in transit, since your mail could be transmitted between several servers. It also doesn’t prevent eavesdropping on the servers themselves. If you want to ensure that only the recipient can read your mail, you should use something like PGP.

I’ll leave other mail servers as an exercise for the reader. Feel free to post further configuration or notes in the comments!

Security, Privacy & Encryption 101 Roundup

As you know, privacy and security are not things we take lightly. In our efforts to help educate our fellow humans on their importance and the role they play in our lives on and offline, we’ve compiled the list below of recent news, resources and tips.

[For the past few weeks we've focused on encryption. If you missed them: Just Because It's Encrypted Doesn't Mean It's Private and Encryption 101.]

If you would like to share links or resources we’ve missed, we encourage you to do so below.

May Highlight


News & Information



Interesting Reads



  • Don’t send sensitive information over the Internet before checking a website’s security
  • Pay attention to the URL of a website. Malicious websites may look identical to a legitimate site, but the URL may use a variation in spelling or a different domain (e.g., .com vs. .net)
  • Install and maintain anti-virus software, firewalls, and email filters to reduce suspicious traffic
  • Don’t use passwords that are based on personal information that can be easily accessed or guessed
  • Use both lowercase and capital letters in your passwords
  • Use different passwords on different systems
  • Do business with credible companies
  • Do not use your primary email address in online submissions
  • Devote one credit card to online purchases
  • Encrypting data is a good way to protect sensitive information. It ensures that the data can only be read by the person who is authorized to have access to it
  • Use two-factor authentication if available (coming soon to SpiderOak)
  • Back up all of your data on a regular basis

Just Because It’s Encrypted Doesn’t Mean It’s Private

Now that you’ve got a handle on what encryption is and what it can do, it’s important to understand what it can’t do.

Encryption is a tool, and like any tool, it can be used improperly or ineffectively. It may sound a bit strange for us at SpiderOak to disclaim the benefits of encryption, but I hope to show that while encryption is necessary for privacy, it’s not always sufficient.

One prime example of the utility of encryption is HTTPS. By wrapping encryption around regular HTTP, engineers have created a powerful tool for securing content both delivered to you and provided by you. But HTTPS only protects content while it’s in transit. HTTPS will protect your credit card numbers as they travel over the internet to a merchant, but once they arrive on the other end, they’re no longer encrypted and it’s up to the merchant and credit card providers to protect them. Credit card providers and banks have developed PCI DSS regulations to tightly specify the security of credit card processing, but as the frequency of credit card breaches demonstrates, these regulations aren’t sufficient to guarantee privacy.

Another great cryptographic tool is Full Disk Encryption. Whether built into your computer hardware or provided by software like TrueCrypt, FDE protects the contents of your hard drive by encrypting every last bit. Anyone who steals your hard drive will find it completely unreadable. But while you’re using the drive, it is readable. While you have your computer on and the drive unlocked, any malicious piece of software running on your computer will find all of your data fully readable. FDE is a valuable tool, but it can only guarantee privacy while the disk isn’t in use.

Privacy is a complex problem that requires attention to many details, one of which is encryption. We’ve tried our best to provide you with the best privacy possible for your important data. If you’re interested in more details about how we protect your privacy, please read our Engineering page. And feel free to ask us about it; we’re always willing to brag!

Exploit Information Leaks in Random Numbers from Python, Ruby and PHP

The Mersenne Twister (MT19937) is a pseudorandom number generator used by Python and many other languages such as Ruby and PHP. It is known to pass many statistical randomness tests, but it’s also known not to be cryptographically secure. The Python documentation is clear on this point, describing it as “completely unsuitable for cryptographic purposes.” Here we will show why.

When you are able to predict pseudorandom numbers, you can predict session ids, randomly generated passwords or encryption keys and know all the cards in online poker games, or play “Asteroids” better than legally possible.

Many sources have already shown that it’s easy to rebuild the internal state of the MT from 624 consecutive outputs. But this alone isn’t a practical attack, because it’s unlikely that you have access to the whole output. In this post I’ll demonstrate how to restore its internal state by using only parts of its output. This will allow us to know all previous and future random numbers.

With every 32-bit output the MT directly exposes 32 bits of its internal state (only slightly and reversibly modified by the tempering function). After each round of 624 outputs, the internal state of the Mersenne Twister is itself “twisted”: all bits are XOR’d with several other bits. In fact the Mersenne Twister is just a big XOR machine: all its output can be expressed as a sequence of XORs of the initial state bits.
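
To make the "reversibly modified" part concrete, here is a small sketch of the well-known full-output case: undo the tempering on 624 consecutive 32-bit outputs and load the result into a second generator. It is written for Python 3 and assumes the standard MT19937 tempering constants and CPython's random.getrandbits(32), which returns one raw 32-bit output. The helper names are mine, and this is not the partial-output attack described in this post.

```python
import random


def _undo_right_shift_xor(y, shift):
    """Invert x -> x ^ (x >> shift) on 32-bit values."""
    x = y
    for _ in range(32 // shift + 1):
        x = y ^ (x >> shift)  # each pass fixes another `shift` bits
    return x


def _undo_left_shift_xor(y, shift, mask):
    """Invert x -> x ^ ((x << shift) & mask) on 32-bit values."""
    x = y
    for _ in range(32 // shift + 1):
        x = y ^ ((x << shift) & mask)
    return x


def temper(y):
    """The MT19937 tempering function applied to a state word."""
    y ^= y >> 11
    y ^= (y << 7) & 0x9D2C5680
    y ^= (y << 15) & 0xEFC60000
    y ^= y >> 18
    return y


def untemper(y):
    """Recover the state word from one 32-bit output."""
    y = _undo_right_shift_xor(y, 18)
    y = _undo_left_shift_xor(y, 15, 0xEFC60000)
    y = _undo_left_shift_xor(y, 7, 0x9D2C5680)
    y = _undo_right_shift_xor(y, 11)
    return y


def clone_from_outputs(outputs):
    """Build a generator that continues where the observed one left off."""
    if len(outputs) != 624:
        raise ValueError("need exactly 624 consecutive 32-bit outputs")
    state = tuple(untemper(o) for o in outputs)
    clone = random.Random()
    # Index 624 tells the generator to twist before its next output.
    clone.setstate((3, state + (624,), None))
    return clone
```

Feeding clone_from_outputs() a list of 624 values collected with getrandbits(32) gives a generator whose subsequent outputs match the original's.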

In Python 2, random.randint() is built on random.random(), which combines two 32-bit outputs into a single 53-bit float. So each call of random.randint(0,255) gives you only 8 bits out of two 32-bit Mersenne Twister outputs. Since the tempering function has already mixed the 32 output bits, it is no longer possible to read internal state bits directly out of those 8 bits.

I was curious if it’s hard to recover the internal MT state by using only the output of a function like this:

import random

def random_string(length):
    return "".join(chr(random.randint(0, 255)) for i in xrange(length))

Since the internal state of the Mersenne Twister consists of 19,968 bits (624 × 32), we need at least ~2.5 KB of output to recover it. In fact I needed ~3.3 KB, probably because some of the output information is redundant. A bug in my POC implementation is also possible :)

You can find the result on GitHub.

How does it work?

First I named the initial state with variables s0…s19967. The initial state looks like this:

Internal state bit   Value
0                    s0
1                    s1
...                  ...
19967                s19967

Now the first output of the Mersenne Twister is a combination of the first 32 bits (combined by the tempering function):

Output-Bit   Value
o0           s0 xor s4 xor s7 xor s15
o1           s1 xor s5 xor s16
o2           s2 xor s6 xor s13 xor s17 xor s24
...          ...
o31          s2 xor s9 xor s13 xor s17 xor s28 xor s31

The same goes for the second output:

Output-Bit   Value
o32          s32 xor s36 xor s39 xor s47
...          ...

But we can only observe eight of these bits, because random.randint(0,255) exposes only this portion of the output.

After 624 outputs, the internal state of the Mersenne Twister is “twisted” around. We update our internal state as an xor-combination of our old indices.

Internal state bit   Value
0                    s63 xor s12704
1                    s0 xor s12705
...                  ...
19967                s61 xor s62 xor s5470 xor s5471 xor s18143

The outputs now look more complicated, because the state bits are themselves an xor-combination of the initial state:

Output-Bit   Value
o19968       s35 xor s38 xor s46 xor s63 xor s12704 xor s12708 xor s12711 xor s12719

After 3.3 kb this list contains about 40 variables.

Now we have a big list of output-bits and how each of them is made out of an xor-combination of the original state: a big system of equations that we can solve! This is done just as you learned it at school. Here’s a simple example for 3 bits.

Given this system of equations:

o1 = s0 xor s1 xor s2
o2 = s1 xor s2
o3 = s0 xor s1

First we solve for s0:

o1 = s0 xor s1 xor s2
o2 = s1 xor s2
o1 xor o2 = s0

With this solution it’s easy to find the solution for s1:

o3 = s0 xor s1
o1 xor o2 = s0
o1 xor o2 xor o3 = s1

And finally for s2:

o2 = s1 xor s2
o1 xor o2 xor o3 = s1
o1 xor o3 = s2

Now we know how to recover the 3-bit state out of our 3 output-bits:

s0 = o1 xor o2
s1 = o1 xor o2 xor o3
s2 = o1 xor o3

However, in reality we have about 26,000 equations with 20,000 variables.
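
For systems of that size you would let a program do the elimination. The sketch below is a generic Gauss-Jordan elimination over XOR equations in Python 3, not the code from the POC; the function name and the bitmask representation (bit i of the mask set means s_i appears in the equation) are my own assumptions for illustration.

```python
def solve_gf2(equations, n_vars):
    """Solve XOR equations given as (variable_bitmask, output_bit) pairs.

    Returns the list [s0, s1, ...] or raises ValueError if the system
    has no unique solution.
    """
    rows = [(mask, rhs & 1) for mask, rhs in equations]
    rank = 0
    pivot_cols = []
    for col in range(n_vars):
        bit = 1 << col
        # Find an unused equation that contains this variable.
        pivot = next((r for r in range(rank, len(rows)) if rows[r][0] & bit), None)
        if pivot is None:
            continue  # no equation left that pins down this variable
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        pivot_mask, pivot_rhs = rows[rank]
        # XOR the pivot equation into every other equation using s_col.
        for r in range(len(rows)):
            if r != rank and rows[r][0] & bit:
                rows[r] = (rows[r][0] ^ pivot_mask, rows[r][1] ^ pivot_rhs)
        pivot_cols.append(col)
        rank += 1
    if rank < n_vars:
        raise ValueError("not enough independent equations")
    if any(mask == 0 and rhs for mask, rhs in rows):
        raise ValueError("equations contradict each other")
    solution = [0] * n_vars
    for r, col in enumerate(pivot_cols):
        solution[col] = rows[r][1]
    return solution


# The 3-bit example from above with observed outputs o1=0, o2=1, o3=1:
# o1 = s0 xor s1 xor s2, o2 = s1 xor s2, o3 = s0 xor s1
state = solve_gf2([(0b111, 0), (0b110, 1), (0b011, 1)], 3)
```

Redundant equations are harmless here: they reduce to 0 = 0 and are ignored, which matches having more equations than variables.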

If you want to try it yourself, you can download the result of the solved equations together with a test program on GitHub.

Further notes

Since the Mersenne Twister is highly symmetric, it’s probably possible to find some shortcuts or a fully mathematical solution for this problem. However, I implemented the straightforward solution since it’s easy and reusable.

Python seeds the Twister with only 128 bits of “real” randomness. So theoretically it’s enough to know a few output bytes to restore the whole state, but you would need an efficient attack on the seeding algorithm, since 128 bits is too much for a brute-force attack.

However, other implementations use much less randomness to seed their random number generators. PHP seems to use only 32 bits for seeding mt_rand, and Perl also uses only 32 bits (but with another PRNG). In these cases it’s probably easier to use a brute-force attack on the seed.
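
To show what a brute-force attack on a small seed looks like, here is a toy sketch in Python 3. It uses Python's own seeding rather than PHP's or Perl's, and it assumes a deliberately tiny 16-bit seed space so that it finishes in about a second; the function names are made up for this example. A 32-bit search is the same loop, just much longer.

```python
import random


def first_bytes(seed, count=8):
    """Return the first `count` random bytes produced for a given seed."""
    rng = random.Random(seed)
    return [rng.randint(0, 255) for _ in range(count)]


def brute_force_seed(observed, seed_bits=16):
    """Try every seed until one reproduces the observed bytes."""
    for candidate in range(2 ** seed_bits):
        if first_bytes(candidate, len(observed)) == observed:
            return candidate
    return None  # seed was not in the searched range


# The "victim" picks a secret seed; the attacker only sees eight bytes.
secret_seed = 40321
recovered = brute_force_seed(first_bytes(secret_seed))
```

Once the seed is recovered, every past and future output of that generator can be reproduced by seeding a fresh generator with it.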

Online Privacy – Strange Bedfellows…

Normally, when people think of ‘online’, privacy is definitely not the first, second, or fiftieth thought that comes to mind. In fact, people generally exhibit quite the opposite response and conjure up images of complete nakedness. After all, the modern-day Internet has evolved mostly for the purpose of providing instant exposure, distribution, and presence to the world over. The question then becomes, can the value of the Internet extend beyond nakedness?

One of the driving purposes behind SpiderOak was to dispel the notion that just because data is online means it can no longer be private. The goal was simple – devise a plan where a user’s files, filenames, file types, folders, and any other personal information are never exposed to anyone for any reason (even under government subpoena). This of course includes the SpiderOak staff who – even with physical access to the servers upon which the data resides – should never be able to see or interact with a user’s plaintext data. Creating this environment, however, would prove more difficult than simply making these statements.

In the beginning, we grappled with how best to accomplish this feat – creating ‘Zero-Knowledge’ privacy as we call it. Most of our competitors and thousands of other companies make claims and statements about security and privacy but, at the end of the day, they would all fall short of achieving our aforementioned goals. To use the most general example – if a company can reset your password, it means someone in the company has access to your encryption keys (if they encrypt the data) which further means they can access your data if they ‘had’ to or, worse yet, someone else could with far worse intentions.

A more specific case is Mozy’s use of encryption. Mozy’s encryption is far better than most online storage providers and yet it contains serious oversights. The default options have you choosing between a stronger ‘Mozy’ key (which Mozy then knows and could use to decrypt your data) or a weaker key you choose on your own and keep private. Even if you choose the weaker private key, Mozy still stores your file and folder names in plain text – meaning they know a list of every file archived from your computer. We would suspect they know the size and timestamp of each file as well although this information has not been publicly disclosed. This seems to represent a great deal of information to reveal about the contents of your ‘private’ data, doesn’t it?

To overcome this threat and others, we at SpiderOak decided to never store a user’s password nor the plaintext of a user’s encryption keys. This ensures that there can never be a point – ever – where we could even unknowingly betray the trust or privacy of a user. Why? Because – to put it simply – we don’t ever come into contact with the keys needed to unlock the encryption surrounding the data. Even with physical access to the server or under subpoena, SpiderOak simply can never see or turn over a user’s plaintext files, filenames, file sizes, file types, etc… On the server, we only see sequentially numbered containers of encrypted data.
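
The general idea of a server that only ever holds wrapped keys can be sketched in a few lines of Python 3. This is a toy illustration under loud assumptions and not a description of SpiderOak's actual scheme: the function names, the record layout, and the iteration count are invented for the example, and the XOR wrap merely stands in for a proper authenticated cipher.

```python
import hashlib
import os


def derive_key(password: str, salt: bytes) -> bytes:
    """Stretch the password into a 32-byte key (runs on the client only)."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000, dklen=32)


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def create_account(password: str):
    """Return the data key (kept on the client) and the server's record."""
    salt = os.urandom(16)
    data_key = os.urandom(32)  # the key that actually encrypts files
    wrapped_key = xor_bytes(data_key, derive_key(password, salt))
    # Only the salt and the wrapped key are ever uploaded.
    return data_key, {"salt": salt, "wrapped_key": wrapped_key}


def unlock(password: str, server_record: dict) -> bytes:
    """Recover the data key on the client from the server's record."""
    return xor_bytes(server_record["wrapped_key"], derive_key(password, server_record["salt"]))
```

In this sketch the server's record is useless without the password, which is also why such a design cannot offer a server-side password reset.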

This necessarily meant a different approach to various processes throughout SpiderOak which you may or may not have noticed – including forced registration through the desktop application and never via the web. In the end, however, we did accomplish our goals and proved that, although strange bedfellows indeed, ‘online’ and ‘privacy’ can sleep next to each other every night, naked, and live happily ever after…