From http://xkcd.com/936/

I love this cartoon. It makes a great point really simply.

However, it’s a bit, well, misleading…

The cartoon suggests that Tr0ub4dor&3 is massively more susceptible to an attack than correcthorsebatterystaple because of the difference in entropy.

Assuming the math is sound (I calculated using actual password space rather than entropy), I still have a problem with how it might be interpreted.

In the cartoon, Tr0ub4dor&3 loses entropy points because it’s based on a non-random word with substitutions. OK, that’s fair enough. Lots of people do create p4ssw0rds this way, so it seems reasonable to punish this with lower entropy.

In short, this password is punished because the format is predictable.

However, if we punish that password for a predictable format, it’s only fair to say that correcthorsebatterystaple has a predictable format too: it is four dictionary words, which makes it susceptible to a dictionary attack. Conversely, Tr0ub4dor&3 is not in any dictionary, so a plain dictionary attack will never find it.

A pure brute force attack against Tr0ub4dor&3 at 1,000 guesses a second would take 180 billion years.

The same attack against correcthorsebatterystaple, trying lowercase letters only, would take about 7.5 trillion trillion years (7.5 × 10^24).

The difference is almost inconceivably large.
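The sums behind brute force figures like these can be sketched in a few lines of Python. This is only a rough sketch under my own assumptions: the attacker tries all 95 printable ASCII characters for Tr0ub4dor&3 and only the 26 lowercase letters for correcthorsebatterystaple, searches the whole space, and a year has 365 days. The name `brute_force_years` is made up for illustration.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumption: a 365-day year
GUESSES_PER_SECOND = 1000              # the rate used in the text


def brute_force_years(alphabet_size: int, length: int) -> float:
    """Years needed to try every string of the given length."""
    total_guesses = alphabet_size ** length
    return total_guesses / GUESSES_PER_SECOND / SECONDS_PER_YEAR


# Assumption: 95 printable ASCII characters for the mixed password,
# 26 lowercase letters for the four-word password.
troubador_years = brute_force_years(95, len("Tr0ub4dor&3"))
horse_years = brute_force_years(26, len("correcthorsebatterystaple"))

print(f"Tr0ub4dor&3:               {troubador_years:.2e} years")
print(f"correcthorsebatterystaple: {horse_years:.2e} years")
```

Printing in scientific notation keeps the two results easy to compare without having to name the very large numbers.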

However, if we consider a dictionary attack that tries every combination of four words from an 860,000 word dictionary against correcthorsebatterystaple at 1,000 guesses a second, we’re looking at 17,345 billion years.

Suddenly, correcthorsebatterystaple is a lot less strong.

In fact, if you add a single _ to the end of Tr0ub4dor&3 it now takes 17,134 billion years to brute force. That’s very comparable.
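The comparison between the dictionary attack and the padded password can be checked the same way. This is a sketch that assumes four independent words from the 860,000 word dictionary, 95 printable ASCII characters for the brute force case, and a 365-day year. The helper name `years_to_exhaust` is hypothetical.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumption: a 365-day year
GUESSES_PER_SECOND = 1000              # the rate used in the text


def years_to_exhaust(choices_per_slot: int, slots: int) -> float:
    """Years to try every combination of `slots` independent choices."""
    return choices_per_slot ** slots / GUESSES_PER_SECOND / SECONDS_PER_YEAR


# Four words, each one of 860,000 dictionary entries.
dictionary_attack = years_to_exhaust(860_000, 4)
# Tr0ub4dor&3_ is 12 characters, each one of 95 printable ASCII characters.
padded_brute_force = years_to_exhaust(95, len("Tr0ub4dor&3_"))

print(f"dictionary attack:  {dictionary_attack / 1e9:,.0f} billion years")
print(f"Tr0ub4dor&3_ brute: {padded_brute_force / 1e9:,.0f} billion years")
print(f"ratio: {padded_brute_force / dictionary_attack:.2f}")
```

The two results land within a couple of percent of each other, which is the point: one extra character closes the gap.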

To batter the correcthorsebatterystaple example even more, we could alter our dictionary attack and remove all the words that are shorter than 4 characters from the 860,000 total. This would be reasonable, as you would want your four-word password to be at least 16 characters long.
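Here is a rough sketch of that trimming step. I don’t know how many of the 860,000 words are shorter than 4 characters, so the word list below is a tiny made-up stand-in, and the function names are hypothetical.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumption: a 365-day year
GUESSES_PER_SECOND = 1000              # the rate used in the text


def drop_short_words(words: list[str], min_length: int = 4) -> list[str]:
    """Keep only the words with at least `min_length` characters."""
    return [w for w in words if len(w) >= min_length]


def dictionary_attack_years(words: list[str], words_in_password: int = 4) -> float:
    """Years to try every ordered combination of words from the list."""
    return len(words) ** words_in_password / GUESSES_PER_SECOND / SECONDS_PER_YEAR


# Tiny stand-in list; a real attack would load the full dictionary.
sample_words = ["a", "an", "cat", "horse", "staple", "battery", "correct", "to"]
trimmed = drop_short_words(sample_words)

speedup = dictionary_attack_years(sample_words) / dictionary_attack_years(trimmed)
print(f"{len(sample_words)} words trimmed to {len(trimmed)}")
print(f"the attack on the trimmed list is {speedup:.0f}x faster")
```

Because the search space is the word count to the fourth power, even a small trim matters: dropping 10% of the list cuts the time by about a third (0.9^4 ≈ 0.66).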

However, I can’t deny that a pure brute force attack on Tr0ub4dor&3 would take less time than a full dictionary attack on correcthorsebatterystaple, and the latter is much easier to remember.

So it’s still an amazing bit of work – it just helps to understand the details.

So… what?

Well, basically, two things.

1) Using a format that helps you remember your passwords is a good idea, whether you combine 4 random words or use substitutions. The weakness comes when someone PREDICTS the format.

For example, using 4 random dictionary words AND making a single typical substitution of a numeric char for an alpha char (e.g. substituting a 0 for an o) defeats a plain dictionary attack, and the hacker would have to resort to a brute force attack. However, if the hacker KNOWS that you did this, they can simply modify their dictionary attack to try the substituted spellings as well.
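To show how cheap that modification is for the attacker, here is a minimal sketch of a dictionary expanded with substitutions. The substitution table (only o to 0) and the function name `variants` are my own assumptions for illustration. The function produces every spelling with each substitutable letter either kept or swapped.

```python
from itertools import product

# Assumed substitution table for illustration: only o -> 0.
SUBSTITUTIONS = {"o": "0"}


def variants(word: str, substitutions: dict[str, str] = SUBSTITUTIONS) -> set[str]:
    """Every spelling of `word` with each substitutable letter kept or swapped."""
    options = [
        (ch, substitutions[ch]) if ch in substitutions else (ch,)
        for ch in word
    ]
    return {"".join(combo) for combo in product(*options)}


words = ["correct", "horse", "battery", "staple"]
expanded = {v for w in words for v in variants(w)}

print(sorted(expanded))
print(f"{len(words)} words grow to {len(expanded)} candidates")
```

The dictionary gets somewhat bigger, but nowhere near as big as the brute force search space, so a substitution only helps while the attacker doesn’t think to try it.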

2) There is no substitute for length when it comes to passwords.

For example, take a simple 8 char lower case password. Changing one of those 8 chars to a number means an attacker has to try 36 possible characters per position instead of 26, so it takes roughly 13 times longer to brute force. However, adding another lowercase char instead would take 26 times longer.
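The length versus complexity trade-off is easy to check. The exact multiplier for the first case depends on what the attacker is assumed to know. This sketch assumes they simply widen their alphabet from 26 lowercase letters to 36 letters and digits.

```python
base = 26 ** 8         # 8 lowercase letters
with_digits = 36 ** 8  # assumption: attacker now tries letters and digits
one_longer = 26 ** 9   # 9 lowercase letters

wider_alphabet_factor = with_digits / base
extra_char_factor = one_longer / base

print(f"wider alphabet: {wider_alphabet_factor:.1f}x more guesses")
print(f"one more char:  {extra_char_factor:.1f}x more guesses")
```

Under that assumption, the extra character still wins, and every further character multiplies the search space by the full alphabet size again.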

That’s a generous example, but it makes the point. Taking a short but complex password like #9Nj and typing it 4 times gives you a password that is extremely resilient against a brute force attack.
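As a rough sketch of why the repeated version holds up, and of how point 1 still applies to it, compare the search space when the attacker knows only the length with the search space when they know the format. The 95-character alphabet (printable ASCII) is my assumption.

```python
unit = "#9Nj"
password = unit * 4  # "#9Nj#9Nj#9Nj#9Nj", 16 characters

ALPHABET_SIZE = 95         # assumption: printable ASCII
GUESSES_PER_SECOND = 1000  # the rate used in the text

# Attacker knows only the length: every 16-character string is a candidate.
blind_guesses = ALPHABET_SIZE ** len(password)
# Attacker knows the format (4 chars typed 4 times): only the unit matters.
format_known_guesses = ALPHABET_SIZE ** len(unit)

print(f"blind brute force: {blind_guesses:.2e} guesses")
print(f"format known:      {format_known_guesses:,} guesses, "
      f"about {format_known_guesses / GUESSES_PER_SECOND / 3600:.0f} hours")
```

The repetition buys a huge amount of length for almost no memory effort, as long as the repetition itself stays secret.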

Just make sure no-one is looking over your shoulder…