More Zonkyness

How Strong Are XKCD-Style Passwords? And Other Thoughts

Nov 082013

One of the odd things about telling people about password security is that you have to learn just a little bit more than what you are saying. This leads to the frustration of not being able to talk about ideas you might have – such as that perhaps xkcd-style passwords (“word1word2word3word4”) are not quite as strong as is made out.

Not that they’re weak of course, and indeed I encourage their use wherever it is inconvenient to use “line noise” style passwords such as “JyP;$u5+Q\hzrU[C”. But how was the strength of the password “correcthorsebatterystaple” calculated? And was that calculation correct?

When we want a quantitative value for the strength of a password, it is traditional to calculate the information entropy of the password for the simple reason that the number of bits of entropy is very quickly turned into the number of password guesses necessary to get the password. Simply calculate 2^(entropy bits) and you have the number of guesses necessary to exhaustively search all possible passwords; of course on average it is only necessary to search through half of the possible passwords to guess the correct password.

To calculate the information entropy of a password is a slightly tricky calculation: entropy = log2 (number of possible symbols) ^ (length of password). Or to be more general: entropy = log2 * (number of different possible passwords). Of course this only applies to truly random passwords, so is strictly speaking the maximum possible entropy. The “log2” is base 2 logarithm which doesn’t usually appear on a calculator, but can be calculated as logx(X) = log(X)/log(2).

If we calculate the information entropy of the “correcthorsebatterystaple” we get :-

$ calc
(suppressed)
; ln(26 ^ (5 + 7 + 6 + 7))/ln(2)
	~117.51099295352730400945

Which is far in excess of the 44 bits calculated in the cartoon.

But is it correct? If you were to translate the “correcthorsebatterystaple” password into Chinese you would get “正馬電池訂” (don’t blame me for the poor translation) which isn’t quite what I was hoping for, as it is five “symbols” instead of four … because we have four words. If we think of the password as being four symbols long, we have a somewhat different calculation where we raise “something” to the power of 4.

Now if we count all the words in the file /usr/share/dict/words as symbols (and add 96 for the ASCII character set), we end up with a total of 99267 symbols which is greater by far than the minimal 26 symbols that we started with. But the calculation comes out differently :-

$ calc
(suppressed)
; ln(99267 ^ 4)/ln(2)
	~66.39610628854963984846

Which is a lot less … we’ve gone from the verging on overkill on terms of strength towards the lower end of “strong”. But that isn’t all. If we remove all words longer than 8 characters from /usr/share/dict/words we get a total of 31321 which gives a somewhat different result :-

$ egrep -e "^.{1,7}$" /usr/share/dict/words | grep -v "'"  | wc -l
31225
$ calc
(suppressed)
; 31225 + 96
	31321
; ln(31321^4)/ln(2)
	~59.73937061801244336267

Which is now dropping to the “reasonable” category. But that still has way more words in it than are likely to be used in passwords. If we restrict it to the top 5000 most common words in episodes of The Simpsons (it happened to be a good list easily obtainable) then we go down yet again :-

; ln(5096^4)/ln(2)
	~49.26059824902520150092

Which is now in the mid-range of the “reasonable” category. We are still above the xkcd calculated value of 44 bits of entropy (they may have used NIST 800-63 to calculate the entropy). And way higher than the amount of entropy in the typical weak password … which is a simple word, or perhaps a word with a symbol added to the end. The amount of entropy in that sort of password is around 18 bits (when we treat a word as a symbol).

As a result we can conclude that :-

The xkcd method results in much stronger passwords than the typical password (49 > 18).
The xkcd method is much weaker than a truly random password (49 < 164). The 164 comes from calculating the entropy of a random password with a choice of 96 possible symbols.
Nobody could argue that the xkcd method is a lot easier to remember than a truly random password.
The xkcd method can become very weak if an attacker can predict what dictionary of words are used. For instance if passwords are generated from a very restricted set of words (say 25 words), then the entropy drops to just over 18 bits which is in the insanely weak category.

At present (as far as we know), there aren’t any tools out there to attack xkcd-style passwords but there soon will be.

OSX: The “Thunderbolt Bridge”

Computing No Responses »

Nov 072013

I am not aware of how widely it has been publicised, but it was certainly a surprise to me … and a few others. After upgrading to OSX 10.9 (Mavericks), and when going into the “Network Settings”, a new network device appeared. Specifically something called the “Thunderbolt Bridge”.

It turns out that this is a new feature of OSX where you can connect two Macs (or potentially other kinds of system) together with a thunderbolt cable, and run IP over that connection. Which is fast compared with normal ethernet, although it is comparable with 10GE and slower than 40GE, and 100GE (and of course slower than modern InfiniBand).

But it probably isn’t as usable as 10GE at present for the following reasons :-

There’s no networking hardware support for thunderbolt networks at present.
The cables are different, so offices would have to be rewired for it. Which would be very expensive.
It turns out that thunderbolt takes a lot of processor power to run networking at that speed.

That’s not to say it won’t make a fine way to connect two Macs together for data transfers. With any luck (unfortunately I can’t test it), it should be a case of just plugging the two Macs together and using normal sharing mechanisms to do a transfer. Both sides should automatically configure an IP address, so normal networking services should be able to see the other Mac across the “Thunderbolt Bridge”.

The only problem with this new feature is the name – “Thunderbolt Bridge” – which might be user-friendly, but is likely to make networking people flinch. User configured network bridges have been known to cause problems!

For more details have a look here.

DNS: How Does It Work?

Computing No Responses »

Nov 022013

The DNS (the domain name system) is one of those Internet services that everybody uses; but most don’t even know it exists. That is partially a good thing – it is supposed to be invisible in the sense that it just works rather than causing problems. But everything – Internet Explorer, Firefox, Chrome, and anything that uses the network – uses the DNS.

But What Does It Do?

The DNS in a very simple sense is the way that applications such as Chrome (or any web browser) finds out what network address a name points to. When we visit a web page such as http://www.bbc.co.uk/, the web browser needs to know what network address to make a connection to. So the web browser asks the DNS “what network address does www.bbc.co.uk point to” and the DNS answers “212.58.246.92 and 212.58.246.93” (as of the time of writing). The DNS does quite a bit more than that – even ignoring the details of how the servers operate – as it can answer other kinds of questions than just what the network address of a name is. But the process works pretty much the same way whatever kind of question is asked, so we’ll concentrate on the name questions. Technically the name www.bbc.co.uk is a fully-qualified domain name, and the network address is either an IPv4 address or an IPv6 address which can be seen if we perform a lookup on www.google.co.uk instead of www.bbc.co.uk (as the BBC doesn’t have an IPv6 address as yet) :-

# host www.google.co.uk 
www.google.co.uk has address 74.125.132.94
www.google.co.uk has IPv6 address 2a00:1450:400c:c06::5e

That’s a command-line way of performing a DNS lookup, which is rather irrelevant to this discussion except that it shows just the DNS answer.

So How Does It Work?

When you perform a DNS lookup (or more usually an application performs a DNS lookup on your behalf), it makes use of a piece of software on your computer called the resolver. This is more complex than is described here, and can use mechanisms other than the DNS. But ignoring all of that, the resolver composes a question in terms that a DNS server would understand. It then sends the question to all of the DNS servers it knows about.

Hopefully one or more of those DNS servers will answer the question, and the application can get on with whatever it is doing.

If an answer is not returned, the question is sent again, and this carries on until the resolver decides that enough is enough and returns an error to the application. Which of course results in an unexpected error such as a web browser saying that Google doesn’t exist!

There’s a fair bit more to it than this of course – particularly how the DNS servers find out the answer to your question, but this is enough for now.

It’s Not Human Trafficking; It’s Slave Trading

General, Politics No Responses »

Nov 022013

Why on earth have we got this new name – Human Trafficking – for the very old crime of slavery and slave trading? Is it some kind of attempt at putting a trendy new gloss on it? It’s not a crime that should have a trendy new gloss; even ignoring the fact that it is the kind of crime that shouldn’t be glamorised in any way, there’s a very good legal reason why we should carry on calling it slavery and slave trading.

Back in the 19th century, the British unilaterally declared that slavery and slave trading would be treated the same as piracy and set about (with the assistance of the US) eliminating the African slave trade. Under the principle of jus cogens they set about hanging slavers, confiscating their assets, and freeing slaves claiming that they had a universal right to punish those who took part in the crime of slavery.

In other words, some crimes are so heinous that anyone is allowed to prosecute offenders no matter where or when the offenses took place.

By keeping the old name for the crime, we retain it’s classification as a crime subject to universal jurisdiction. This opens the possibility of setting up a court – such as the ICC – to prosecute slavers wherever in the world they are, and the possibility of empowering law enforcement units to bring slavers to justice wherever they happen to be.

And after all, the fight against slavery isn’t going too well with more slaves today than there has ever been.

And Apologies For Any Disruption …

Computing 2 Responses »

Oct 252013

As you may have guessed from my previous post, my server is currently being upgraded so there may be episodes when it isn’t available.

Hopefully nothing dramatic.

If this distresses you bear in mind that I don’t get enough banner ad clicks to run a resilient service! At least not under the stairs.

Older Entries Newer Entries