Nov 10 2013
 

Today (at least it is when I’m writing this) is Remembrance Sunday in the UK; traditionally a day to commemorate the sacrifice of ordinary men in the two world wars.

I did not watch the ceremony at The Cenotaph, or attend any of the more local ceremonies, although I have in the past. But one thing that is a noticeable change since my childhood – there is a much greater emphasis on the sacrifices made by our armed forces in all wars up to and including the present.

Fair enough; I don’t have a problem with commemorating the war dead from any war, but the armed forces already have a day – Armed Forces Day – and Remembrance Sunday is special. It is special because it remembers the two world wars when ordinary men were called to service in their droves; whereas other wars involved soldiers, sailors, and airmen who had chosen to be shot at for a living.

Before WWI, there was nothing like Remembrance Sunday despite all the wars that the UK fought before – nothing for the Boer War, the Crimean War, the Napoleonic Wars, and nothing before. There were war memorials constructed – as a resident of Portsmouth, I can visit an unusually large number, but as for national ceremonies … excluding the burial of heroes such as Nelson, they had to wait until after WWI.

Perhaps we need to move the Armed Forces Day to next to Remembrance Sunday to more clearly distinguish between the two days.

Perhaps we also need to make the commemorations somewhat less military in nature – encourage those whose relatives served in the two world wars to attend in place of them. After all, the number of world war veterans is dwindling; it won’t be too long before none of them are left, and it would be a great shame to leave Remembrance Sunday to the politicians and the present-day military.

 

Nov 08 2013
 

One of the odd things about telling people about password security is that you have to learn just a little bit more than what you are saying. This leads to the frustration of not being able to talk about ideas you might have – such as that perhaps xkcd-style passwords (“word1word2word3word4”) are not quite as strong as is made out.

Not that they’re weak of course, and indeed I encourage their use wherever it is inconvenient to use “line noise” style passwords such as “JyP;$u5+Q\hzrU[C”. But how was the strength of the password “correcthorsebatterystaple” calculated? And was that calculation correct?

When we want a quantitative value for the strength of a password, it is traditional to calculate the information entropy of the password for the simple reason that the number of bits of entropy is very quickly turned into the number of password guesses necessary to get the password. Simply calculate 2^(entropy bits) and you have the number of guesses necessary to exhaustively search all possible passwords; of course on average it is only necessary to search through half of the possible passwords to guess the correct password.
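As a rough illustration of turning bits into guesses, here is a minimal Python sketch. The function name `guesses_needed` is my own invention, and the 44 bits is simply the figure quoted from the cartoon.

```python
def guesses_needed(entropy_bits):
    """Return (worst-case, average) guess counts for a given number of entropy bits."""
    worst_case = 2 ** entropy_bits   # exhaustive search of every possible password
    average = worst_case // 2        # on average only half the search is needed
    return worst_case, average


worst, average = guesses_needed(44)
print(f"44 bits: up to {worst:,} guesses, {average:,} on average")
```

Every extra bit doubles the work for the attacker, which is why a handful of bits either way matters more than it might seem.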

To calculate the information entropy of a password is a slightly tricky calculation: entropy = log2((number of possible symbols) ^ (length of password)). Or to be more general: entropy = log2(number of different possible passwords). Of course this only applies to truly random passwords, so is strictly speaking the maximum possible entropy. The “log2” is the base 2 logarithm which doesn’t usually appear on a calculator, but can be calculated as log2(X) = log(X)/log(2).
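For those who would rather not reach for a calculator, the formula might be sketched in Python like this. The name `entropy_bits` is hypothetical, and the example assumes the 16-character “line noise” password from earlier is drawn from the 96 printable ASCII symbols.

```python
import math


def entropy_bits(symbols, length):
    """Maximum entropy in bits of a random password: log2(symbols ^ length)."""
    # log2(symbols ^ length) is the same as length * log2(symbols),
    # which avoids building a huge intermediate number.
    return length * math.log2(symbols)


def log2_by_hand(x):
    """The calculator trick: log2(X) = log(X) / log(2)."""
    return math.log(x) / math.log(2)


# A 16-character password chosen from 96 symbols (an assumption about the alphabet).
print(round(entropy_bits(96, 16), 2))
```

That comes out at roughly 105 bits, but only if every character really was chosen at random.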

If we calculate the information entropy of “correcthorsebatterystaple”, treating each of its 25 letters as a symbol, we get :-

$ calc
(suppressed)
; ln(26 ^ (5 + 7 + 6 + 7))/ln(2)
	~117.51099295352730400945

Which is far in excess of the 44 bits calculated in the cartoon.

But is it correct? If you were to translate the “correcthorsebatterystaple” password into Chinese you would get “正馬電池 訂” (don’t blame me for the poor translation) which isn’t quite what I was hoping for, as it is five “symbols” instead of the four I wanted … one for each of the four words. If we think of the password as being four symbols long, we have a somewhat different calculation where we raise “something” to the power of 4.

Now if we count all the words in the file /usr/share/dict/words as symbols (and add 96 for the ASCII character set), we end up with a total of 99267 symbols which is greater by far than the minimal 26 symbols that we started with. But the calculation comes out differently :-

$ calc
(suppressed)
; ln(99267 ^ 4)/ln(2)
	~66.39610628854963984846

Which is a lot less … we’ve gone from verging on overkill in terms of strength towards the lower end of “strong”. But that isn’t all. If we remove all words longer than 7 characters (and any containing an apostrophe) from /usr/share/dict/words, and add the 96 ASCII symbols back in, we get a total of 31321 which gives a somewhat different result :-

$ egrep -e "^.{1,7}$" /usr/share/dict/words | grep -v "'"  | wc -l
31225
$ calc
(suppressed)
; 31225 + 96
	31321
; ln(31321^4)/ln(2)
	~59.73937061801244336267

Which is now dropping to the “reasonable” category. But that still has way more words in it than are likely to be used in passwords. If we restrict it to the top 5000 most common words in episodes of The Simpsons (it happened to be a good list easily obtainable) then we go down yet again :-

; ln(5096^4)/ln(2)
	~49.26059824902520150092

Which is now in the mid-range of the “reasonable” category. We are still above the xkcd calculated value of 44 bits of entropy (they may have used NIST 800-63 to calculate the entropy). And way higher than the amount of entropy in the typical weak password … which is a simple word, or perhaps a word with a symbol added to the end. The amount of entropy in that sort of password is around 18 bits (when we treat a word as a symbol).
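The word-as-symbol calculations above can be reproduced with a short Python sketch. The helper names are my own, the dictionary sizes are simply the figures quoted above (each already including the 96 ASCII symbols), and `usable_words` is only an approximation of the egrep/grep pipeline.

```python
import math


def usable_words(lines, max_len=7):
    """Roughly mimic: egrep '^.{1,7}$' | grep -v "'" on an iterable of lines."""
    words = (line.rstrip("\n") for line in lines)
    return [w for w in words if 1 <= len(w) <= max_len and "'" not in w]


def word_entropy_bits(dictionary_size, words=4):
    """Entropy in bits when each word is one symbol drawn from the dictionary."""
    return words * math.log2(dictionary_size)


# Dictionary sizes taken from the text above.
for label, size in [
    ("all of /usr/share/dict/words", 99267),
    ("short words only", 31321),
    ("top 5000 Simpsons words", 5096),
]:
    print(f"{label:30s} {word_entropy_bits(size):6.2f} bits")
```

To count words on your own system you could pass an open file to `usable_words`, although the totals will vary with the word list installed.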

As a result we can conclude that :-

  1. The xkcd method results in much stronger passwords than the typical password (49 > 18).
  2. The xkcd method is much weaker than a truly random password of the same length (49 < 164). The 164 comes from calculating the entropy of a random 25-character password with a choice of 96 possible symbols.
  3. Nobody could argue that the xkcd method isn’t a lot easier to remember than a truly random password.
  4. The xkcd method can become very weak if an attacker can predict which dictionary of words is used. For instance if passwords are generated from a very restricted set of words (say 25 words), then the entropy drops to just over 18 bits which is in the insanely weak category.
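To get a feel for how weak the last case is, here is a toy Python sketch that exhaustively searches every four-word password built from a 25-word dictionary. The word list and the `crack` function are made up for illustration, and this is not a real password cracking tool.

```python
import itertools
import math

# A hypothetical, very restricted dictionary of exactly 25 words.
WORDS = [
    "correct", "horse", "battery", "staple", "apple",
    "river", "stone", "cloud", "green", "table",
    "chair", "light", "water", "paper", "music",
    "house", "train", "bread", "night", "clock",
    "tiger", "lemon", "ocean", "field", "smile",
]


def crack(target, words, count=4):
    """Try every combination of `count` words; return how many guesses it took."""
    combos = itertools.product(words, repeat=count)
    for guesses, combo in enumerate(combos, start=1):
        if "".join(combo) == target:
            return guesses
    return None  # the password was not built from this dictionary


total = len(WORDS) ** 4
print(f"{total} possible passwords, {math.log2(total):.2f} bits of entropy")
print("guesses needed:", crack("correcthorsebatterystaple", WORDS))
```

Even the worst case here is only a few hundred thousand guesses, which an ordinary laptop gets through in well under a second.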

At present (as far as we know), there aren’t any tools out there to attack xkcd-style passwords but there soon will be.

Nov 07 2013
 

I am not aware of how widely it has been publicised, but it was certainly a surprise to me … and a few others. After upgrading to OSX 10.9 (Mavericks), and when going into the “Network Settings”, a new network device appeared. Specifically something called the “Thunderbolt Bridge”.

It turns out that this is a new feature of OSX where you can connect two Macs (or potentially other kinds of system) together with a thunderbolt cable, and run IP over that connection. Which is fast compared with normal ethernet, although it is comparable with 10GE and slower than 40GE, and 100GE (and of course slower than modern InfiniBand).

But it probably isn’t as usable as 10GE at present for the following reasons :-

  1. There’s no networking hardware support for thunderbolt networks at present.
  2. The cables are different, so offices would have to be rewired for it. Which would be very expensive.
  3. It turns out that thunderbolt takes a lot of processor power to run networking at that speed.

That’s not to say it won’t make a fine way to connect two Macs together for data transfers. With any luck (unfortunately I can’t test it), it should be a case of just plugging the two Macs together and using normal sharing mechanisms to do a transfer. Both sides should automatically configure an IP address, so normal networking services should be able to see the other Mac across the “Thunderbolt Bridge”.

The only problem with this new feature is the name – “Thunderbolt Bridge” – which might be user-friendly, but is likely to make networking people flinch. User configured network bridges have been known to cause problems!

For more details have a look here.

Nov 02 2013
 

The DNS (the domain name system) is one of those Internet services that everybody uses; but most don’t even know it exists. That is partially a good thing – it is supposed to be invisible in the sense that it just works rather than causing problems. But everything – Internet Explorer, Firefox, Chrome, and anything that uses the network – uses the DNS.

But What Does It Do?

The DNS in a very simple sense is the way that applications such as Chrome (or any web browser) find out what network address a name points to. When we visit a web page such as http://www.bbc.co.uk/, the web browser needs to know what network address to make a connection to. So the web browser asks the DNS “what network address does www.bbc.co.uk point to” and the DNS answers “212.58.246.92 and 212.58.246.93” (as of the time of writing). The DNS does quite a bit more than that – even ignoring the details of how the servers operate – as it can answer other kinds of questions than just what the network address of a name is. But the process works pretty much the same way whatever kind of question is asked, so we’ll concentrate on the name questions. Technically the name www.bbc.co.uk is a fully-qualified domain name, and the network address is either an IPv4 address or an IPv6 address which can be seen if we perform a lookup on www.google.co.uk instead of www.bbc.co.uk (as the BBC doesn’t have an IPv6 address as yet) :-

# host www.google.co.uk 
www.google.co.uk has address 74.125.132.94
www.google.co.uk has IPv6 address 2a00:1450:400c:c06::5e

That’s a command-line way of performing a DNS lookup, which is rather irrelevant to this discussion except that it shows just the DNS answer.
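An application would normally do the same thing through the operating system rather than the host command. A minimal Python sketch using the standard `socket.getaddrinfo` call might look like the following; the `lookup` function is my own wrapper, and the system resolver may also consult sources other than the DNS.

```python
import socket


def lookup(name):
    """Return the IPv4 and IPv6 addresses the system resolver finds for a name."""
    addresses = {"IPv4": [], "IPv6": []}
    for family, _type, _proto, _canonname, sockaddr in socket.getaddrinfo(name, None):
        if family == socket.AF_INET:
            key = "IPv4"
        elif family == socket.AF_INET6:
            key = "IPv6"
        else:
            continue  # ignore any other address families
        # The same address is usually reported once per socket type, so de-duplicate.
        if sockaddr[0] not in addresses[key]:
            addresses[key].append(sockaddr[0])
    return addresses
```

Calling `lookup("www.google.co.uk")` on a networked machine should return both kinds of address, although the actual addresses will differ from those shown above.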

So How Does It Work?

When you perform a DNS lookup (or more usually an application performs a DNS lookup on your behalf), it makes use of a piece of software on your computer called the resolver. This is more complex than is described here, and can use mechanisms other than the DNS. But ignoring all of that, the resolver composes a question in terms that a DNS server would understand. It then sends the question to all of the DNS servers it knows about.
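To give an idea of what “a question in terms that a DNS server would understand” looks like, here is a rough Python sketch that builds the bytes of a simple query following the message layout in RFC 1035. The function name and the fixed query ID are arbitrary choices of mine, and a real resolver does rather more than this (random IDs, EDNS options and so on).

```python
import struct


def build_query(name, query_id=0x1234, qtype=1):
    """Build a minimal DNS query; qtype 1 asks for an IPv4 (A) record, 28 for IPv6 (AAAA)."""
    # Header: ID, flags (0x0100 = recursion desired), one question,
    # and zero answer, authority, and additional records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # The name is sent as length-prefixed labels ending with a zero-length label.
    labels = b"".join(
        bytes([len(part)]) + part.encode("ascii")
        for part in name.rstrip(".").split(".")
    )
    # The question ends with the record type and the class (1 = Internet).
    question = labels + b"\x00" + struct.pack("!HH", qtype, 1)
    return header + question


packet = build_query("www.bbc.co.uk")
print(len(packet), packet.hex())
```

The resulting bytes are what would be sent (usually over UDP to port 53) to each DNS server.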

Hopefully one or more of those DNS servers will answer the question, and the application can get on with whatever it is doing.

If an answer is not returned, the question is sent again, and this carries on until the resolver decides that enough is enough and returns an error to the application. Which of course results in an unexpected error such as a web browser saying that Google doesn’t exist!
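The retry behaviour can be sketched as a small Python simulation. Nothing here talks to the network: each “server” is a hypothetical function that returns an answer, or None to stand in for a timeout, and for simplicity the servers are asked in turn rather than all at once. The limit of three attempts is an arbitrary assumption.

```python
def resolve(question, servers, max_attempts=3):
    """Ask each known server in turn; retry until answered or out of attempts."""
    for _attempt in range(max_attempts):
        for ask in servers:
            answer = ask(question)
            if answer is not None:
                return answer
    # Enough is enough: report the failure back to the application.
    raise LookupError(f"no answer for {question!r} after {max_attempts} attempts")


def make_flaky_server(answer, fail_first=1):
    """Build a pretend server that times out `fail_first` times before answering."""
    state = {"calls": 0}

    def ask(_question):
        state["calls"] += 1
        return answer if state["calls"] > fail_first else None

    return ask


print(resolve("www.bbc.co.uk", [make_flaky_server("212.58.246.92")]))
```

If every server stays silent, the LookupError is what eventually turns into the browser claiming that the site doesn’t exist.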

There’s a fair bit more to it than this of course – particularly how the DNS servers find out the answer to your question, but this is enough for now.