Nov 18 2013
 

Today the news comes that Google and Microsoft have agreed to block child abuse images. Great!

Anyone reading (or watching) the news story could be forgiven for thinking that this will solve the problem of child abuse images on the Internet, but that won’t happen. What Microsoft and Google have done is a tiny increment on what they were already doing – instead of just excluding hosts given to them by the Internet Watch Foundation, they are also going to ‘clean up’ the search results for certain searches.

It isn’t blocking child abuse images. The search companies can’t do that; anyone who thinks otherwise needs to go and learn a bit more about the Internet – and that includes the government. Who have of course come out of their rabbit hutch spitting lettuce leaves everywhere, saying that if this action by the search companies isn’t effective they’ll legislate.

Which is just about the clearest evidence so far that the government is completely clueless when it comes to technology; obviously Eton’s reputation is overstated when it comes to technology education.

People tend to think of child abuse images as being a little bit like anything else you browse to on the Internet – you just search for it, and up it pops. I haven’t tried, but I suspect what you would get is a large number of pages like this one – talking about child abuse images in some way, but no real images. Undoubtedly there are some really dumb child pornographers out there who stick up their filth on ordinary web servers; where they’ll quickly get indexed by the search engines and some law enforcement bods will come pounding on the door.

However the biggest area of child abuse image distribution is likely to be one of a variety of ‘stealth’ Internets … the ‘dark nets’, or the ‘deep web’.

The latter are web sites that cannot be indexed by the search engines for various reasons – password protection, links that have never been published, etc. These would be the choice of the not quite so dumb child pornographer.

The former are harder to find – they are roughly analogous to peer-to-peer file sharing networks such as BitTorrent, which is widely used for sharing copyrighted material (films, music, etc.). But ‘friend to friend’ file sharing networks are private rather than public; you need an invitation to join one. This is where the intelligent child pornographer lurks.

And all the hot air we’ve heard from the government so far is going to do pretty much bugger all about the really serious stuff. If you are a clueless politician reading this, get a clue and ask someone with half a brain cell about this stuff. And don’t invent half-arsed measures before asking someone with a clue about whether they’re likely to be effective or not.

Nov 08 2013
 

One of the odd things about telling people about password security is that you have to learn just a little bit more than what you are saying. This leads to the frustration of not being able to talk about ideas you might have – such as that perhaps xkcd-style passwords (“word1word2word3word4”) are not quite as strong as is made out.

Not that they’re weak of course, and indeed I encourage their use wherever it is inconvenient to use “line noise” style passwords such as “JyP;$u5+Q\hzrU[C”. But how was the strength of the password “correcthorsebatterystaple” calculated? And was that calculation correct?

When we want a quantitative value for the strength of a password, it is traditional to calculate the information entropy of the password for the simple reason that the number of bits of entropy is very quickly turned into the number of password guesses necessary to get the password. Simply calculate 2^(entropy bits) and you have the number of guesses necessary to exhaustively search all possible passwords; of course on average it is only necessary to search through half of the possible passwords to guess the correct password.

To calculate the information entropy of a password is a slightly tricky calculation: entropy = log2((number of possible symbols) ^ (length of password)). Or to be more general: entropy = log2(number of different possible passwords). Of course this only applies to truly random passwords, so it is strictly speaking the maximum possible entropy. The “log2” is the base 2 logarithm, which doesn’t usually appear on a calculator, but can be calculated as log2(X) = log(X)/log(2).
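The formula can be sketched in a few lines of Python (the function name here is mine, not anything standard):

```python
from math import log2

def password_entropy_bits(symbol_count: int, length: int) -> float:
    """Maximum entropy of a truly random password:
    log2(symbol_count ** length) == length * log2(symbol_count)."""
    return length * log2(symbol_count)

# A 25-character password drawn from the 26 lowercase letters:
bits = password_entropy_bits(26, 25)
guesses = 2 ** bits  # size of an exhaustive search; halve it for the average case
print(round(bits, 2))  # ~117.51
```

Note that multiplying the length by log2 of the symbol count avoids computing the (enormous) number of possible passwords directly.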

If we calculate the information entropy of the password “correcthorsebatterystaple” we get :-

$ calc
(suppressed)
; ln(26 ^ (5 + 7 + 6 + 7))/ln(2)
	~117.51099295352730400945

Which is far in excess of the 44 bits calculated in the cartoon.

But is it correct? If you were to translate the “correcthorsebatterystaple” password into Chinese you would get “正馬電池 訂” (don’t blame me for the poor translation) which isn’t quite what I was hoping for, as it is five “symbols” instead of four … because we have four words. If we think of the password as being four symbols long, we have a somewhat different calculation where we raise “something” to the power of 4.

Now if we count all the words in the file /usr/share/dict/words as symbols (and add 96 for the ASCII character set), we end up with a total of 99267 symbols which is greater by far than the minimal 26 symbols that we started with. But the calculation comes out differently :-

$ calc
(suppressed)
; ln(99267 ^ 4)/ln(2)
	~66.39610628854963984846

Which is a lot less … we’ve gone from verging on overkill in terms of strength towards the lower end of “strong”. But that isn’t all. If we remove all words longer than 7 characters from /usr/share/dict/words we get a total of 31321, which gives a somewhat different result :-

$ egrep -e "^.{1,7}$" /usr/share/dict/words | grep -v "'"  | wc -l
31225
$ calc
(suppressed)
; 31225 + 96
	31321
; ln(31321^4)/ln(2)
	~59.73937061801244336267
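The word-as-symbol calculations above can be scripted the same way; a sketch using the word counts from my calc sessions (the counts come from my copy of /usr/share/dict/words and will differ between systems):

```python
from math import log2

def wordlist_entropy_bits(word_count: int, words: int = 4,
                          extra_symbols: int = 96) -> float:
    """Entropy when each whole word counts as one symbol, with the 96
    printable ASCII characters counted as additional symbols."""
    return words * log2(word_count + extra_symbols)

print(round(wordlist_entropy_bits(99171), 2))  # full word list -> ~66.4
print(round(wordlist_entropy_bits(31225), 2))  # words of 7 chars or fewer -> ~59.74
```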

Which is now dropping to the “reasonable” category. But that still has way more words in it than are likely to be used in passwords. If we restrict it to the top 5000 most common words in episodes of The Simpsons (it happened to be a good list easily obtainable) then we go down yet again :-

; ln(5096^4)/ln(2)
	~49.26059824902520150092

Which is now in the mid-range of the “reasonable” category. We are still above the xkcd calculated value of 44 bits of entropy (they may have used NIST 800-63 to calculate the entropy). And way higher than the amount of entropy in the typical weak password … which is a simple word, or perhaps a word with a symbol added to the end. The amount of entropy in that sort of password is around 18 bits (when we treat a word as a symbol).

As a result we can conclude that :-

  1. The xkcd method results in much stronger passwords than the typical password (49 > 18).
  2. The xkcd method is much weaker than a truly random password of the same length (49 < 164). The 164 comes from calculating the entropy of a 25-character random password with a choice of 96 possible symbols.
  3. Nobody could argue that the xkcd method is a lot easier to remember than a truly random password.
  4. The xkcd method can become very weak if an attacker can predict what dictionary of words are used. For instance if passwords are generated from a very restricted set of words (say 25 words), then the entropy drops to just over 18 bits which is in the insanely weak category.
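That last point is easy to verify with the same entropy formula; four words drawn uniformly from a known 25-word dictionary give:

```python
from math import log2

# Four words chosen uniformly from a dictionary of just 25 words:
print(round(4 * log2(25), 2))  # ~18.58 bits -- about the same as a single common word
```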

At present (as far as we know), there aren’t any tools out there to attack xkcd-style passwords but there soon will be.

Sep 29 2013
 

As some people know, the Linux device for generating random numbers (/dev/random) blocks when there isn’t sufficient entropy to safely generate random numbers. But people still persist in advising the use of /dev/urandom as an alternative :-

“To sum up, under both Linux and FreeBSD, you should use /dev/urandom, not /dev/random.”

“Just go ahead and use /dev/urandom as is”

“Oracle wants us to move /dev/random and link /dev/urandom”

“You can remove /dev/random and link that to /dev/urandom to help prevent blocking”

Now it is true that /dev/urandom is usually good enough, but to advise people to use /dev/urandom without considering whether it is sufficient or not is irresponsible. True random numbers can be very important for cryptography, and without knowing it, we use cryptography every day; such as when we browse the web, make ssh connections, check PGP keys, etc. Using a weak random number generator can weaken the cryptographic process fatally.
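A minimal Python sketch of the distinction (os.urandom draws from the kernel CSPRNG, i.e. the /dev/urandom pool; the /proc path is Linux-specific):

```python
import os

# os.urandom() reads the kernel CSPRNG (equivalent to /dev/urandom);
# it never blocks, which is why it is so often recommended.
key = os.urandom(16)
assert len(key) == 16

# /dev/random, by contrast, may block when the kernel's entropy
# estimate runs low; the current estimate (in bits) is visible here:
try:
    with open("/proc/sys/kernel/random/entropy_avail") as f:
        print("entropy estimate:", f.read().strip(), "bits")
except FileNotFoundError:
    pass  # not Linux
```

For long-lived cryptographic keys the cautious choice is still the blocking device (or at least waiting until the pool has been properly seeded), which is exactly the consideration the one-line advice above skips over.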

Jul 23 2013
 

Sign me up for the perv’s list … I won’t trust a politician to come up with a sensible method of censorship, and neither should you.

Ignoring the civil liberties thing: politicians with a censorship weapon will tend to overuse it, to the eventual detriment of legitimate debate.

How is Cameron’s censorship thing supposed to work? It appears nobody has a clear idea. Probably not even Cameron himself.

It seems to be two separate measures :-

  1. Completely block “extreme” porn: child abuse images, and “rape porn”. Oddly enough, he also claimed that “50 Shades of Grey” would not be banned, although there are those who categorise it as rape porn. Interestingly this is nothing new, as child abuse images have been blocked – ineffectively – for years.
  2. An “optional” mechanism for blocking some other mysterious category of porn – the “family filter” mechanism.

Now it all sounds quite reasonable, but firstly let’s take a look at the first measure. Blocking child abuse images sounds like a great idea … and indeed it is something that is already done by the Internet Watch Foundation. Whilst their work is undoubtedly valuable – at the very least it prevents accidental exposure to child abuse images – it probably doesn’t stop anyone who is serious about obtaining access to such porn. There are just too many ways around even a country-wide block.

Onto the second measure.

This means that anyone with an Internet connection has to decide when signing up whether they want to be “family friendly” or if they want to be added to the government’s list of perverts … or possibly the ISP’s list of perverts. Of course, how quickly do you think that list will be extracted and leaked? I’m sure the gutter press is salivating at the thought of getting hold of those lists to see what famous people opt to get all the porn; the same gutter press that won’t be blocked despite publishing pictures that some might say meet the criteria for being classified as porn (see Page 3).

And who decides what gets onto the “naughty list” of stuff that you have to sign up as a perv to see? What is the betting that there will be lots of mistakes?

As we already block access by default to “adult sites” on mobile networks, I have already encountered this problem. Not in the way you might imagine: whilst away on a course I used an “app” to locate hostelries around my location. On clicking the link to take me to a local pub’s web site to see a few more details, I was blocked. The interesting thing here is that the app had no problem telling me where the pub was, but the pub’s web site was blocked. Why the double standard?

And there are plenty of other examples of misclassification such as Facebook’s long running problem with blocking access to breast feeding information, hospitals having to remove censorship products so that surgeons could get to breast cancer information sites, etc. I happen to work in a field where sales critters are desperate to sell censorship products, and I’m aware that many places that do install such products have the endless fun of re-classifying sites.

And finally, given this is all for the sake of the children, who thinks that children will not come up with ways to get around the “family filter” anyway? It is almost impossible to completely censor Internet access without extreme measures such as pulling the entire country off the Internet – even China with its Great Firewall is unable to completely censor Internet activity. Solutions such as proxies, VPN access, and Tor all make censorship impossible to make totally effective. If you are thinking that this is all too technical for children, you are sorely mistaken … it only takes a few children figuring this stuff out, as they will distribute their knowledge.

This is not to say that a censorship mechanism that you control is not a sensible idea. You can select what to censor – prevent the children getting access to information about the Flying Spaghetti Monster, but block access to other religious sites, etc. And such a product has to be network-wide, to prevent someone plugging in an uncensored device; such as using the OpenDNS FamilyShield (although I have never used it, I believe it to be a good product from independent reports). Of course even DNS blocking can be worked around, but doing so takes a reasonable amount of effort.

Jun 08 2013
 

Which is news how exactly? Spying on us is what the NSA and GCHQ are for.

Over the last day or two, we have been hearing more and more of the activities of the NSA (here) and GCHQ (here) spying on “us” (for variable definitions of that word). Specifically on a programme called PRISM which monitors Internet traffic between the US and foreign nations, but not on communications internal to the US.

Various Internet companies have denied being involved, but :-

  1. They would have to deny involvement as any arrangement between the NSA and the company is likely to be covered by heavyweight laws regarding the disclosure of information about it.
  2. It’s also worth noting that they have asked the company executives whether they are involved in PRISM, but not asked every engineer within the company; it is doubtful in the extreme that any company executive knows everything that happens within their company. And an engineer asked to plumb in a data tap under the banner of national security is not likely to talk about it to the company executive; after all the law trumps company policy.
  3. The list of companies that have been asked, and have issued denials, is a list of what the general public think of as the Internet; but in fact none of the companies are tier-1 NSPs, and whilst lots of interesting data could be obtained from Google, any mass surveillance programme would start with the big NSPs.

What seems to have been missed is the impact of agreements such as the UKUSA agreement on signals intelligence; the NSA is “hamstrung” (in their eyes) by being forbidden by law from spying on US domestic signals, but they are not forbidden to look at signals intelligence provided by GCHQ, and vice versa. Which gives both agencies “plausible deniability” in that they can legitimately claim that they are not spying on people from their own country whilst neglecting to mention that they make use of intelligence gathered by their opposite number.

There is some puzzlement that PRISM’s annual cost is just $20 million a year; there is really a rather obvious reason for this … and it also explains why none of the tier-1 NSPs have been mentioned so far either. Perhaps PRISM is an extension of an even more secret surveillance operation. They built (and maintain) the costly infrastructure for surveillance targeting the tier-1 NSPs and extended it with PRISM. In particular, the growing use of encryption means that surveillance at the tier-1 NSPs would be getting less and less useful (although traffic analysis can tell you a lot) making the “need” for PRISM a whole lot more necessary.

As it turns out there is evidence for this hypothesis.

But Are They Doing Anything Wrong?

Undoubtedly, both the NSA and GCHQ will claim what they are doing is within the law, and in the interests of national security. They may well be right. But unless we know exactly what they are doing, it is impossible to judge if their activities are within the law or not. And just because something is legal does not necessarily make it right.

Most people would probably agree that a mass surveillance programme may be justified if the aim is to prevent terrorism, but we don’t know that their aims are limited to that. The surveillance is probably restricted to subjects of “national interest”, but who determines what is in the national interest? Just because we think it is just about terrorism, war, and espionage doesn’t mean it is so. What is to stop the political masters of the NSA or GCHQ from declaring that it is in the national interest to spy on those involved with protests against the government, or those who vote against the government, or those who talk about taxation (i.e. tax avoidance/evasion)?

Spying is a slippery slope: It was not so very long ago that a forerunner of the NSA was shut down by the US Secretary of State of the day because “Gentlemen do not read each other’s mail.”. But intelligence is a tool that is so useful that more and more invasive intelligence methods become acceptable. It is all too easy to imagine how today’s anti-terrorist surveillance can become tomorrow’s 1984-like society.

That does not mean that GCHQ should not investigate terrorism, but that it should do so in a way that we can be sure does not escalate into more innocent areas. Perhaps we should be allowing GCHQ to pursue surveillance, but restrict it to a specified list of topics.