Jan 04 2018

Well, there’s another big and bad security vulnerability; actually there are three. These are known as Meltdown and Spectre (two different Spectres). There are all sorts of bits of information and misinformation out there at the moment and this posting will be no different.

In short, nobody but those involved in the vulnerability research or implementing work-arounds within the well-known operating systems really knows these vulnerabilities well enough to say anything about them with complete accuracy.

The problem is that both vulnerabilities are exceptionally technical and require detailed knowledge of technicalities that most people – even people who work in the IT industry – are not familiar with.

Having said that I’m not likely to be 100% accurate, let’s dive in …

What Is Vulnerable?

For Meltdown, every modern Intel processor is vulnerable; in fact the only Intel processors that are not vulnerable are old enough that you are only likely to encounter them in retro-computing. Processors from AMD and ARM are probably not vulnerable, although it is possible to configure at least one AMD processor in such a way that it becomes vulnerable.

It appears that more processors are likely to be vulnerable to the Spectre vulnerabilities. Exactly what is vulnerable is a bit of work to assess, and people are concentrating on the Meltdown vulnerability as it is more serious (although Spectre is itself serious enough to qualify for a catchy code name).
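Incidentally, Linux kernels that ship with the work-arounds also gained a set of status files under /sys/devices/system/cpu/vulnerabilities/ – one file per vulnerability – although whether your kernel is new enough to have them is another matter. A small C sketch that simply prints whatever status files exist there :-

    #include <dirent.h>
    #include <stdio.h>

    int main(void)
    {
        const char *dir = "/sys/devices/system/cpu/vulnerabilities";
        DIR *d = opendir(dir);
        if (d == NULL) {
            perror(dir);           /* older kernels don't have this directory */
            return 1;
        }

        struct dirent *entry;
        while ((entry = readdir(d)) != NULL) {
            if (entry->d_name[0] == '.')
                continue;          /* skip "." and ".." */

            char path[512], line[256];
            snprintf(path, sizeof path, "%s/%s", dir, entry->d_name);

            FILE *f = fopen(path, "r");
            if (f != NULL && fgets(line, sizeof line, f) != NULL)
                printf("%-12s %s", entry->d_name, line);
            if (f != NULL)
                fclose(f);
        }

        closedir(d);
        return 0;
    }

If the directory is missing, that in itself is a hint that the kernel predates the reporting interface (and probably the work-arounds too).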

What Is The Fix?

Replace the processor. But wait until fixed ones have been produced.

However there is a work-around for the Meltdown vulnerability, which consists of an operating system patch together with a firmware patch (to fix the UEFI environment). All of the patches “fix” the problem by removing kernel memory from the user memory map, which stops user processes exploiting Meltdown to read kernel memory.

Unfortunately there is a performance hit with this fix; every time you call the operating system (actually the kernel) to perform something, the kernel’s memory map has to be loaded on entry and the old map re-loaded when the call returns.

This “costs” between 5% and 30% when performing system calls. Very modern processors (with features such as PCID that make switching memory maps cheaper) should sit at the lower end of that range, and older processors at the upper end.
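One way to get a feel for the cost on your own machine is to time a trivial system call in a tight loop before and after patching. A rough Linux sketch – the iteration count is arbitrary, and the raw syscall() wrapper is used because some C libraries cache the result of getpid() and never enter the kernel at all :-

    #include <stdio.h>
    #include <time.h>
    #include <unistd.h>
    #include <sys/syscall.h>

    int main(void)
    {
        const long n = 1000000;          /* arbitrary iteration count */
        struct timespec t0, t1;

        clock_gettime(CLOCK_MONOTONIC, &t0);
        for (long i = 0; i < n; i++)
            syscall(SYS_getpid);         /* near-trivial syscall: the cost is
                                            almost all kernel entry and exit */
        clock_gettime(CLOCK_MONOTONIC, &t1);

        double ns = (t1.tv_sec - t0.tv_sec) * 1e9
                  + (t1.tv_nsec - t0.tv_nsec);
        printf("%.1f ns per system call\n", ns / n);
        return 0;
    }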

Having said that, this only happens when calling the operating system kernel, and many applications make relatively few system calls, in which case the performance hit will be barely noticeable. Nobody is entirely sure what the performance hit will be for real world use, but the best guesses say that most desktop applications will be fine with occasional exceptions (the web browser is likely to be one); the big performance hit will be on the server.

How Serious Are They?

Meltdown is very serious not only because it allows a user process to read privileged data, but because it allows an attacker to effectively remove a standard attack mitigation – one which normally makes many older-style attacks impracticable. Essentially it makes older-style attacks practicable again.

Although Spectre is still serious, it may be less so than Meltdown because an attacker needs to be able to control some data that the victim process uses to indulge in some speculative execution. In the case of browsers (for example) this is relatively easy, but in general it is not so easy.

It is also easier to fix and/or protect against on an individual application basis – expect browser patches shortly.

Some Technicalities

Within this section I will attempt to explain some of the technical aspects of the vulnerabilities. By all means skip to the summary if you wish.

The Processor?

Normally security vulnerabilities are found within software – the operating system, or a ‘layered product’ – something installed on top of the operating system such as an application, a helper application, or a run-time environment.

Less often we hear of vulnerabilities that involve hardware in some sense – requiring firmware updates to either the system itself, graphics cards, or network cards.

Similar to firmware updates, it is possible for microcode updates to fix problems with the processor’s instructions.

Unfortunately these vulnerabilities are not found within the processor instructions, but in the way that the processor executes those instructions. And no microcode update can fix this problem (although it is possible to weaken the side-channel attack by making the cache instructions execute in a fixed time).

Essentially the processor hardware needs to be re-designed and new processors released to fix this problem – you need a new processor. The patches for Meltdown and Spectre – both the ones available today, and those available in the future – are strictly speaking workarounds.

The Kernel and Address Space

Meltdown specifically targets the kernel and the kernel’s memory. But what is the kernel?

It is a quite common term in the Linux community, but every single mainstream operating system has the same split between kernel mode and user mode. Kernel mode has privileged access to the hardware whereas user mode is prevented from accessing the hardware and indeed the memory of any other user process running. It would be easy to think of this as the operating system and user applications, but that would be technically incorrect.

Whilst the kernel is the operating system, plenty of software that runs in user mode is also part of the operating system. But the over-simplification will do because it contains a useful element of the truth.

Amongst other things the kernel address space contains many secrets that user mode software should not have access to. So why is the kernel mode address space overlaid upon the user mode address space?

One of the jobs that the kernel does when it starts a user mode process is to give that process a virtual view of the processor’s memory that entirely fills the processor’s memory addressing capability – even if that is more memory than the machine contains. The reasons for this can be ignored for the moment.

If real memory is allocated to a user process, it can be seen and used by that process and no other.

For performance reasons, the kernel includes its own memory within each user process (but protected). It isn’t necessary, but re-programming the memory management unit to map the kernel memory for each system call is slower than not doing so. And after all, memory protection should stop user processes reading kernel memory directly.
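You can watch that protection doing its job with a few lines of C. In the sketch below the kernel address is a typical x86-64 Linux value picked purely for illustration – any kernel-half address behaves the same when read from user mode :-

    #include <signal.h>
    #include <stdio.h>
    #include <unistd.h>

    static void on_segv(int sig)
    {
        (void)sig;
        /* keep the handler async-signal-safe: write() and _exit() only */
        static const char msg[] = "SIGSEGV: memory protection blocked the read\n";
        write(STDOUT_FILENO, msg, sizeof msg - 1);
        _exit(0);
    }

    int main(void)
    {
        signal(SIGSEGV, on_segv);

        /* 0xffff888000000000 is a typical x86-64 Linux kernel address,
           chosen purely for illustration */
        volatile char *kernel_address = (volatile char *)0xffff888000000000UL;

        char c = *kernel_address;    /* the forbidden read */
        printf("read %d - this should never be printed\n", c);
        return 0;
    }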

That is of course unless memory protection is broken …

Speculative Execution

Computer memory is much slower than modern processors, which is why we have cache memory – indeed multiple levels of cache memory. To improve performance processors have long been doing things that come under the umbrella of ‘speculative execution’.

If for example we have the following sample of pseudo-code (rendered here as compilable C) :-

    extern void do_one_thing(void);
    extern void do_another(void);

    void example(int *a_in_memory)
    {
        int a = *a_in_memory;   /* load variable A from memory location A-in-memory */

        if (a == 0)
            do_one_thing();
        else
            do_another();
    }

Because memory is so slow, a processor running this code could stall whilst it waits for the memory location to be read. This is how processors of old worked, and is often how processor execution is taught – the next step is where things start getting really weird.

However it could also execute the code assuming that A will be zero (or not, or even both), so that it has the results ready once the memory has actually been read. There are some obvious limitations to this – the processor can't turn your screen green on the assumption that A is zero – but it can sometimes get some useful work done.

The problem (with both Meltdown and Spectre) is that speculative execution seems to bypass the various forms of memory protection. Now whilst the speculative results are ignored once the memory is properly read, and the memory protection kicks in, there is a side-channel attack that allows some of the details of the speculative results to be sniffed by an attacker.
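The side channel in question is the cache: a speculatively executed read leaves its footprint in the cache even after its results are thrown away, and a cached memory location can be distinguished from an uncached one simply by timing a read of it. A minimal x86-only sketch of the timing half (the names are my own) :-

    #include <stdint.h>
    #include <stdio.h>
    #include <x86intrin.h>

    static uint8_t probe[4096];

    /* time a single read of *p in cycles using the timestamp counter */
    static uint64_t time_read(volatile uint8_t *p)
    {
        unsigned aux;
        uint64_t start = __rdtscp(&aux);
        (void)*p;
        uint64_t end = __rdtscp(&aux);
        return end - start;
    }

    int main(void)
    {
        probe[0] = 1;                      /* bring the line into the cache */
        printf("cached:  %llu cycles\n",
               (unsigned long long)time_read(probe));

        _mm_clflush(probe);                /* evict it again */
        printf("flushed: %llu cycles\n",
               (unsigned long long)time_read(probe));
        return 0;
    }

An attacker arranges for the speculative code to touch one of, say, 256 such probe lines depending on the value of a secret byte, flushes them all first, and then times each one to see which became cached – that is the essence of the published attacks.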

 

Summary

  1. Don't panic! These attacks are not currently known to be in use, and because of their complexity it will take some time for attacks to appear in the wild.
  2. Intel processors are vulnerable to Meltdown, and will need a patch to apply a work-around. Apply the patch as soon as it comes out even if it hurts performance.
  3. The performance hit is likely to be significant only on a small set of applications, and in general only significant on a macro-scale - if you run as many servers as Google, you will have to buy more servers soon.
  4. Things are a little more vague with Spectre, but it seems likely that individual applications will need to be patched to remove their vulnerability. Expect more patches.

[Photo: Tunnel To The Old Town]

Aug 26 2017

No. The title is just click-bait (which won’t accomplish much).

AMD Ryzen was interesting because it restored AMD’s competitiveness as compared to Intel for the non-enthusiast processor for desktops and laptops. Whereas AMD’s Epyc was interesting because it restored AMD’s competitiveness in the data centre. Both are good things because Intel has been rather slow at improving their processor over the last few years – enough that people are taking a serious look at a non-compatible architecture (the ARM which is found in your smartphone) in the data centre.

Threadripper itself is of interest to a relatively small number of people – those after a workstation-class processor to handle highly threaded workloads, a market that was previously catered to by the Xeon processor. So although Threadripper looks expensive, it is in fact pretty cheap in comparison to Xeon processors, and ‘scientific’ workstations should become cheaper.

And the significant advantage they have with I/O (64 PCIe lanes, as opposed to a maximum of 44 for the X299 platform) would be useful for certain jobs. Such as medium-sized storage servers with lots of NVMe caching, or graphics-heavy display servers (room-sized virtual reality?).

But for gamers? Not so much. Almost no games use lots of threads (although it would be useful to change this), so the extra power of Threadripper will only get used by the other things that gamers do. Perhaps game streaming and/or using the unused power to run virtualised storage servers.

Jan 07 2007

I recently replaced an elderly SGI Octane2 workstation which had 2 CPUs (400MHz MIPS-based), 1.5Gbytes of memory, and 3 elderly SCSI disks with a nice new Sun Ultra40 … 2 AMD Opteron 248s, 2Gbytes memory, and 2 mirrored SATA drives. It is interesting to compare an old-fashioned workstation originally designed in the middle to late 1990s with a 21st century PC. Not that I’m going to produce hard numbers from useful benchmarks … that is just too much work, and in some ways it is the feel of the differences that are important.

Of course this is not really a fair comparison. Whilst the SGI Octane is now very elderly and due to SGI managerial incompetence has not kept pace with PC performance as it should have done, it is after all a machine that originally cost 10-20 times the cost of the PC I am comparing it to. In car terms, I’m comparing a 20-year old Mercedes with a new and cheap Ford. I should point out that much of the software I am using is very much the same on both machines … the Enlightenment window manager, Sylpheed Claws as the mail client, Firefox as the browser, LyX as the word processor, and a text terminal for much of the remainder.

The PC is considerably quicker than the SGI of course. The graphical user interface is a good deal snappier, and most of the applications offer very welcome improvements in performance. With the exception of GIMP however, none of this performance increase is really essential; my old SGI ran pretty much everything my PC does, fast enough to get the job done. GIMP performance is the reason I upgraded, and here the difference is quite dramatic … filters that previously required patience now run almost instantly; when you are repeatedly trying things out in GIMP on quite large images this performance increase makes some things feasible that simply were not before.

There is one area where the SGI does offer some advantage over the PC; something I was expecting. The PC’s disks are overall somewhat faster than the disks in the SGI (and of course I don’t have to pay to mirror my disks!), but the SGI tends to work more smoothly under high load. I’ve noticed before with ‘low end’ disks in PCs that if you start to drive your disks very hard, the computer will sometimes stutter. Essentially the SGI was slower, but smoother under high disk load than the PC.

If it were not for the need to run GIMP extensively (and the appeal of more standard add-on hardware like USB hard disks), there is no reason why I could not have continued with the SGI. The tendency we have in the computing arena of replacing computers every few years is not a healthy one.