Oct 17 2021
 

All the details of Apple’s M1X processor! Here first! Before Apple has announced it!

Yes, this is just click-bait. Just like all the other videos/posts/stories or whatever about Apple’s soon-to-be-announced (perhaps) ARM-based processor that supposedly scales up further than the original M1. The announcement is going to happen sometime (probably on the 18th – tomorrow, at the time of writing).

So where do all these early “details” come from? In all probability they are nothing more than speculation and click-bait. You’re wasting your time clicking on this story no more and no less than you are checking out the others.

Except this one is honest about it being just click-bait.

[Image: The Naked Horseman]
Jun 27 2020
 

So Apple has announced that it is replacing Intel processors with ARM processors in its Mac machines. And as a result we’re going to be plagued with awful puns endlessly until we get bored of the discussion. Sorry about that!

This is hardly unexpected – Apple has been using ARM-based processors in its iThingies for years now, and this is not the first time they have changed processor architectures for the Mac. Apple started with the Motorola 68000, switched to the PowerPC architecture, and then switched to Intel processors.

So they have a history of changing processor architectures, and know how to do it. We remember the problems, but it is actually quite an accomplishment to take a macOS binary compiled for the PowerPC architecture and run it on an Intel processor. It is analogous to taking a monolingual Spanish speaker, providing them with a smartphone-based translator, and dropping them into an English-speaking city.

So running Intel-binary macOS applications on an ARM-based system will usually work. There’ll be corner cases that do not, of course, but these are likely to be relatively rare.

But what about performance? On a theoretical level, emulating a different processor architecture is always going to be slower, but in practice you probably won’t notice.

First of all, most macOS applications consist of a relatively small wrapper around Apple-provided libraries of code (although that “wrapper” is the important bit). For example, the user interface of any application is going to be mostly Apple code provided by the base operating system – so the user interface is going to feel as snappy as in any native ARM macOS application.

Secondly, Apple knows that the performance of macOS applications originally compiled for Intel is important and has Rosetta 2 to “translate” applications into instructions for the ARM processors. This will probably work better than the doom-sayers expect, but it will never be as fast as natively compiled code.
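As an aside, if you are ever curious whether a particular process is actually running under translation, Apple documents a sysctl for exactly this. A minimal sketch (the sysctl name is Apple’s; everything else here is just illustration) :-

/* Ask macOS whether this process is native or translated by Rosetta 2. */
#include <stdio.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/sysctl.h>

int main(void)
{
    int translated = 0;
    size_t size = sizeof(translated);

    if (sysctlbyname("sysctl.proc_translated", &translated, &size, NULL, 0) == -1) {
        if (errno == ENOENT)
            puts("No Rosetta here - probably an Intel Mac (or an older macOS).");
        else
            perror("sysctlbyname");
    } else {
        puts(translated ? "Running under Rosetta 2 translation."
                        : "Running as a native arm64 binary.");
    }
    return 0;
}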

But it will be good enough, especially as most major applications will be recompiled as native ARM versions relatively quickly.

But there is another aspect of performance – are ARM processors fast enough compared with Intel processors? Well, the world’s fastest supercomputer runs on ARM processors, although Intel fanboys will quite rightly point out that a supercomputer is a special case and that a single Intel core will outperform a single ARM core.

Except that, with the exception of games and specialised applications that have not been optimised for parallel processing, more cores beat faster single cores.
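For the avoidance of doubt, “optimised for parallel processing” isn’t magic; it mostly means splitting the work up. A tiny sketch of the idea using POSIX threads (the big sum is a stand-in for real work, and the thread count is plucked out of the air) – build with cc -pthread :-

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

#define N       (1 << 24)     /* 16 million elements of dummy data */
#define THREADS 4

static double *data;

struct slice { size_t start, end; double sum; };

static void *partial_sum(void *arg)
{
    struct slice *s = arg;
    double total = 0.0;
    for (size_t i = s->start; i < s->end; i++)
        total += data[i];
    s->sum = total;           /* each thread writes only its own slot */
    return NULL;
}

int main(void)
{
    pthread_t tid[THREADS];
    struct slice slices[THREADS];

    data = malloc(N * sizeof(double));
    if (data == NULL)
        return 1;
    for (size_t i = 0; i < N; i++)
        data[i] = 1.0;

    /* Hand each thread its own slice of the array. */
    for (int t = 0; t < THREADS; t++) {
        slices[t] = (struct slice){ (size_t)t * N / THREADS,
                                    (size_t)(t + 1) * N / THREADS, 0.0 };
        pthread_create(&tid[t], NULL, partial_sum, &slices[t]);
    }

    double total = 0.0;
    for (int t = 0; t < THREADS; t++) {
        pthread_join(tid[t], NULL);
        total += slices[t].sum;
    }
    printf("total = %.0f\n", total);

    free(data);
    return 0;
}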

And a single ARM core will beat a single Intel core if the latter is thermally throttled. And thermals have been holding back the performance of Apple laptops for quite a while now.

Lastly, Apple knows that ARM processors are slower than Intel processors in single core performance and is likely pushing ARM and themselves to solve this. It isn’t rocket science (if anything it’s thermals), and both have likely been working on this problem in the background for a while.

Most of us don’t really need ultimate processor speed; for most tasks merely the appearance of speed is sufficient – web pages loading snappily, videos playing silkily, etc.

Ultimately if you happen to run some heavy-processing application (you will know if you do) whose performance is critical to your work, benchmark it. And keep benchmarking it if the ARM-based performance isn’t all that good to start with.
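Benchmarking needn’t be sophisticated either; timing a whole run is often enough. This rough sketch does more or less what the standard time command does – fork, run whatever command you give it, and report the elapsed wall-clock time (run the same job natively and under translation and compare) :-

#include <stdio.h>
#include <time.h>
#include <unistd.h>
#include <sys/wait.h>

int main(int argc, char *argv[])
{
    if (argc < 2) {
        fprintf(stderr, "usage: %s command [args...]\n", argv[0]);
        return 1;
    }

    struct timespec start, end;
    clock_gettime(CLOCK_MONOTONIC, &start);

    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");
        return 1;
    }
    if (pid == 0) {
        execvp(argv[1], &argv[1]);   /* run the command we were given */
        perror("execvp");
        _exit(127);
    }
    waitpid(pid, NULL, 0);

    clock_gettime(CLOCK_MONOTONIC, &end);
    double elapsed = (end.tv_sec - start.tv_sec)
                   + (end.tv_nsec - start.tv_nsec) / 1e9;
    printf("elapsed: %.3f seconds\n", elapsed);
    return 0;
}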

Coming back to those everyday tasks: most of them can be performed perfectly well on a relatively modest modern processor and/or can be accelerated with specialised “co-processors”. For example, Apple’s Mac Pro has an optional accelerator card that offloads video encoding and makes it much faster than it would otherwise be.

Apple has a “slide” :-

[Image: Apple’s “Apple silicon” slide]
That implies that their “Apple silicon” processors will contain not just the ordinary processor cores but also specialised accelerators to improve performance.

Jan 04 2018
 

Well, there’s another big and bad security vulnerability; actually there are three. They are known as Meltdown and Spectre (Spectre comes in two variants). There are all sorts of bits of information and misinformation out there at the moment and this posting will be no different.

In short, nobody but those involved in the vulnerability research, or in implementing work-arounds within the well-known operating systems, really knows these vulnerabilities well enough to say anything about them with complete accuracy.

The problem is that both vulnerabilities are exceptionally technical and require detailed knowledge of technicalities that most people are not familiar with. Even people who work in the IT industry.

Having said that I’m not likely to be 100% accurate, let’s dive in …

What Is Vulnerable?

For Meltdown, every modern Intel processor is vulnerable; in fact the only Intel processors that are not vulnerable are likely to be encountered only in retro-computing. Processors from AMD and ARM are probably not vulnerable, although it is possible to configure at least one AMD processor in such a way that it becomes vulnerable.

It appears that more processors are likely to be vulnerable to the Spectre vulnerabilities. Exactly what is vulnerable is a bit of work to assess, and people are concentrating on the Meltdown vulnerability as it is more serious (although Spectre is itself serious enough to qualify for a catchy code name).

What Is The Fix?

Replace the processor. But wait until fixed ones have been produced.

However there is a work-around for the Meltdown vulnerability, consisting of an operating system patch (to fix the operating system kernel) and a firmware patch (to fix the UEFI environment). All of the patches “fix” the problem by removing kernel memory from the user memory map, which stops user processes exploiting Meltdown to read kernel memory.

Unfortunately there is a performance hit with this fix; every time you call the operating system (actually the kernel) to perform something, the memory map needs to be switched to include the kernel’s mappings, and switched back to the old map when the call returns.

This “costs” somewhere between 5% and 30% on every system call. Very modern processors (which have hardware features that make switching memory maps cheaper) will be towards the 5% end, and older processors towards the 30% end.

Having said that, this only happens when calling the operating system kernel, and many applications may very well make relatively few kernel calls, in which case the performance hit will be barely noticeable. Nobody is entirely sure what the performance hit will be for real world use, but the best guesses say that most desktop applications will be fine with occasional exceptions (the web browser is likely to be one); the big performance hit will be on servers.
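If you want a rough feel for the raw system call cost on your own machine, before and after patching, something like this will do (a Linux-specific sketch; syscall(SYS_getpid) is used purely because it forces a genuine kernel entry that does almost no work) :-

#include <stdio.h>
#include <time.h>
#include <unistd.h>
#include <sys/syscall.h>

#define ITERATIONS 1000000

int main(void)
{
    struct timespec start, end;

    clock_gettime(CLOCK_MONOTONIC, &start);
    for (long i = 0; i < ITERATIONS; i++)
        syscall(SYS_getpid);   /* a trivial syscall: almost pure overhead */
    clock_gettime(CLOCK_MONOTONIC, &end);

    double ns = (end.tv_sec - start.tv_sec) * 1e9
              + (end.tv_nsec - start.tv_nsec);
    printf("%.0f ns per system call\n", ns / ITERATIONS);
    return 0;
}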

How Serious Are They?

Meltdown is very serious not only because it allows a user process to read privileged data, but because it effectively lets an attacker remove a standard mitigation (such as kernel address space layout randomisation) that makes many older-style attacks impracticable. Essentially it makes older-style attacks practicable again.

Although Spectre is still serious, it may be less so than Meltdown because an attacker needs to be able to control some data that the victim process uses to indulge in some speculative execution. In the case of browsers (for example) this is relatively easy, but in general it is not so easy.

It is also easier to fix and/or protect against on an individual application basis – expect browser patches shortly.

Some Technicalities

Within this section I will attempt to explain some of the technical aspects of the vulnerabilities. By all means skip to the summary if you wish.

The Processor?

Normally security vulnerabilities are found within software – the operating system, or a ‘layered product’ – something installed on top of the operating system such as an application, a helper application, or a run-time environment.

Less often we hear of vulnerabilities that involve hardware in some sense – requiring firmware updates to either the system itself, graphics cards, or network cards.

Similar to firmware updates, it is possible for microcode updates to fix problems with the processor’s instructions.

Unfortunately these vulnerabilities are not found within the processor instructions, but in the way that the processor executes those instructions. And no microcode update can fix this problem (although it is possible to weaken the side-channel attack by making the cache instructions execute in a fixed time).

Essentially the processor hardware needs to be re-designed and new processors released to fix this problem – you need a new processor. The patches for Meltdown and Spectre – both the ones available today, and those available in the future – are strictly speaking workarounds.

The Kernel and Address Space

Meltdown specifically targets the kernel and the kernel’s memory. But what is the kernel?

It is a quite common term in the Linux community, but every single mainstream operating system has the same split between kernel mode and user mode. Kernel mode has privileged access to the hardware, whereas user mode is prevented from accessing the hardware and indeed the memory of any other user process running. It would be easy to think of this as the operating system and user applications, but that would be technically incorrect.

Whilst the kernel is the operating system, plenty of software that runs in user mode is also part of the operating system. But the over-simplification will do because it contains a useful element of the truth.

Amongst other things the kernel address space contains many secrets that user mode software should not have access to. So why is the kernel mode address space overlaid upon the user mode address space?

One of the jobs that the kernel does when it starts a user mode process is to give that process a virtual view of the processor’s memory that entirely fills the processor’s memory addressing capability – even if that is more memory than the machine contains. The reasons for this can be ignored for the moment.

If real memory is allocated to a user process, it can be seen and used by that process and no other.

For performance reasons, the kernel includes its own memory within each user process (but protected). It isn’t strictly necessary, but re-programming the memory management unit to map the kernel memory for each system call is slower than not doing so. And after all, memory protection should stop user processes reading kernel memory directly.
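As an illustration of that protection doing its job, the following sketch (assuming Linux on x86-64, with a kernel-half address picked purely for illustration) tries to read kernel memory directly and merely earns itself a SIGSEGV :-

#include <setjmp.h>
#include <signal.h>
#include <stdio.h>

static sigjmp_buf recover;

static void on_segv(int sig)
{
    (void)sig;
    siglongjmp(recover, 1);   /* jump back out of the faulting read */
}

int main(void)
{
    struct sigaction sa;
    sa.sa_handler = on_segv;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    sigaction(SIGSEGV, &sa, NULL);

    /* An address in the kernel half of the address space (illustrative). */
    volatile const char *kernel_ish = (const char *)0xffffffff81000000UL;

    if (sigsetjmp(recover, 1) == 0) {
        char c = *kernel_ish;                 /* the MMU says no */
        printf("read %d - this should never be printed\n", c);
    } else {
        puts("SIGSEGV: kernel memory is off limits to user mode");
    }
    return 0;
}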

That is of course unless memory protection is broken …

Speculative Execution

Computer memory is much slower than modern processors which is why we have cache memory – indeed multiple levels of cache memory. To improve performance processors have long been doing things that come under the umbrella of ‘speculative execution’.

If for example we have the following sample of pseudo-code :-

load variable A from memory location A-in-memory
if A is zero
then
    do one thing
else
    do another
endif

Because memory is so slow, a processor running this code could stop whilst it is waiting for the memory location to be read. This is how processors of old worked, and is often how processor execution is taught - the next step starts getting really weird.

However it could also execute the code assuming that A will be zero (or not, or even both), so it has the results ready for when the memory has finally been read. Now there are some obvious limitations to this - the processor can't turn your screen green assuming that A is zero, but it can sometimes get some useful work done.

The problem (with both Meltdown and Spectre) is that speculative execution seems to bypass the various forms of memory protection. Now whilst the speculative results are ignored once the memory is properly read, and the memory protection kicks in, there is a side-channel attack that allows some of the details of the speculative results to be sniffed by an attacker.
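To make that a little more concrete, this is a sketch of the sort of code pattern the Spectre researchers describe for the “bounds check bypass” variant – the shape of the problem rather than a working exploit, with the names, sizes and stride made up for illustration :-

#include <stdint.h>
#include <stddef.h>

#define STRIDE 4096                  /* one probe slot per page, so the
                                        cache effects are easy to measure */

uint8_t array1[16];                  /* data the caller is allowed to read */
size_t  array1_size = 16;
uint8_t probe[256 * STRIDE];         /* the side-channel "notepad"         */
uint8_t sink;                        /* stops the compiler removing reads  */

void victim(size_t x)
{
    if (x < array1_size) {
        /*
         * Architecturally this body only ever runs for in-bounds x. But
         * if the branch predictor guesses "taken" while array1_size is
         * still on its way from memory, the processor may speculatively
         * read array1[x] for an out-of-bounds x - and the byte it finds
         * decides which line of `probe` gets pulled into the cache.
         */
        sink &= probe[array1[x] * STRIDE];
    }
    /*
     * The speculative result is thrown away once the bounds check really
     * completes, but the cache is left warm: timing reads of each probe
     * slot afterwards reveals which one was touched, and hence the byte.
     */
}

int main(void)
{
    victim(5);                       /* a perfectly legal, in-bounds call */
    return 0;
}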

 

Summary

  1. Don't panic! These attacks are not currently in use and because of the complexity it will take some time for the attacks to appear in the wild.
  2. Intel processors are vulnerable to Meltdown, and will need a patch to apply a work-around. Apply the patch as soon as it comes out even if it hurts performance.
  3. The performance hit is likely to be significant only on a small set of applications, and in general only significant on a macro-scale - if you run as many servers as Google, you will have to buy more servers soon.
  4. Things are a little more vague with Spectre, but it seems likely that individual applications will need to be patched to remove their vulnerability. Expect more patches.

[Image: Tunnel To The Old Town]

Feb 08 2012
 

If you read certain articles on the web you might be under the impression that Apple has had a secret project to port OSX to the ARM-based architecture with the intention of producing a cut-down (although not necessarily very cut-down) MacBook Air running on the ARM architecture.

Which is preposterous.

Firstly this secret project to port OSX was merely bringing up the ‘lower half’ of OSX (Darwin) on a particular variety of ARM processor. The end result ? Probably something more or less equivalent to a “login” prompt on an old multi-user Unix system with no GUI. That is not to underestimate the accomplishment of the student involved – in many ways that would be a good 75% of the work.

But a few key facts here :-

  1. This is not the first port of Darwin (or even OSX) to the ARM-based architecture. Pick up your iThingie … that’s got an ARM inside, and whilst we all call the operating system it runs iOS, it is really OSX with a different skin on. Sure there are some differences and limitations, but they are merely skin deep – at the lowest level they’re both Darwin.
  2. If there’s a secret project to run OSX on an ARM-based laptop of some kind, this ain’t it. Take a closer look at the processor used in this experiment. It’s an ARM processor less capable than that in the very first iPhone. You won’t see it in any new laptops. If this secret experiment had any real product behind it, it would be more likely to be an intelligent embedded device – a really clever fridge or something (and no, not a TV).
  3. If there was a real product behind this, it seems pretty unlikely that Apple would choose a student on work experience to do the work. After all such a student might just spill the beans on a secret project given enough green folding stuff as incentive.

What is probably the case here is that Apple came up with this project for the student as a way of testing whether he was worth considering as a full employee – after all it is a better way of testing a potential employee than asking them to make the tea! And they have no intention of using the result as a product.

What they will do however is use the student’s observations to feed back into the OSX team – what problems did he encounter that might qualify as bugs ? Etc.

In reality, Apple probably already has OSX running on ARM based machines in their labs. It’s an obvious thing to try out given that all their iThingies are ARM based, and it is not an enormous amount of extra work to finish off what is already in place to get something that looks and runs like OSX. After all, Apple did ages ago admit that early versions of OSX did run on x86-based processors when their product line was all PowerPC based, and keeping OSX portable across architectures is something they probably want to keep as a possibility.

Will Apple launch an ARM-based MacBook Air ? Not anytime soon. Whilst the value of a 64-bit architecture is over-rated, it would seem unlikely that Apple will ever again launch a 32-bit based “real” computer. But with 64-bit ARMs arriving in a year or two, who knows ?

Oct 31 2011
 

According to a couple of articles on The Register, several manufacturers are getting close to releasing ARM-based servers. The interesting thing is that the latest announcement includes details of a 64-bit version of the ARM processor, which according to some people is a precondition for using ARM in a server.

It is not really true of course, but a 64-bit ARM will make a small number of additional tasks possible. It is easy to forget that 32-bit servers (of which there are still quite a few older ones around) did a pretty reasonable job whilst they were in service – there is very little that a 64-bit server can do that a 32-bit server cannot. As a concrete example, a rather elderly SPARC-based server I have access to has 8Gbytes of memory available and is running a 64-bit version of Solaris (it’s hard to find a 32-bit version of Solaris for SPARC), but of the 170 processes it is running, none occupies more than 256Mbytes of memory – and indeed none has a total size of more than 256Mbytes either.

The more important development is the introduction of virtualisation support.

The thing is that people – especially those used to the x86 world – tend to over-emphasise the importance of 64 bits. It is important because some applications do require more than 4Gbytes of memory – in particular applications such as large Oracle (or other DBMS) installations. But the overwhelming majority of applications actually suffer a performance penalty if re-compiled to run as 64-bit applications.

The simple fact is that if an application is perfectly happy to run as a 32-bit application with a “limited” memory space and smaller integers, it can run faster because there is less data flying around. And indeed, as pointed out in the comments section of the article above, the 64-bit version can also use ever so slightly more electricity.
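To put the “less data flying around” point in concrete terms, the same trivial C program built as a 32-bit and a 64-bit binary reports different sizes for pointers and longs – and every pointer-heavy data structure grows to match (on an x86 Linux box, try cc -m32 against cc -m64; the structure below is just an illustration) :-

#include <stdio.h>

struct list_node {
    struct list_node *next;   /* 4 bytes on ILP32, 8 bytes on LP64 */
    long value;               /* likewise 4 versus 8               */
};

int main(void)
{
    printf("sizeof(void *)           = %zu\n", sizeof(void *));
    printf("sizeof(long)             = %zu\n", sizeof(long));
    printf("sizeof(struct list_node) = %zu\n", sizeof(struct list_node));
    return 0;
}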

What is overlooked amongst those whose thinking is dominated by the x86 world is that the x86-64 architecture offers two benefits over the old x86 world – a 64-bit architecture and an improved architecture with many more CPU registers. This allows 64-bit applications in the x86 world to perform better than their 32-bit counterparts even if the applications wouldn’t normally benefit from running on a 64-bit architecture.

If the people producing operating systems for the new ARM-based servers have any sense, they will quietly create a 64-bit operating system that can transparently run many applications in 32-bit mode. Not exactly a new thing, as this is what Solaris has done on 64-bit SPARC-based machines for a decade. This will allow those applications that don’t require 64-bit to gain the performance benefit of running 32-bit, whilst allowing those applications that do require 64-bit to run perfectly well.

There is no real downside in running a dual word-size operating system except a minor amount of added complexity for those developers working at the C-language level.