Apr 06 2024
 

Just came across someone today who wasn’t aware of the “BCC” (Blind Carbon Copy) header, and was wondering how an email reached her when her address wasn’t in the “To” header. It’s all too easy to laugh at people who somehow missed learning this stuff, but how often does email get taught these days?

Headers Are Just Comments

Well, that heading is a bit of an exaggeration, but it’s a helpful exaggeration. It is perhaps more accurate to say the headers are hints to the underlying software. There is a chain of software “under the hood” that takes the email you have composed in some kind of email client (which includes a web mail interface, the most common way these days) and formats it suitably for a “mail transport agent”, which then determines which “mail transport agent” is closest to the recipients and sends it there.

You -> Mail client -> Your MTA -> Recipient’s MTA

In terms of headers that you populate to instruct that chain where emails should go, there are :-

  • The “To” header which is what is most commonly used.
  • The “Cc” (“carbon copy” – an archaic reference) header which allows you to specify additional recipients, but it implies that the additional recipients are included as a courtesy (“You might want to see a copy of this for information.”).
  • The “Bcc” (“Blind carbon copy” ) header, which allows you to specify additional recipients but when your client transfers your email to the mail transfer agent it will add the recipients to the “envelope” (which we will explain shortly) but remove the header.
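
As a rough illustration of what a client does with those three headers, here is a minimal Python sketch using the standard library's email package. The function name split_envelope and the example.org addresses are made up for illustration; real mail clients will differ in the details.

```python
from email.message import EmailMessage
from email.utils import getaddresses


def split_envelope(msg):
    """Return (envelope_recipients, message_without_bcc)."""
    # Every address in To, Cc and Bcc becomes an envelope recipient.
    fields = []
    for name in ("To", "Cc", "Bcc"):
        fields.extend(msg.get_all(name, []))
    recipients = [addr for _, addr in getaddresses(fields) if addr]
    # The Bcc header is removed before the message is handed over,
    # so no recipient can see who was blind-copied.
    del msg["Bcc"]
    return recipients, msg


# Hypothetical message for illustration only.
msg = EmailMessage()
msg["From"] = "sender@example.org"
msg["To"] = "alice@example.org"
msg["Cc"] = "bob@example.org"
msg["Bcc"] = "carol@example.org"
msg.set_content("Hello")

recipients, stripped = split_envelope(msg)
print(recipients)
print("Bcc" in stripped)
```

All three addresses end up in the recipient list, but the message that gets sent no longer carries a Bcc header.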

There are two reasons for using “Bcc”. One is basic politeness – if you are sending to a lot of addresses, the recipients will see that header and it can take up valuable screen real estate distracting from the content of the email. The second is security – if you are sending an email to lots of third-party contacts it may well be appropriate (and even required) to hide their addresses from each other. Not everyone wants their relationship with an STD clinic to be “public”!

The “Envelope”

When a client communicates with the mail transport agent, it uses SMTP (Simple Mail Transfer Protocol), which is very simple; the MTA does not need to look at the contents to determine anything (although some do, especially if they do anti-virus scanning) :-

Connected to peach.
Escape character is '^]'.
220 zonky.org ESMTP Exim 24.12 Sat, 06 Apr 2024 09:57:50 +0100
helo pica
250 zonky.org Hello pica.zonky.org [2001:8b0:ca2c:dead::b000]
mail from:<some-forged-address@zonky.org>
250 OK
rcpt to:<address1@zonky.org>
250 Accepted
rcpt to:<address2@zonky.org>
250 Accepted
data
354 Enter message, ending with "." on a line by itself
The email appears here including mail headers
.
250 OK id=1rt1ts-0001k8-MM
quit
221 zonky.org closing connection

That is a forged SMTP transaction with certain details changed. The important bits are the lines that are not server responses (helo, mail from, rcpt to, data, the message itself, and quit), which are what your mail client would send to the mail transport agent. As you can see they are simple enough to be “faked” by a person. There is a great deal of trust going on here – far too much for the modern age – but there are additional controls in place to make forging things somewhat harder than this would imply.

The key commands are as follows :-

  1. mail from:<some-forged-address@zonky.org>: This specifies the address the email is apparently from. Normally this would be a setting in your mail client (whether you can change this or not), but there is nothing here to stop you setting any address you want, although there are almost always additional controls in place to make this harder.
  2. rcpt to:<address1@zonky.org>. This specifies what email address the email should go to. It is usually pulled from the headers you filled in whether that was the To, CC, or BCC headers. At this stage there is no difference. However you can put in addresses that don’t appear in the email at all.
  3. data. This is where your mail client copies the email that has been composed including all the headers. It will remove the “BCC” header and add some additional ones (such as “Date”). This body may or may not be examined by the mail transport agent; it isn’t necessary to send the email onwards.

So the mail transport agent now has the necessary information it needs to route your email to the required destinations – without looking inside the body. Which is analogous to a letter – the Royal Mail doesn’t open your letter to see where it needs to go, they will just use the address on the envelope.

And so we have the explanation for an email envelope – it is the addresses specified in the SMTP transaction allowing the mail transport agent to route email without looking at the contents. In normal circumstances the mail transport agent for the recipient will discard the envelope before it is placed in the recipient’s mailbox.
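
To make the separation between envelope and contents concrete, here is a small Python sketch that builds the envelope commands from the transaction above without touching the network. The helper name smtp_commands and the example.org header address are hypothetical; it is an illustration rather than a working mail client.

```python
def smtp_commands(helo_name, sender, recipients):
    """Build the envelope part of an SMTP conversation as a list of lines."""
    lines = [f"HELO {helo_name}", f"MAIL FROM:<{sender}>"]
    # One RCPT TO per envelope recipient, regardless of To/Cc/Bcc.
    lines += [f"RCPT TO:<{rcpt}>" for rcpt in recipients]
    lines.append("DATA")
    return lines


# The message text is sent after DATA; its headers play no part in routing.
headers_and_body = "To: someone-else@example.org\r\nSubject: Demo\r\n\r\nHello\r\n"

envelope = smtp_commands(
    "pica",
    "some-forged-address@zonky.org",
    ["address1@zonky.org", "address2@zonky.org"],
)
for line in envelope:
    print(line)
```

The address in the “To” header never appears in the envelope, and the envelope recipients never appear in the message. Python's smtplib reflects the same split: SMTP.sendmail() takes the envelope sender and recipients as arguments separate from the message text.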

Opening The Envelope

Just like real post where you have to trust that nobody along the route between the original writer and the recipient will open the envelope to peruse the contents, the same applies to email. Whilst all the ‘agents’ along the path can normally be trusted, there is nothing to stop a rogue agent examining the contents of email – whether that’s a snoopy system administrator, an employer with an overly suspicious nature, or law enforcement.

Which explains why it is strongly advisable either not to use email for anything secret, or to investigate encrypting emails.

Rusty Handrail
Apr 01 2024
 

So I was reading 𝕏 and came across one of those memes showing “Chinese bots” making connections to “open” SSH ports to Internet accessible servers. The suggestion to turn off password authentication in favour of public/private key authentication was certainly a sensible suggestion (on a very simplistic level it effectively makes a very strong “password”).

But the “Chinese bots” thing sort of irritated me a bit, so I decided to trawl my personal firewall logs looking for attempts to connect to my ssh port(s). Even ignoring the IPv6 probes, there were 1251 different addresses probing my network (just one public IPv4 address) in the month of March so far.

Why is this irritating? Because the addresses of the machines attempting to break into a non-existent ssh service here are those of compromised machines. They may be in China, or the USA, Russia, etc. but that in no way betrays who is controlling those “bots”.

Anyway, for some data :-

Count  ISO codes   Country
  502  US USA 840  United States
  128  CN CHN 156  China
   97  KR KOR 410  Korea, Republic of
   33  SG SGP 702  Singapore
   27  BG BGR 100  Bulgaria
   26  RU RUS 643  Russian Federation
   22  HK HKG 344  Hong Kong
   22  GB GBR 826  United Kingdom
   20  DE DEU 276  Germany
   16  SE SWE 752  Sweden

And “China” isn’t even in the lead in this case! I have included just the top 10 as a long list of random countries with one or two robots isn’t very enlightening.
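
For anyone wanting to produce a similar table, the tallying step might look something like the Python sketch below. It assumes the firewall log has already been reduced to (address, country code) pairs by some GeoIP lookup, which is not shown; the sample data uses documentation-reserved addresses and is entirely made up.

```python
from collections import Counter


def count_by_country(probes):
    """Count distinct source addresses per country.

    probes is an iterable of (ip_address, country_code) pairs.
    """
    # A set removes repeat probes from the same address.
    unique = set(probes)
    return Counter(country for _, country in unique)


# Hypothetical sample data, not taken from any real log.
sample = [
    ("192.0.2.1", "US"),
    ("192.0.2.1", "US"),  # the same address probing twice counts once
    ("198.51.100.7", "CN"),
    ("203.0.113.9", "US"),
]

top = count_by_country(sample).most_common(10)
print(top)
```

Counting distinct addresses rather than raw log lines stops one persistent host from skewing the totals.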

The key point here is that the national identity of the compromised host attacking tells you nothing about where the true attacker is from. Russia is quite a likely candidate given its status as a rogue nation with a known tolerance for cyber criminals (as long as they co-operate with the state when the state needs their skills), but that is just background knowledge.

Mar 10 2024
 

This is a collection of notes from my upgrade to an ASRock TRX50 WS motherboard fitted with an AMD Threadripper 7970X processor (32 cores) and 256Gbytes of memory. The upgrade meant that I retained the case, drives, graphics card, etc. from the previous system.

Most of the problems encountered were due to user stupidity.

First of all, whilst many of us have heard about the amount of time that DDR5 takes to “calibrate” itself, what I didn’t know was that the firmware status code shows “00” during this process (a dedicated “I’m messing with memory” code would be handy). And whilst it takes a while to do, if it takes longer than about 5 minutes, then something else is wrong.

In my case it turned out that I hadn’t read the instructions properly and I hadn’t connected enough power connectors. To get it to work, I needed the usual 24-pin power connector, an 8-pin connector, and a 6-pin connector all connected on the “drive” side of the motherboard (opposite the side with the PCIe slots). Once that was sorted, the system was up and running.

The remaining notes relate to “tweaking”.

Booting Linux

Of course I use Linux, what the hell else would I use? FreeBSD? Well, that would be a good choice.

The biggest problem I had booting Linux was changing the netplan configuration to pick up the new network interfaces. In my case, the Marvell interface (the 10G one) came up as enp65s0 and the Realtek interface (the 2.5G one) as enp69s0. Because I’m bound to plug the cable into the wrong interface, I simply bonded the two interfaces together; the relevant section of my netplan configuration is as follows :-

network:
  version: 2
  renderer: networkd
  bonds:
    james:
      interfaces: [enp65s0, enp69s0]
  ethernets:
    enp65s0: {}
    enp69s0: {}
  bridges:

Yes, you can choose silly names here. And yes the bonding works fine – just now I swapped the cable over to the “right” NIC with numerous active network connections, and everything stayed alive.

Firmware Upgrade

The motherboard was supplied with version 6.04 of the firmware (I refuse to call this a “BIOS” because it just isn’t “basic” any more) whereas the latest was 7.09. The process is fairly simple :-

  1. Download the relevant firmware version from https://www.asrock.com/mb/AMD/TRX50%20WS/index.asp#BIOS.
  2. Save it to a FAT32 USB disk – I used a vfat formatted disk and I have a sneaking suspicion that exFAT will work too. The “Instant Flash” instructions by ASRock are obviously somewhat dated – it even mentions that saving to a floppy disk will work!
  3. Reboot the system and start the UEFI firmware. Select “Tools” and “Instant Flash”.
  4. Follow the on-screen instructions.

If you’re replacing a motherboard you won’t need detailed instructions here, but it is worth mentioning that the process takes a couple of reboots, and the second involves doing that memory calibration thing, so it takes an unusually long time to start.

I didn’t go to the effort to time the whole process, but my system went down at 18:04 and was back up at 18:15. So roughly 10 minutes.

SlimSAS

I’m not currently 100% confident about this as I haven’t plugged anything in yet (ignoring a failed attempt when I assumed it would just work), but the SlimSAS ports can be configured for SATA mode in the firmware. Just go to Advanced, Chipset, scroll to the end of the list past the PCIe slot configuration settings and set :-

  1. SLIMSAS1 Mode: SATA
  2. SLIMSAS2 Mode: SATA

Firmware Settings

The following settings are what I chose to set based on a very quick session searching DuckDuckGo for explanations. The built-in documentation is somewhat lacking although there are URLs (encoded as QR codes) for more details. This is one area where firmware authors should pay more attention – even if they just hinted which settings work best for Windows, which work best for Linux, and which ones are for compatibility with older hardware.

The choices I’ve made may not be the best, but it seems to be working. Some of the explanations may be off, so I’d welcome corrections. All of these settings are found under the “Advanced” tab of the firmware page :-

CPU Configuration

  1. SMT: Or “hyperthreading”. It is possible some scientific computing workloads might work better with this turned off, but my recommendation is to leave it to “Auto”.
  2. CPB – Core performance boost: presumably allows one core to accelerate when other cores are idle. Left on “Auto”.
  3. Global C-State control: related to power-saving. There’s a suggestion that disabling this may result in extra stability. Disabled.
  4. Local APIC Mode: controls how the APIC appears to the operating system with choices of Auto, Compatible, xAPIC, or x2APIC. Supposedly x2APIC allows for greater efficiency on higher core counts. Set to x2APIC.
  5. L1 Stream HW Prefetcher: Enables or disables pre-fetching memory into cache. Enabled.
  6. L2 Stream HW Prefetcher: Enables or disables pre-fetching memory into cache. Enabled.
  7. SMEE (SME?): Secure memory (i.e. encrypted) for virtual machines. Not likely to make much difference in my case as I’m the exclusive owner of both the “host” and all of the virtual machines running on it. Left as “Auto”.
  8. SEV-ES ASID Space Limit Control: More on virtual machine security. Left on Auto.
  9. SVM mode: This option seemed to disappear on the upgrade to 7.09. If this does appear, enable it.
  10. ROM Armor: protection for SPI flash. Left as Enabled.

Chipset

  1. IOMMU: virtual machine I/O virtualisation to allow PCIe pass-through to a virtual machine. Enabled.
  2. ACS: More I/O virtualisation. Suggestions hinting at allowing PCIe←→PCIe transfers. Some hints at better IOMMU set up. Enabled.
  3. Enable AER Cap: PCIe error handling. Presumably disabling Linux AER error handling. Disabled.
  4. PCIe ARI Support: Enables support for ARI which allows a device to more easily support pretending to be multiple devices (so a graphics card could be shared amongst multiple virtual machines). Although card support for this is probably quite rare, I enabled it anyway.
  5. PCIe Ten Bit Tag Support: Allows a supporting device to use greater bandwidth and lower latency. Enabled.
  6. NUMA node(s) per socket: It is suggested that this allows the processor’s CCXes (the ‘core complex’ that appears as individual chiplets in an AMD processor) to operate as separate NUMA nodes. Set to NPS4.
  7. ACPI SRAT L3 Cache as NUMA domain: It is suggested that this also allows each CCX to function as a NUMA node. Enabled.
  8. TSME: Or Transparent SME. Support for SME is done by the firmware rather than the OS. Disabled.
  9. HPET: High Precision Event Timer. Enables support for a newer way of doing timing. Enabled.
  10. … (missing details because they weren’t of interest to me)
  11. SLIMSAS1 Mode/SLIMSAS2 Mode: As mentioned previously, allows switching the SlimSAS ports from supporting NVME devices to supporting SATA devices. Switched to SATA mode!
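
One way to see from Linux whether the NUMA-related settings above have had any effect is to look at what the kernel reports under /sys/devices/system/node. The following is a minimal Python sketch; the function name numa_nodes is made up, and it makes no assumption about how many nodes any particular setting should produce.

```python
from pathlib import Path


def numa_nodes(base="/sys/devices/system/node"):
    """Map each NUMA node name to its CPU list as reported by the kernel."""
    nodes = {}
    # Node directories are named node0, node1, ...; other entries are skipped.
    for node_dir in sorted(Path(base).glob("node[0-9]*")):
        cpulist = node_dir / "cpulist"
        if cpulist.is_file():
            nodes[node_dir.name] = cpulist.read_text().strip()
    return nodes


# Prints nothing on systems without this sysfs directory.
for name, cpus in numa_nodes().items():
    print(name, cpus)
```

If only node0 shows up, the operating system is seeing a single NUMA node regardless of what was set in the firmware.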

PCI

  1. PCI latency timer: How many clock cycles a 32-bit PCI card can hang onto the bus for. Leave alone (32 cycles).
  2. PCI-X latency timer: How many clock cycles a 64-bit PCI-X card can hang onto the bus for. Leave alone.
  3. VGA Palette Snoop: Whether to allow other cards to snoop on the VGA palette which is used by older cards for video encoding and the like. Disabled.
  4. PERR# Generation: Something to do with PCIe card errors. Left alone.
  5. SERR# Generation: Something to do with PCIe card errors. Left alone.
  6. Above 4G Decoding: Allows cards to specify a 64-bit address to house their memory window. Enabled.
  7. Re-size BAR Support: Allows a card to negotiate a larger address window than the default of 256Mbytes. Enabled.
  8. SR-IOV Support: Where PCIe cards allow, enables the creation of virtual devices to be allocated to virtual machines. Enabled.
  9. BME DMA Mitigation: Re-enable Bus Master Attribute after SMM is locked. Whatever that means! Left disabled.
Feb 12 2024
 

So two days ago, I upgraded my main workstation to Ubuntu 23.10; a few little issues (mostly related to my own scripts), but nothing serious. Yet.

On the following day, my smart TV box started misbehaving. It couldn’t see any of the videos NFS mounted from my workstation, ITVX threw up a website error (this should have been a clue), but Youtube worked fine (which showed that the network was working fine).

So I did the obvious thing and started checking the NFS parameters to see if anything had changed. Nothing definite but on the way I noticed that the TV box wasn’t getting an IPv4 address from the dhcp server; IPv6 was working fine but some services don’t work on an IPv6 network.

I foolishly assumed that the TV box had stopped requesting addresses via dhcp – backed by the dhcp logs which showed no requests had been logged since the previous day. Set a static address, and everything sprang into life (except for ITVX who seem to have decided that only approved TV boxes should be allowed to run their code).

Later that same day, I upgraded a switch which failed to come back (“Failed to adopt”) which caused a daisy-chained wireless access point to disappear (“Failed to adopt”). And then a little while later, a second unconnected wireless access point also disappeared.

After a few reboots of the switch (and access points), I finally checked the dhcp server and found that its root filesystem had become ‘read-only’. But that wasn’t the end of the misdiagnosis …

I assumed that the SD card in my dhcp server (a tiny ARM box) was fried, so made arrangements to backup the contents, buy a couple of replacements, and try a spare (which was broken). After the spare turned out to be broken, I ran fsck on the root filesystem of the original and a whole bunch of errors were fixed.

Re-installed into the ARM box, and everything sprang to life again.

I guess the moral of the story is that you should check the basic services before diving into making assumptions.
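
As an example of the kind of basic check that would have saved me some time, here is a minimal Python sketch that spots read-only mounts in text laid out like /proc/mounts. The function name read_only_mounts and the sample lines are made up for illustration.

```python
def read_only_mounts(mounts_text):
    """Return mount points whose options include 'ro'."""
    result = []
    for line in mounts_text.splitlines():
        # Fields: device, mount point, filesystem type, options, dump, pass.
        fields = line.split()
        if len(fields) < 4:
            continue
        mount_point, options = fields[1], fields[3].split(",")
        if "ro" in options:
            result.append(mount_point)
    return result


# Hypothetical sample in /proc/mounts layout, not real output.
sample = (
    "/dev/mmcblk0p2 / ext4 ro,relatime 0 0\n"
    "tmpfs /run tmpfs rw,nosuid,nodev 0 0\n"
)
print(read_only_mounts(sample))
```

On a real system you would pass in the contents of /proc/mounts. Some filesystems are legitimately mounted read-only, so it is an unexpected “/” in the result that is the warning sign.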

Upended Cannon
Jan 14 2024
 

Just seen a video title about how Linux defeated UNIX™; it is quite hard to dispute this given that Linux is alive, well, and thriving. But I would argue that it isn’t quite true.

First of all, UNIX™ is technically alive as Solaris, HP-UX and AIX are still active. And there may well be rarer versions out there – and I’m excluding operating systems that meet the trademark requirements but aren’t really “Unix” (we could argue all day about what is and what isn’t “Unix”).

But the market for UNIX™ machines is a great deal smaller than it used to be. And why is that? I would argue that whilst Linux made the transition easier, it isn’t the real reason why many organisations swapped out their high-priced machines for cheaper machines.

And that gives a bit of a clue. Whilst the high-priced machines from Sun, SGI, HP, IBM, Digital, etc. weren’t over-priced, they were expensive. The hardware was built to be exceptionally reliable – for example some of the Suns I worked with could deal with a processor failure by simply turning off that processor and letting an engineer replace the board all whilst the system was up and running.

No, what “killed” those expensive UNIX™ machines was virtualisation and the use of commodity hardware. If a modern server dies, the virtual servers running on it are simply migrated to a working server, suffering at worst a reboot (but probably not).

Plus there was a realisation that not everything needed to be continually available.

Through The Gateway