No ads? Contribute with BitCoins: 16hQid2ddoCwHDWN9NdSnARAfdXc2Shnoa
Aug 152014
 

This post is likely to change frequently after it first appears as experiments/research/etc. occurs to me.

To get to grips with it, let’s first define the software that is in control of your PC when it first starts as the system firmware rather than the BIOS, as system firmware is a generic term which can refer to BIOS, EFI, UEFI, OpenFirmware, or anything else that someone comes up with.

UEFI (and the older EFI) is a replacement for the legacy BIOS that we’ve been stuck with for decades. Despite advances in almost every aspect of the “PC standard”, the BIOS has hardly advanced at all. To be fair, the user interface has gotten a bit less 1980s, and it of course deals with hardware devices that weren’t even imagined decades ago. But there are still some rather nasty limitations, which UEFI is supposed to resolve.

Of course there are those who claim that UEFI is a complete mess with no redeeming qualities, but the truth is probably somewhere between it being that and the neatest bit of system firmware ever invented.

This posting are my working notes (with the addition of a bit of pointless waffling) on UEFI, given that I’m likely to be found staring at a UEFI shell on a broken server at some point in the future. All experiments (so far) have been carried out with a VirtualBox-powered virtual machine (with UEFI turned on) running Ubuntu server.

Installation

The boot process looks a little different … unsurprisingly. And there’s an error in relation to missing UEFI stuff from the hard disk. But once the CD is booted the process looks the same …

Partitioning: Initially “Guided – use entire disk and set up LVM”

Doesn’t give an opportunity to review the partitioning, but :-

  1. EFISystem (/boot/efi) is roughly 512Mbytes
  2. /boot is roughly 256Mbytes
  3. LVM Volume Group

The Ubuntu installation uses grub as the boot loader, although it’s hardly the only option and there are hints that grub has been known to have issues with EFI. Although this could be “early adopters pain”, and not applicable currently.

GPT Partitions

See the Wikipedia article, but basically GPT replaced the old Master Boot Record partitions. The main advantage is that there are fewer dumb limits with GPT – there is no limit of 4 primary partitions (and no hacks to support extended partitions), but instead a minimum maximum of 128 partitions, which basically translates as the required minimum size of a partition table allows for up to 128 partitions but the partition table can be bigger.  More than you’re likely to need anyway.

Cleverly (if you want to put it that way), GPT can live alongside the old MBR partitions as the GPT starts at block 1 rather than block 0. This has been used by Apple to keep two partition tables so that OSX can use GPT whilst Windows still uses MBR.

Keeping two partition tables in sync is perhaps not the best idea for stability and given the need for backwards compatibility has only a limited useful lifetime, I’d rather live without it. In fact it would be nice if I could fill up the old MBR with stuff that told all MBR tools not to mess around with the partition tables.

In theory, (U)EFI and GPT are independent of each other, but in practice, GPT implies booting from UEFI and MBR implies booting from BIOS (some UEFI implementation switch to a BIOS-compatibility mode when they see an MBR).

The EFI System Partition

(U)EFI requires that to boot, a disk must have an EFI System Partition formatted as FAT. This is used as effectively a replacement for the BIOS boot method of loading code from block 0. This file system is usually mounted under Linux as /boot/efi and contains various files that allow Linux to be booted (more details to be added).

It is perhaps a shame that the EFI standards people didn’t suggest making the EFI system partition part of the on-board firmware. It wouldn’t be impossible (or very expensive) to incorporate a small writeable FAT file system into the motherboard to avoid the need for the EFI partition on one of the storage disks. It is not as if the EFI system partition needs to be very big – Ubuntu configured one as 512Mbytes in size which is vastly larger than required for what is actually installed which adds up to less than 4Mbytes.

EFIBOOTMGR

The tool efibootmgr is for interacting with the EFI boot manager and the EFI variables that control what gets booted in which order :-

# efibootmgr
BootCurrent: 0003
BootOrder: 0003,0000,0001,0002
Boot0000* EFI DVD/CDROM
Boot0001* EFI Hard Drive
Boot0002* EFI Internal Shell
Boot0003* ubuntu

The command has plenty of other options …

You can set the boot order with: efibootmgr -o 0002,0003,0001,0000. So I’ve set the preferred boot order to include the internal shell first … the purpose here is to look into EFI after all.

I also somehow managed to erase the “ubuntu” so re-created that with: efibootmgr -c -L UbuntuServer -l “\EFI\ubuntu\shimx64.efi” (yes unfortunately they use the wrong path separator).

EFI Shell

Is surprisingly limited. And rather too DOS-like for me.

Command Description
help Displays a list of the commands available … which very unhelpfully doesn’t pause at the end of the screen. Can also display additional details of a command if you try help command.
mode Displays a list of the available screen mode commands.
mode x y Sets the screen mode to the size specified (as listed with mode).
cls Clears the screen.
cls ${n} Clears the screen and sets the colours to a set specified by the number (0-7). Don’t bother; most of the choices are nasty.
map Displays teh mapping table showing the block devices (BLK${n}) and the recognised file systems (FS${n}). If you don’t have fs0 you’ve got problems!
fs0: Sets the specified file system as the current file system. This will change the prompt appropriately.
cd ${directory} Changes to the specified directory. Remember that the path separator is the DOS preferred character (\).
ls Lists files in the current directory.
edit ${filename} Edits the specified file (^S saves, ^Q quits), but don’t try editing files ending in .efi!
${filename}.efi Runs the EFI binary from the current directory, and yes that does mean you can boot Ubuntu Server by browsing to the \EFI\ubuntu directory and entering shimx64.efi (as I discovered after breaking the ubuntu boot option).
Aug 022014
 

One of the questions I always ask myself when setting up a resilient server, is just how well will it cope with a disk failure? Ultimately you cannot answer that without trying it out.

But as practice (and to determine whether it mostly works), it’s perfectly sensible to try it out on a virtual machine.

Debian Installation

If you are looking for full instructions on installing Debian, this is not the place to look. I configured the virtual machine with 2GBytes of memory, an LsiLogic SAS controller with two attached disks each of 64GBytes.

The installation process was much as per normal (I unselected “Desktop” to save time), but the storage was somewhat different :-

  • Manual partitioning method
  • Create an empty partition on both disks
  • Select Software RAID
  • Create an MD device
  • RAID1
  • And put both disks into the RAID
  • Configure LVM
  • Create a Volume Group (“sys”)
  • Select md0 for the volume group device
  • Create logical volumes (boot: 512MB, root: 16GB, var: 8GB, home: 512M (it’s a server))
  • In the partitioning manager select each Logical Volume in turn and specify the file system parameters.

You will notice that no swap was created – this was a mistake that I’m in the unfortunate habit of making! However for a test, it wasn’t a problem and with LVM it is possible to create swap after the installation.

Post Installation

After the server has booted, it is possible to check the second hard disk for the presence of grub in the MBR (dd if=/dev/sdb of=/var/tmp/sdb.boot bs=1M count=1, and then run strings on the result). It turns out that nothing is installed in the MBR of the second disk by default. Which would make booting in a degraded environment an interesting challenge (i.e. you’ll have to find a rescue CD and boot off the relevant hard disk).

However this can be fixed by installing grub onto the second hard disk: grub-install /dev/sdb

Testing Resilience

But what happens when you lose a disk? Now is the time to test. Shut down the virtual machine and remove the second hard disk – leaving the first hard disk in place does not provide a full test.

If your first attempt at booting afterwards results in a failure to acquire a grub menu, then either you have failed to run grub-install as detailed above (guess what mistake I made?), or your BIOS settings don’t permit the computer to boot off anything other than the first hard disk.

However, in my second attempt, the server booted normally with the addition of a few messages that indicate that there is just one disk making up the mirrored pair.

Summary

  1. Yes, you can put /boot onto an LVM file system that sits on mirrored disks. That hasn’t always been the case.
  2. It is still necessary to run grub-install to put Grub onto the MBR of the second hard disk.
  3. It works.
Jun 262014
 

Came across a hint today about reporting on ECC memory errors. For those who do not know, ECC memory detects memory errors and corrects correctable errors. Normal memory (as found in almost all laptops and desktops) simply ignores the errors and lets them accumulate and cause problems either with data corruption or by causing software errors.

As I happen to have ECC memory in my desktop machine I thought I would have a look into the hint. Turns out that Linux does not report on ECC events automatically; you need to install the relevant EDAC (Error Detection and Correction) tools. Which for Debian, turns out to be pretty simple :-

# apt-get install edac-utils

As part of the installation process, a daemon process is started. But for whatever reason, it didn’t automatically detect what driver to load. So I edited /etc/default/edac and added :-

EDAC_DRIVER=amd64_edac_mod

Once that is done, a simple /etc/init.d/edac restart loads the driver and starts monitoring. Messages should appear in your log files (/var/log/messages) and reports can be displayed with edac-util :-

# edac-util --report=full 
mc0:csrow0:mc#0csrow#0channel#0:CE:0
mc0:csrow0:mc#0csrow#0channel#1:CE:0
mc0:csrow1:mc#0csrow#1channel#0:CE:0
mc0:csrow1:mc#0csrow#1channel#1:CE:0
mc0:csrow2:mc#0csrow#2channel#0:CE:0
mc0:csrow2:mc#0csrow#2channel#1:CE:0
mc0:csrow3:mc#0csrow#3channel#0:CE:0
mc0:csrow3:mc#0csrow#3channel#1:CE:0
mc0:noinfo:all:UE:0
mc0:noinfo:all:CE:0

Of course memory errors are relatively rare (or at least should be) so it may take months before any error is reported.

Jan 162014
 

This is not original work, but merely a set of notes on how to do the set up. The core information (and the code) came from this blog posting.

Essentially I’ve re-ordered the steps in which to work and excluded anything other than the bare essentials to get it all working. With the intention I can get my missile launcher working at home and at work  😎

Step 1 is to prevent the HID driver from clamping on to the missile launcher. This was done by :-

  1. Editing /etc/default/grub and adding usbhid.quirks=0x2123:0x1010:0x04 to the existing variable GRUB_CMDLINE_LINUX_DEFAULT.
  2. Run update-grub (I always manage to forget this).
  3. Reboot the machine and check /var/log/messages for 2123 (the VendorID) to see if it has been claimed by usbhid (which will show up as a line beginning generic-usb 0003:2123:1010.0006: hiddev0 if it does claim it).

The next step is to download and compile the code given in the blog link above. If you need instructions on how to do this, then you probably need to look elsewhere – it builds easily.

Once built, an sudo insmod launcher_driver.ko will verify that the kernel module loads – you can double check by looking at /var/log/messages.

It’s also necessary to install both the kernel driver and the control program manually :-

  1. Copy the compiled kernel module to /lib/modulessudo cp  launcher_driver.ko /lib/modules
  2. Edit /etc/rc.local and add the command: /sbin/insmod /lib/modules/launcher_driver.ko
  3. Copy the control program to a sensible location: sudo install launcher_control /opt/bin

There’s probably better ways of doing this, and better places to stick things but as you’re following my instructions you’re stuck with my suggestions! It’s tempting to try a reboot at this stage to verify that this works, but as there’s just one small extra step we may as well get that done too. This is to create a udev rule to set up a device file in /dev.

Create a file (/etc/udev/rules.d/99-usb-launcher.rules) with the following contents :-

KERNEL=="launcher?*",MODE="0660",GROUP="cdrom"

The choice of group name is rather inappropriate except it will work well enough, and I have changed the permissions on this to something a little more restrictive. This can be tested with sudo udevadm trigger which will re-run udev. This should change the permissions on any existing /dev/launcher* file(s). If it doesn’t work, the blog pointer above is the place to head.

Lastly, there’s a couple of corrections to the launcher_control.c that is convenient to make :-

% diff launcher_control.c launcher_control.c.orig
63c63
<         while ((c = getopt(argc, argv, "m:lrudfsht:")) != -1) {
---
>         while ((c = getopt(argc, argv, "mlrudfsht:")) != -1) {
97,98c97
< 		fprintf(stderr, "Couldn't open file: %s\n", dev);
<                 /*perror("Couldn't open file: %m");*/
---
>                 perror("Couldn't open file: %m");

 

Oct 232013
 

Crazy experiment time. What happens when you have a disk with 100 partitions? The replacement for the old MBR standard for partitions on PC hardware is slowly being replaced with GUID partitions. The later increased the maximum number of partitions to 128 which is probably far more than anyone needs, but what happens when you have a disk with 100 partitions?

As it happens, I had a spare external drive to play with, so set something up :-

for x in {1..99}     
do
  parted /dev/sdc mkpart FAT $(($x * 100)) $((x * 100 + 99))
  mkfs -t vfat /dev/sdc${x} 
done

This took a surprising amount of time to run with two interesting effects :-

  1. The mkfs tool refused to make a filesystem on /dev/sdc16 and /dev/sdc80 as it claimed it would be creating a filesystem on a full disk device. I suspect that this is a bug due to simplistic assumption of what constitutes a full disk device based on minor device numbers (/dev/sdc16 happened to be 0 and /dev/sdc80 happened to be 64). This could probably be solved by using device nodes within /dev/disk/by-${something}/${whatever}.
  2. The Unity Launcher appeared to attempt to populate itself with the new filesystems as they were being created, but very rapidly decided not to bother. This happened several times.

Once the creation process was complete, I reconnected the external drive to my Ubuntu machine, and yes the launcher does contain a ton of hard disk icons. The launcher is still full functional, but having a hundred (or so) devices below the normal icons does make using it a little clumsy.

Fortunately it did not mount all the filesystems automatically – closing that many windows would be very tedious. Mounting them all via a file manager window was pretty tedious, but it worked :-

/dev/sdc56             95M     0   95M   0% /media/mike/8663-39C5
/dev/sdc65             95M     0   95M   0% /media/mike/8673-0919
/dev/sdc71             95M     0   95M   0% /media/mike/873E-FEE7
/dev/sdc72             95M     0   95M   0% /media/mike/8741-47B3
/dev/sdc79             95M     0   95M   0% /media/mike/874D-4B53
/dev/sdc81             95M     0   95M   0% /media/mike/874E-D280
/dev/sdc82             95M     0   95M   0% /media/mike/8752-1ACE
/dev/sdc83             95M     0   95M   0% /media/mike/8754-2562
/dev/sdc84             95M     0   95M   0% /media/mike/8755-D262
/dev/sdc86             95M     0   95M   0% /media/mike/8759-0D82
/dev/sdc87             95M     0   95M   0% /media/mike/875A-E5C5
/dev/sdc89             95M     0   95M   0% /media/mike/875E-035B
/dev/sdc92             95M     0   95M   0% /media/mike/8763-8FB5
/dev/sdc93             95M     0   95M   0% /media/mike/8765-7A2F
/dev/sdc94             95M     0   95M   0% /media/mike/8767-1DBC
/dev/sdc95             95M     0   95M   0% /media/mike/8768-D314
/dev/sdc96             95M     0   95M   0% /media/mike/876A-A46E
/dev/sdc97             95M     0   95M   0% /media/mike/876B-F064
/dev/sdc98             95M     0   95M   0% /media/mike/876D-9D90
/dev/sdc58             95M     0   95M   0% /media/mike/8666-B9AA
/dev/sdc61             94M     0   94M   0% /media/mike/866B-8EFA
/dev/sdc62             95M     0   95M   0% /media/mike/866D-1726
/dev/sdc64             95M     0   95M   0% /media/mike/8671-5EE1
/dev/sdc66             95M     0   95M   0% /media/mike/8736-C2F5
/dev/sdc67             95M     0   95M   0% /media/mike/8737-EE95
/dev/sdc68             95M     0   95M   0% /media/mike/8739-7213
/dev/sdc69             94M     0   94M   0% /media/mike/873B-181F
/dev/sdc70             95M     0   95M   0% /media/mike/873C-E80C
/dev/sdc73             95M     0   95M   0% /media/mike/8743-11E7
/dev/sdc74             95M     0   95M   0% /media/mike/8745-28A8
/dev/sdc75             95M     0   95M   0% /media/mike/8746-CA94
/dev/sdc77             95M     0   95M   0% /media/mike/874A-1D30
/dev/sdc78             95M     0   95M   0% /media/mike/874B-C1C7
/dev/sdc85             95M     0   95M   0% /media/mike/8757-77A0
/dev/sdc88             94M     0   94M   0% /media/mike/875C-6DF9
/dev/sdc90             95M     0   95M   0% /media/mike/8760-8FD5
/dev/sdc91             94M     0   94M   0% /media/mike/8762-01DA
/dev/sdc99             94M     0   94M   0% /media/mike/8770-0F74
/dev/sdc1              93M     0   93M   0% /media/mike/8609-229A
/dev/sdc17             95M     0   95M   0% /media/mike/8621-921D
/dev/sdc21             95M     0   95M   0% /media/mike/8628-8CDB
/dev/sdc22             95M     0   95M   0% /media/mike/862A-2217
/dev/sdc23             94M     0   94M   0% /media/mike/862B-8EF9
/dev/sdc25             95M     0   95M   0% /media/mike/862F-0BE5
/dev/sdc27             95M     0   95M   0% /media/mike/8633-1F9D
/dev/sdc28             95M     0   95M   0% /media/mike/8634-A26F
/dev/sdc34             95M     0   95M   0% /media/mike/863E-14EB
/dev/sdc37             95M     0   95M   0% /media/mike/8643-1F63
/dev/sdc4              95M     0   95M   0% /media/mike/860D-2753
/dev/sdc40             95M     0   95M   0% /media/mike/8647-8E49
/dev/sdc41             95M     0   95M   0% /media/mike/8649-033D
/dev/sdc42             94M     0   94M   0% /media/mike/864A-A12A
/dev/sdc43             95M     0   95M   0% /media/mike/864C-6EEF
/dev/sdc44             95M     0   95M   0% /media/mike/864E-3469
/dev/sdc45             95M     0   95M   0% /media/mike/8650-8796
/dev/sdc46             95M     0   95M   0% /media/mike/8652-64DF
/dev/sdc47             95M     0   95M   0% /media/mike/8653-F743
/dev/sdc48             95M     0   95M   0% /media/mike/8655-B14B
/dev/sdc49             95M     0   95M   0% /media/mike/8657-34FF
/dev/sdc5              95M     0   95M   0% /media/mike/860E-EBD7
/dev/sdc50             94M     0   94M   0% /media/mike/8658-A04A
/dev/sdc51             95M     0   95M   0% /media/mike/865A-D4D3
/dev/sdc52             95M     0   95M   0% /media/mike/865C-33D1
/dev/sdc53             95M     0   95M   0% /media/mike/865D-FA56
/dev/sdc54             95M     0   95M   0% /media/mike/8660-6C95
/dev/sdc55             95M     0   95M   0% /media/mike/8661-D456
/dev/sdc57             95M     0   95M   0% /media/mike/8665-0AFD
/dev/sdc59             95M     0   95M   0% /media/mike/8668-3D53
/dev/sdc6              95M     0   95M   0% /media/mike/8610-F9B0
/dev/sdc60             95M     0   95M   0% /media/mike/866A-0A0E
/dev/sdc63             95M     0   95M   0% /media/mike/866E-F6E7
/dev/sdc76             95M     0   95M   0% /media/mike/8748-8D02
/dev/sdc10             95M     0   95M   0% /media/mike/8616-B29F
/dev/sdc11             95M     0   95M   0% /media/mike/8618-6462
/dev/sdc12             94M     0   94M   0% /media/mike/861A-5208
/dev/sdc13             95M     0   95M   0% /media/mike/861B-BA6E
/dev/sdc14             95M     0   95M   0% /media/mike/861D-5133
/dev/sdc15             95M     0   95M   0% /media/mike/861E-C384
/dev/sdc18             95M     0   95M   0% /media/mike/8623-BFCF
/dev/sdc19             95M     0   95M   0% /media/mike/8625-9D85
/dev/sdc2              95M     0   95M   0% /media/mike/860A-504E
/dev/sdc20             94M     0   94M   0% /media/mike/8627-1391
/dev/sdc24             95M     0   95M   0% /media/mike/862D-457F
/dev/sdc26             95M     0   95M   0% /media/mike/8631-5F8A
/dev/sdc29             95M     0   95M   0% /media/mike/8636-2F58
/dev/sdc3              95M     0   95M   0% /media/mike/860B-8C77
/dev/sdc30             95M     0   95M   0% /media/mike/8637-F726
/dev/sdc31             94M     0   94M   0% /media/mike/8639-6B19
/dev/sdc32             95M     0   95M   0% /media/mike/863A-FBBC
/dev/sdc33             95M     0   95M   0% /media/mike/863C-AE68
/dev/sdc35             95M     0   95M   0% /media/mike/8640-3A10
/dev/sdc36             95M     0   95M   0% /media/mike/8641-93A6
/dev/sdc38             95M     0   95M   0% /media/mike/8644-AFCF
/dev/sdc39             94M     0   94M   0% /media/mike/8646-1BAE
/dev/sdc7              95M     0   95M   0% /media/mike/8612-54E8
/dev/sdc8              95M     0   95M   0% /media/mike/8613-C38C
/dev/sdc9              95M     0   95M   0% /media/mike/8615-3522

Yes I have cut the “interesting” filesystems out of that output.

Windows (7) does deal quite so well with the situation. After rebooting into Windows with the disk plugged in, the login process seemed to take longer than usual (although I don’t boot Windows enough to be sure).

Once logged in, everything seemed fine including the little popup window by the status bar saying it was configuring the plugged in disk drive. However that took longer than expected – after clicking on it for details, it took around 5 minutes to complete. At which point it stuck a red cross by the “Disk Drive” whilst it popped up an Autoplay window for drives E: through Z:. With an offer to format drive T: – so it could at least use the drive that Linux refused (by default) to format.

However except for that little red cross, there was no clear warning that it failed to do anything with nearly 80 partitions. And closing all those popup Autoplay windows was pretty tedious.

OSX (10.9) dealt a little better with the disk; it at least recognised all of the disks, and stuck up little icons for each one. And mounted them all.  However Finder didn’t seem to respond to attempts to unmount the disks … I had to resort to the command-line. Perhaps I wasn’t patient enough.

And the moral of this little crazy experiment? Whilst we can perhaps throw a little mud at Microsoft, the main lesson learnt is that you too can annoy someone using Windows by handing them an external hard disk with 100 partitions. Especially if the information they want is not in the first 20-odd partitions <Evil Grin>

Oh! And just stating the obvious – it’s a good idea to remove the partitions before putting the spare disk away, or you may encounter a nasty surprise later!

 

Sep 292013
 

As some people know, the Linux device for generating random numbers (/dev/random) blocks when there isn’t sufficient entropy to safely generate random numbers. But people will still persist on advising using /dev/urandom as an alternative :-

“To sum up, under both Linux and FreeBSD, you should use /dev/urandom, not /dev/random.”

“Just go ahead and use /dev/urandom as is”

“Oracle wants us to move /dev/random and link /dev/urandom”

“You can remove /dev/random and link that to /dev/urandom to help prevent blocking”

Now it is true that /dev/urandom is usually good enough, but to advise people to use /dev/urandom without considering whether it is sufficient or not is irresponsible. True random numbers can be very important for cryptography, and without knowing it, we use cryptography every day; such as when we browse the web, make ssh connections, check PGP keys, etc. Using a weak random number generator can weaken the cryptographic process fatally.

Aug 202013
 

Every so often I come across an old Linux box that doesn’t take kindly to being rebooted. Without console access, it is hard to see what is going on, but the Linux kernel gets stuck trying to mount the root file system. There are many possible fixes for this, but they all have one thing in common … a work-around has to be performed to get the box up and running.

The console gets stuck in a “mini-root” environment loaded when the initrd image is loaded and before the real root file system is mounted which means a lot of commands are not available, but lvm should be available. First of all, run lvm lvscan to get a list of the logical volumes that need activating :-

(initramfs) lvm lvscan
  inactive          '/dev/sys/root' [332.00 MiB] inherit
  inactive          '/dev/sys/usr' [8.38 GiB] inherit
  inactive          '/dev/sys/var' [2.79 GiB] inherit
  inactive          '/dev/sys/swap_1' [7.05 GiB] inherit
  inactive          '/dev/sys/tmp' [380.00 MiB] inherit
  inactive          '/dev/sys/home' [16.00 GiB] inherit
  inactive          '/dev/sys/opt' [24.00 GiB] inherit

For each volume group (the second column, middle word), run: lvm lvchange -ay ${volume-group-name}. In the case of my example :-

(initramfs) lvm vgchange -ay /dev/sys
  7 logical volume(s) in volume group "sys" now active

At which point you should be able to press ^D (or enter exit) to continue the boot process.

A slighter better work-around involves changing the Grub configuration to add a delay to the kernel parameters. This sections assumes that you are not using Grub Legacy!

Start by editing /etc/default/grub and changing the variable GRUB_CMDLINE_LINUX to include “rootdelay=20” :-

GRUB_CMDLINE_LINUX='console=tty0 console=ttyS0,19200n8 rootdelay=20'

Finalise by running update-grub. This adds a 20s delay to the boot process so is hardly an ideal solution.

Jul 292013
 

… which is of course massive overkill. But fun. It should increase the raw bandwidth available between the two machines from 1Gbps to 20Gbps (with one link) and 40Gbps with both links bonded. It was a bit of a surprise to me when I looked around at prices of second-hand kit to realise that InfiniBand was so much cheaper to acquire than Fibre Channel; the kit I acquired cost less than £100 all in whereas FC kit would be in the region of £1,000, and InfiniBand is generally quicker. There is of course 16Gb FC and 10Gb InfiniBand, but that is hardly comparing like with like. So what is this overkill for? Networking of course. I’ve acquired two HP InfiniBand dual link cards which means I can connect my workstation to my server :- InfiniBand Network Using dual links is of course overkill on top of overkill, but given that these cards have dual links, why not use them? And it does give a couple of experiments to try later. To prepare in advance, the following network addresses will be used :-

Server Link Number IPv4 Address IPv6 Address
A 1 10.255.0.1 AAISP:d00d::1
A 2 10.255.1.1 AAISP:d00f::1
B 1 10.255.0.254 AAISP:d00d:2
B 1 10.255.1.254 AAISP:d00f:2

Yes I have cheated for the IPv6 addresses! The first step is to configure each “server” … one is running Debian Linux, and the other is running FreeBSD.

Configuring Linux

This was subject to much delay whilst I believed that I had a problem with the InfiniBand card, but putting the card into a new desktop machine caused it to spring back to life. Either some sort of incompatibility with my old desktop (which was quite old), or some sort of problem with the BIOS settings.

Inserting the card should load the core module (mlx4_core) automatically, and spit out messages similar to the following :-

[    3.678189] mlx4_core 0000:07:00.0: irq 108 for MSI/MSI-X
[    3.678195] mlx4_core 0000:07:00.0: irq 109 for MSI/MSI-X
[    3.678199] mlx4_core 0000:07:00.0: irq 110 for MSI/MSI-X
[    3.678204] mlx4_core 0000:07:00.0: irq 111 for MSI/MSI-X
[    3.678208] mlx4_core 0000:07:00.0: irq 112 for MSI/MSI-X
[    3.678212] mlx4_core 0000:07:00.0: irq 113 for MSI/MSI-X
[    3.678216] mlx4_core 0000:07:00.0: irq 114 for MSI/MSI-X
[    3.678220] mlx4_core 0000:07:00.0: irq 115 for MSI/MSI-X
[    3.678223] mlx4_core 0000:07:00.0: irq 116 for MSI/MSI-X
[    3.678228] mlx4_core 0000:07:00.0: irq 117 for MSI/MSI-X
[    3.678232] mlx4_core 0000:07:00.0: irq 118 for MSI/MSI-X
[    3.678236] mlx4_core 0000:07:00.0: irq 119 for MSI/MSI-X
[    3.678239] mlx4_core 0000:07:00.0: irq 120 for MSI/MSI-X
[    3.678243] mlx4_core 0000:07:00.0: irq 121 for MSI/MSI-X
[    3.678247] mlx4_core 0000:07:00.0: irq 122 for MSI/MSI-X
[    3.678250] mlx4_core 0000:07:00.0: irq 123 for MSI/MSI-X
[    3.678254] mlx4_core 0000:07:00.0: irq 124 for MSI/MSI-X
[    3.678259] mlx4_core 0000:07:00.0: irq 125 for MSI/MSI-X
[    3.678263] mlx4_core 0000:07:00.0: irq 126 for MSI/MSI-X
[    3.678267] mlx4_core 0000:07:00.0: irq 127 for MSI/MSI-X
[    3.678271] mlx4_core 0000:07:00.0: irq 128 for MSI/MSI-X
[    3.678275] mlx4_core 0000:07:00.0: irq 129 for MSI/MSI-X

This is just the core driver; at this point additional modules are needed to do anything useful. You can manually load the modules with modprobe but sooner or later it is better to make sure they’re loaded automatically by adding their names to /etc/modules. The modules you want to load are :-

  1. mlx4_ib
  2. ib_umad
  3. ib_uverbs
  4. ib_ipoib

This is a minimal set necessary for networking (“IP”) rather than additional features such as SCSI. It’s generally better to start with a minimal set of features initially. At this point, it is generally a good idea to reboot to verify that things are getting closer. After a reboot, you should have one or more new network interfaces listed by ifconfig :-

ib0       Link encap:UNSPEC  HWaddr 80-00-00-48-FE-80-00-00-00-00-00-00-00-00-00-00  
          UP BROADCAST RUNNING MULTICAST  MTU:2044  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:256 
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

ib1       Link encap:UNSPEC  HWaddr 80-00-00-49-FE-80-00-00-00-00-00-00-00-00-00-00  
          UP BROADCAST RUNNING MULTICAST  MTU:2044  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:256 
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

Despite the appearance, we still have quite a way to go yet. The next step is to install some additional packages: ibutilsinfiniband-diags, and opensm. The last package is for a subnet manager which is unnecessary if you have an InfiniBand switch (but I don’t). The first step is to get opensm up and running. Edit /etc/default/opensm and change the PORTS variable to “ALL” (unless you want to restrict the managed ports, and make things more complicated). And start opensm: /etc/init.d/opensm start; update-rc.d opensm defaults.

At this point, you can configure the network addresses by editing /etc/network/interfaces. If you need help doing this, then you’re in the tech pool beyond your depth! Without something at the other end, these interfaces won’t work (obviously), so it’s time to start work on the other end …

Configuring FreeBSD

See: https://wiki.freebsd.org/InfiniBand I hadn’t had cause to build a custom kernel before, so the very first task was to use subversion to checkout a copy of the FreeBSD source code :-

svn co svn://svn0.us-east.FreeBSD.org/base/stable/9 /usr/src

Updating will of course require just: cd /usr/src && svn update. Once installed, create a symlink from /sys to /usr/src/sys if the link does not already exist: ln -s /usr/src/sys /sys

Go to the kernel configuration directory (/usr/src/sys/amd64/conf), copy the GENERIC configuration file to a new file, and edit the new file to add in certain options :-

# Infiniband stuff (locally added)
options         OFED
options         IPOIB_CM
device          ipoib
device          mlx4ib

Again, this is a minimal set that will not offer full functionality … but should be enough to get IP networking up and running. The next step is to build and install the kernel :-

make buildkernel KERNCONF=${NAME-OF-YOUR-CONFIG}; make installkernel KERNCONF=${NAME-OF-YOUR-CONFIG}

The next step is to build the “world”  :-

  1. Edit /etc/src.conf and add “WITH_OFED=’yes'” to that file.
  2. Change to /usr/src and run: make buildworld
  3. Finalise with make installworld

As it happens I had to build the user-land first, as the kernel compilation needed a new user-land feature.

After a reboot, the new network interface(s) should show up as ib0 upwards. And these can be configured with an address in exactly the same as any other network interface.

Testing The Network

A tip for making sure the interfaces you think are connected together is to configure one of the machines, send a broadcast ping to the relevant network address of each interface in turn, and run tcpdump on the other machine to verify that the packets coming down the wire match what you expect.

Below the level of IP, it is possible to run an InfiniBand ping to verify connectivity. First you need a GUID on “the server”, which can be obtained by running ibstat and looking for the “Port GUID”, which will be something like “0x0002c90200273985”. Next run ibping -S on the server.

Now on the other machine (“the client”), run ibping :-

# ibping -G 0x0002c90200273985
Pong from polio.inside.zonky.org (Lid 3): time 0.242 ms
Pong from polio.inside.zonky.org (Lid 3): time 0.153 ms
Pong from polio.inside.zonky.org (Lid 3): time 0.160 ms

The next step is to run an IP ping to one of the hosts. If that works, it is time to start looking at something that will do a reasonable attempt at a speed test.

This can be done in a variety of different ways, but I chose to use nttcp which is widely available. On one of the hosts, run nttcp -i to act as the “partner” (or server). On the sending server, run nntcp -T ${ip-address-to-test} which will give output something like :-

# nttcp -T 10.0.0.26
     Bytes  Real s   CPU s Real-MBit/s  CPU-MBit/s   Calls  Real-C/s   CPU-C/s
l  8388608    0.70    0.01     95.7975   5592.4053    2048   2923.51  170666.7
1  8388608    0.71    0.04     94.0667   1878.6950    5444   7630.87  152403.3

According to the documentation, the second line should begin with ‘r’, but for a simple speed test we can simply average the numbers in the “Real-MBit/s” to get an approximate speed. Oddly my gigabit ethernet seems to have mysteriously degraded to 100Mbps! At least it makes the InfiniBand speed slightly more impressive :-

# nttcp -T 10.255.0.2
     Bytes  Real s   CPU s Real-MBit/s  CPU-MBit/s   Calls  Real-C/s   CPU-C/s
l  8388608    0.03    0.00   2521.9415  16777.2160    2048  76963.55  512000.0
1  8388608    0.03    0.03   2206.6574   2568.6620    4032 132579.25  154329.0

Before getting into a panic over what appears to be a pretty poor result, it is worth bearing in mind that IP over InfiniBand isn’t especially efficient, and InfiniBand seems to suffer from marketing exaggeration. From what I understand, DDR’s 20Gbps signalling rate becomes 16Gbps, which in turn becomes 8.5Gbps when looking at the output of ibstatus (not ibstat) – why the halving here is a bit of a mystery, but that may become apparent later.

There has also been a hint that FreeBSD is due for a significant improvement in InfiniBand performance sometime after the release of 9.2.

As a late addition, it would appear that running OpenSM (the subnet manager) on both hosts means that when one or other is rebooting, the other can take over the duties of the subnet manager. To enable on FreeBSD, simply add opensm_enable=”YES” to the file /etc/rc.conf and reboot.

Oct 312011
 

According to a couple of articles on The Register, a couple of manufacturers are getting close to releasing ARM-based servers. The interesting thing is that the latest announcement includes details of a 64-bit version of the ARM processor, which according to some people is a precondition for using the ARM in a server.

It is not really true of course, but a 64-bit ARM will make a small number of tasks possible. It is easy to forget that 32-bit servers (of which there are still quite a few older ones around) did a pretty reasonable job whilst they were in service – there is very little that a 64-bit server can do that a 32-bit server cannot. As a concrete example, a rather elderly SPARC-based server I have access to has 8Gbytes of memory available, is running a 64-bit version of Solaris (it’s hard to find a 32-bit version of Solaris for SPARC), but of the 170 processes it is running, none occupies more than 256Mbytes of memory; by coincidence the size of processes is also no more than 256Mb.

The more important development is the introduction of virtualisation support.

The thing is that people – especially those used to the x86 world – tend to over-emphasise the importance of 64-bits. It is important as some applications do require more than 4Gbytes of memory to support – in particular applications such as large Oracle (or other DBMS) installations. But the overwhelming majority of applications actually suffer a performance penalty if re-compiled to run as 64-bit applications.

The simple fact is that if an application is perfectly happy to run as a 32-bit application with a “limited” memory space and smaller integers, it can run faster because there is less data flying around. And indeed as pointed out in the comments section of the article above, it can also use ever so slightly more electricity.

What is overlooked amongst those whose thinking is dominated by the x86 world, is that the x86-64 architecture offers two benefits over the old x86 world – a 64-bit architecture and an improved architecture with many more CPU registered. This allows for 64-bit applications in the x86 world to perform better than their 32-bit counterparts even if the applications wouldn’t normally benefit from running on a 64-bit architecture.

If the people producing operating systems for the new ARM-based servers have any sense, they will quietly create a 64-bit operating system that can transparently run many applications in 32-bit mode. Not exactly a new thing as this is what Solaris has done on 64-bit SPARC based machines for a decade. This will allow those applications that don’t require 64-bit, to gain the performance benefit of running 32-bit, whilst allowing those applications that require 64-bit to run perfectly well.

There is no real downside in running a dual word sized operating system except a minor amount of added complexity for those developers working at the C-language level.

Jun 182011
 

This is a series of notes on dealing with PC malware (viruses, worms and the like) gathered because I’m looking into it and published as a way of reminding myself about this stuff. Bear in mind that I’m not an expert but neither am I a complete dunce – I’m normally a Unix or Linux person but I’ve been keeping half an eye on Windows infections for years.

Some links to tools are contained within. However you should be aware that tool recommendations change over time; you will need to check how outdated this document is before following any recommendations blindly.

At present this blog entry is a work in progress … lots of testing needs to be done before being confident this is right.

Cleanup Process

This is not :-

  1. How to approach this forensically – if you’re dealing with an investigation, it’s a whole other ball game and you probably need professional assistance to avoid corrupting evidence.
  2. A technical guide as to which tools to use.

1. For The Ultra Cautious Or When Handling Real Important Data

The process of removal can be destructive, and in the worst cases you can end up cleaning the malware and ending up with a brick. So make an image of the hard disk as it is. Two basic ways this can be done :-

  1. Removing the hard disk from the infected machine, attaching to an appropriate machine (USB->SATA, USB->IDE converters are handy here), and making an image of the disk.
  2. Booting off a “rescue” CD on the infected machine, and imaging the hard disk to a network share of some kind. This is the preferred option.

This will be slow. So be it. Cleaning an infected PC is not going to be a quick job whatever you do. The best you can hope for is that there are many periods where you can leave it churning away and get on with something else.

2. Boot A Rescue CD

There are those who tell you that there is no need to boot off a known uninfected disk to clean an infected machine; their anti-malware/virus product can clean an infected machine “live”. There are others who claim that the only way to be sure is to boot off that disk and clean the machine that way. Both are wrong.

If you are paranoid (and in the presence of malware paranoia is fully justifiable), you will do both.

3. Boot Infected Machine and Clean

As suggested previously after booting off a rescue disk and cleaning, boot the infected machine and clean again.

Tools

The following is a list of rescue CD’s that have been suggested :-

  • UBD4Win. Has to be “built” with the assistance of an XP installation; somewhat tedious but it isn’t the end of the world. However it does need preparing in advance – building a rescue CD with the assistance of an infected machine isn’t the most sensible idea!
  • Knoppix. Graphical, pretty, feature packed, but seems to be lacking in anti-malware tools (for instance the only AV tool included is Clam).
  • Trinity Rescue Disk. Menu interface. Virus definitions update over the net; choice of Clam, F-Prot, Bitdefender, Vexira, AVast (need to obtain license key). Various other utilities.
  • F-Secure Rescue CD.

Some of the above are Windows based; some are Linux based. The choice of which to use should be based on results not whether they tickle your prejudices (or mine!).

The following is a list of “live” tools to be installed that have been suggested :-

Asides

Nothing to do with the main subject. Merely some notes worth mentioning.

It seems that at least some malware can detect it is running within a virtual environment. In some cases it ceases to do anything, and in others may try to “break out”. This indicates that analysing malware within a virtual environment may not give sensible results, and in some cases may be dangerous! That is not to say that using a virtual environment is no longer of any use, but you may need to take special case such as running the virtual environment under Linux and/or ESX rather than Windows. And be careful about negative results.

WP Facebook Auto Publish Powered By : XYZScripts.com

By continuing to use the site, you agree to the use of cookies. more information

The cookie settings on this website are set to "allow cookies" to give you the best browsing experience possible. If you continue to use this website without changing your cookie settings or you click "Accept" below then you are consenting to this.

Close