Aug 02 2014
 

One of the questions I always ask myself when setting up a resilient server is just how well it will cope with a disk failure. Ultimately you cannot answer that without trying it out.

But as practice (and to determine whether it mostly works), it’s perfectly sensible to try it out on a virtual machine.

Debian Installation

If you are looking for full instructions on installing Debian, this is not the place to look. I configured the virtual machine with 2GBytes of memory and an LsiLogic SAS controller with two attached disks of 64GBytes each.

The installation process was much as per normal (I unselected “Desktop” to save time), but the storage was somewhat different :-

  • Manual partitioning method
  • Create an empty partition on both disks
  • Select Software RAID
  • Create an MD device
  • RAID1
  • And put both disks into the RAID
  • Configure LVM
  • Create a Volume Group (“sys”)
  • Select md0 for the volume group device
  • Create logical volumes (boot: 512MB, root: 16GB, var: 8GB, home: 512MB (it’s a server))
  • In the partitioning manager select each Logical Volume in turn and specify the file system parameters.

You will notice that no swap was created – this was a mistake that I’m in the unfortunate habit of making! However for a test, it wasn’t a problem and with LVM it is possible to create swap after the installation.
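Adding the missing swap afterwards might look something like the following. This is only a sketch: lvm_swap_commands is a made-up helper name, the logical volume name "swap" and the 2G size are my assumptions, and the function merely prints the commands so they can be checked before being run as root.

```shell
#!/bin/sh
# lvm_swap_commands is a hypothetical helper: it only prints the commands,
# so they can be reviewed before being piped to "sh" as root.
# $1 = volume group, $2 = size; the logical volume name "swap" is an assumption.
lvm_swap_commands() {
    echo "lvcreate -L $2 -n swap $1"
    echo "mkswap /dev/$1/swap"
    echo "swapon /dev/$1/swap"
}

# For the "sys" volume group created during the installation
lvm_swap_commands sys 2G
```

To have the swap enabled at every boot, an entry for the new logical volume is also needed in /etc/fstab.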

Post Installation

After the server has booted, it is possible to check the second hard disk for the presence of grub in the MBR (dd if=/dev/sdb of=/var/tmp/sdb.boot bs=1M count=1, and then run strings on the result). It turns out that nothing is installed in the MBR of the second disk by default. Which would make booting in a degraded environment an interesting challenge (i.e. you’ll have to find a rescue CD and boot off the relevant hard disk).

However this can be fixed by installing grub onto the second hard disk: grub-install /dev/sdb
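The check described above can be wrapped in a small helper so that it is easy to repeat after running grub-install. A minimal sketch: has_grub_in_mbr is my own name for it, and it assumes (as the strings check does) that a disk with grub installed has the text GRUB somewhere in its first 1MB.

```shell
#!/bin/sh
# has_grub_in_mbr is a hypothetical helper name, not part of any package.
# It reads the first 1MB (2048 sectors of 512 bytes) of a disk or disk
# image and succeeds if the string "GRUB" appears in it.
has_grub_in_mbr() {
    dd if="$1" bs=512 count=2048 2>/dev/null | grep -q GRUB
}

# Example (as root): report which of the two disks still needs grub-install
# for disk in /dev/sda /dev/sdb; do
#     has_grub_in_mbr "$disk" || echo "$disk: run grub-install $disk"
# done
```

Reading through a pipe avoids leaving a temporary file in /var/tmp, and the function works just as well on a saved copy of the boot area.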

Testing Resilience

But what happens when you lose a disk? Now is the time to test. Shut down the virtual machine and remove the first hard disk – leaving the first hard disk in place does not provide a full test, because the machine would simply boot from it as before.

If your first attempt at booting afterwards results in a failure to acquire a grub menu, then either you have failed to run grub-install as detailed above (guess what mistake I made?), or your BIOS settings don’t permit the computer to boot off anything other than the first hard disk.

However, in my second attempt, the server booted normally with the addition of a few messages that indicate that there is just one disk making up the mirrored pair.
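Besides the boot messages, the state of the mirror can be confirmed from /proc/mdstat, where a missing member shows up as an underscore in the status field ([U_] rather than [UU]). The following is a rough sketch rather than a polished tool; degraded_arrays is a name I made up, and it assumes the arrays are named md followed by a number.

```shell
#!/bin/sh
# degraded_arrays is a hypothetical helper: it prints the name of every
# md array whose status field contains an underscore (a missing member).
# $1 = file to read (optional, defaults to /proc/mdstat)
degraded_arrays() {
    awk '
        /^md[0-9]+ :/      { name = $1 }     # remember the current array
        /\[[U_]*_[U_]*\]/  { print name }    # status such as [U_] or [_U]
    ' "${1:-/proc/mdstat}"
}

# Example: degraded_arrays
```

On the test machine with one disk removed, this would be expected to print md0; once the mirror has been rebuilt it should print nothing.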

Summary

  1. Yes, you can put /boot onto an LVM file system that sits on mirrored disks. That hasn’t always been the case.
  2. It is still necessary to run grub-install to put Grub onto the MBR of the second hard disk.
  3. It works.

Jun 26 2014
 

Came across a hint today about reporting on ECC memory errors. For those who do not know, ECC memory detects memory errors and corrects correctable errors. Normal memory (as found in almost all laptops and desktops) simply ignores the errors and lets them accumulate and cause problems either with data corruption or by causing software errors.

As I happen to have ECC memory in my desktop machine I thought I would have a look into the hint. Turns out that Linux does not report on ECC events automatically; you need to install the relevant EDAC (Error Detection and Correction) tools. Which for Debian, turns out to be pretty simple :-

# apt-get install edac-utils

As part of the installation process, a daemon process is started. But for whatever reason, it didn’t automatically detect what driver to load. So I edited /etc/default/edac and added :-

EDAC_DRIVER=amd64_edac_mod

Once that is done, a simple /etc/init.d/edac restart loads the driver and starts monitoring. Messages should appear in your log files (/var/log/messages) and reports can be displayed with edac-util :-

# edac-util --report=full 
mc0:csrow0:mc#0csrow#0channel#0:CE:0
mc0:csrow0:mc#0csrow#0channel#1:CE:0
mc0:csrow1:mc#0csrow#1channel#0:CE:0
mc0:csrow1:mc#0csrow#1channel#1:CE:0
mc0:csrow2:mc#0csrow#2channel#0:CE:0
mc0:csrow2:mc#0csrow#2channel#1:CE:0
mc0:csrow3:mc#0csrow#3channel#0:CE:0
mc0:csrow3:mc#0csrow#3channel#1:CE:0
mc0:noinfo:all:UE:0
mc0:noinfo:all:CE:0

Of course memory errors are relatively rare (or at least should be) so it may take months before any error is reported.
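Since it may be months between errors, it is tempting to let a script do the watching. The EDAC drivers also expose their counters under /sys/devices/system/edac/mc, one directory per memory controller with ce_count and ue_count files. A minimal sketch of totalling them follows; edac_total is a hypothetical helper name of mine, not something supplied by edac-utils.

```shell
#!/bin/sh
# edac_total is a hypothetical helper: it adds up one counter across all
# memory controllers.
# $1 = counter name (ce_count or ue_count)
# $2 = sysfs base directory (optional, mainly useful for testing)
edac_total() {
    base="${2:-/sys/devices/system/edac/mc}"
    total=0
    for f in "$base"/mc*/"$1"; do
        [ -r "$f" ] || continue          # no driver loaded, nothing to add
        total=$(( total + $(cat "$f") ))
    done
    echo "$total"
}

# Example: complain if any corrected errors have been seen since boot
# [ "$(edac_total ce_count)" -gt 0 ] && echo "ECC corrected errors detected"
```

Something like this could be run from cron so that a non-zero count gets noticed without having to remember to look.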

Jan 16 2014
 

This is not original work, but merely a set of notes on how to do the set up. The core information (and the code) came from this blog posting.

Essentially I’ve re-ordered the steps in which to work and excluded anything other than the bare essentials to get it all working, with the intention that I can get my missile launcher working at home and at work 😎

Step 1 is to prevent the HID driver from clamping on to the missile launcher. This was done by :-

  1. Editing /etc/default/grub and adding usbhid.quirks=0x2123:0x1010:0x04 to the existing variable GRUB_CMDLINE_LINUX_DEFAULT.
  2. Run update-grub (I always manage to forget this).
  3. Reboot the machine and check /var/log/messages for 2123 (the VendorID) to see if it has been claimed by usbhid (which will show up as a line beginning generic-usb 0003:2123:1010.0006: hiddev0 if it does claim it).
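For the first of those steps, the edited line in /etc/default/grub might end up looking like the fragment below. The "quiet" is only my assumption of what the variable already contained; keep whatever is there and append the quirk to it.

```shell
# /etc/default/grub (fragment)
# "quiet" is an assumed existing value; the usbhid.quirks entry is the addition.
GRUB_CMDLINE_LINUX_DEFAULT="quiet usbhid.quirks=0x2123:0x1010:0x04"
```

After update-grub and a reboot, the option should also be visible in /proc/cmdline, which is a quick way to confirm that the edit took effect.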

The next step is to download and compile the code given in the blog link above. If you need instructions on how to do this, then you probably need to look elsewhere – it builds easily.

Once built, an sudo insmod launcher_driver.ko will verify that the kernel module loads – you can double check by looking at /var/log/messages.

It’s also necessary to install both the kernel driver and the control program manually :-

  1. Copy the compiled kernel module to /lib/modules: sudo cp launcher_driver.ko /lib/modules
  2. Edit /etc/rc.local and add the command: /sbin/insmod /lib/modules/launcher_driver.ko
  3. Copy the control program to a sensible location: sudo install launcher_control /opt/bin

There’s probably better ways of doing this, and better places to stick things but as you’re following my instructions you’re stuck with my suggestions! It’s tempting to try a reboot at this stage to verify that this works, but as there’s just one small extra step we may as well get that done too. This is to create a udev rule to set up a device file in /dev.

Create a file (/etc/udev/rules.d/99-usb-launcher.rules) with the following contents :-

KERNEL=="launcher?*",MODE="0660",GROUP="cdrom"

The choice of group name is rather inappropriate except it will work well enough, and I have changed the permissions on this to something a little more restrictive. This can be tested with sudo udevadm trigger which will re-run udev. This should change the permissions on any existing /dev/launcher* file(s). If it doesn’t work, the blog pointer above is the place to head.

Lastly, there’s a couple of corrections to launcher_control.c that are convenient to make :-

% diff launcher_control.c launcher_control.c.orig
63c63
<         while ((c = getopt(argc, argv, "m:lrudfsht:")) != -1) {
---
>         while ((c = getopt(argc, argv, "mlrudfsht:")) != -1) {
97,98c97
< 		fprintf(stderr, "Couldn't open file: %s\n", dev);
<                 /*perror("Couldn't open file: %m");*/
---
>                 perror("Couldn't open file: %m");

 

Oct 13 2013
 

I discovered this cool feature of Linux quite by accident. zRAM is a block device (i.e. a “disk”) where the contents are compressed and stored in memory, which makes it sound rather mundane and hardly very interesting. However in use, it does appear to be quite nifty; sufficiently so that Google are enabling it for Chrome OS. So why?

The way that it is usually configured is as a swap space … so in effect, zRAM is used to compress normal memory, trading processor utilisation for more memory. What should happen is that instead of hitting the performance brick wall of suddenly paging to disk when you hit the memory limits of your machine, the zRAM is used instead, eating a bit of processor time but with any luck keeping everything within memory rather than going to disk. It should have no effect during normal operation, but during temporary surges of memory utilisation, it should allow things to proceed at more or less normal performance.

That’s the theory anyway; but if it were not the case would Google be enabling it by default?

Of course in addition to using it as a swap device, there are other possible uses for zRAM devices :-

  1. As an L2ARC cache device for those using ZFS.
  2. To use as a block device for very hot disk spots in examples such as Exim’s retry database – which can be safely discarded on reboot.
  3. Or any other cache whose contents can be safely discarded at any point.

The last point is worth remembering. Because zRAM devices are contained within main memory, their contents are discarded when the power goes away.

Configuration

To use zRAM, we need to load the zRAM module, and choose how many devices to make at the same time. Some people believe that it makes sense to create as many devices as you have cores, as that gives each core (or thread) a device to spend its time compressing. To do this, we add the following to the /etc/rc.local file (assuming a Debian system) :-

/sbin/modprobe zram zram_num_devices=$(grep -c '^processor' /proc/cpuinfo)

By default the zRAM will allocate 25% of the main memory to all of the zRAM devices; personally I think that is reasonable enough. However it seems that as soon as you set the number of devices, the size defaults to zero … so we have to set the size of the device as we configure it. Once created, you will have to decide how to use the devices. In my case, I wanted to use half of the devices for swap and half for L2ARC, which I did by adding the following to /etc/rc.local :-

# Size of each zRAM device in bytes: a quarter of total memory (MemTotal is
# in kB), divided by the number of processors (one device per processor).
size=$(( ($(awk '/^MemTotal/ {print $2}' /proc/meminfo) * 1024) / (4 * $(grep -c '^processor' /proc/cpuinfo)) ))
for dev in /dev/zram*
do
  base=$(basename "$dev")
  echo "$size" > "/sys/block/${base}/disksize"
  # Even-numbered devices become swap; odd-numbered devices become L2ARC
  odd=$(( ${base#zram} % 2 ))
  if [ "$odd" = 0 ]
  then
    /sbin/mkswap "$dev"
    /sbin/swapon -p 32767 "$dev"
  else
    # Drop any cache entry left over from the previous boot before re-adding
    zpool remove pool0 "$dev" > /dev/null 2>&1
    zpool add pool0 cache "$dev"
  fi
done

This is a rather complex way of doing it, and doesn’t contain much in the way of error checking, but it does work.
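To make the size calculation a little less opaque, here is the same arithmetic pulled out into a function with a worked example. zram_device_size is just a name I have picked for illustration; it takes the MemTotal figure (in kB) and the number of devices as arguments rather than reading them from /proc.

```shell
#!/bin/sh
# zram_device_size is a hypothetical helper: given MemTotal in kB and the
# number of devices, print the size in bytes of each device so that all of
# the devices together add up to a quarter of memory.
zram_device_size() {
    echo $(( ($1 * 1024) / (4 * $2) ))
}

# Worked example: 8GB of memory (8388608 kB) and 4 processors
zram_device_size 8388608 4    # prints 536870912 (512MB per device)
```

Four devices of 512MB come to 2GB, which is the 25% of an 8GB machine that zRAM would have used by default.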

Sep 29 2013
 

I know … we’re all supposed to use graphical music players these days. I tried … honest, but I just couldn’t find one I liked well enough. This one had a habit of crashing randomly, that one was too database driven, this one was worried too much about a good interface for streaming music, that one liked play lists too much.

What I need in an audio player is :-

  1. The ability to play, if not a wide choice of audio codecs, then at least the codec I use for encoding audio (FLAC) plus the codecs for audio that gets downloaded – MP3 and OGG.
  2. The ability to play audio files from the filesystem without imposing some sort of database driven interface – specifically it shouldn’t say “Hey! I’ve noticed that your media files have changed; I’ll just spend 20 minutes rebuilding the database before I’ll let you play anything”.
  3. To start quickly and to quickly let me pick the audio files I want to play. Spending time figuring out what I was doing last time is unnecessary.

Note the lack of any fancy graphical interface or the requirement to plug in extras such as a link to last.fm; such features are fine, but for me unnecessary.

So I found moc (Wikipedia link because the official website was broken when I wrote this) … a “command-line” (actually text screen based) music player with extensive configuration options. This is not so much a review as a discovery of the configuration options …

The first thing to find out is what file moc uses for configuration. Simple: ~/.moc/config. By making changes to this, and restarting moc, I was able to make gradual improvements. The options below are listed in the order I discovered them, helped greatly by reading the example configuration in /usr/share/doc/moc/examples/config.example.gz.

First, I wanted moc to start off by automatically changing to a specific directory :-

MusicDir = /media/ibox/albums
StartInMusicDir = yes

Next, turn off the use of mmap() as it is apparently slow on NFS and my music files are on an NFS server :-

UseMmap = no

The end result is a simple player that works in a terminal window.