Apr 102019
 

So earlier today, I had a need to mount a disk image from a virtual machine on the host, and discovered a “new” method before remembering I’d made notes on this in the past. So I’m recording the details in the probably vain hope that I’ll remember this post in the future.

The first thing to do is to add an option to include partition support in the relevant kernel module, which I’ve done by adding a line to /etc/modprobe.d/etc-modules-parameters.conf :-

options nbd max_part=63

The next step is to load the module:

# modprobe nbd

The next is to use a Qemu tool to connect a disk image to a network block device :-

# qemu-nbd -r -c /dev/nbd0 /home/mike/lib/virtual-machine-disks/W10.vdi
# ls /dev/nbd0*
/dev/nbd0  /dev/nbd0p1  /dev/nbd0p2  /dev/nbd0p3

And next mount the relevant partition :-

# mount -o ro /dev/nbd0p2 /mnt

All done! Except for un-mounting it and finally disconnecting the network block device :-

# umount /mnt
# ls /dev/nbd0*
/dev/nbd0  /dev/nbd0p1  /dev/nbd0p2  /dev/nbd0p3
# qemu-nbd -d /dev/nbd0
/dev/nbd0 disconnected
# ls /dev/nbd0*        
/dev/nbd0

The trickiest part is the qemu-nbd command (so not very tricky at all).

The “-r” option specifies that the disk image should be connected read-only, which seems to be sensible when you’re working with a disk image that “belongs” to another machine. Obviously if you need to write to the disk image then you should drop the “-r” (but consider cloning or taking a snapshot).

The “-c” option connects the disk image to a specific device and the “-d” option disconnects the specific device.

Old Metal 2
Feb 242019
 

Normally when you set an IP address manually on an interface you do not get a whole lot of choice of how it is done – very often you have to specify the IP address itself and a network mask. The addresses and masks are almost always specified as “dotted quads” (10.0.0.1) rather than the real address in binary or decimal (167772161).

The network mask specifies what parts of the IP address are the network address and which are the host address – to determine whether a destination needs to go via a gateway or is on the local network. This is expressed as a bitmask like 255.255.255.0. Having said that, rarely some devices (Cisco routers in the dustier parts of their code) require the reverse – 0.0.0.255.

An alternative approach is to use the CIDR format to specify both the IP address of the device and the size of the network – 10.2.9.21/24. This is used (at least) on Palo Alto Networks firewalls and is probably the simplest way of configuring a network address I have come across.

Having configured hundreds of devices with static addresses … and helped solve oodles of network configuration issues, I feel that the CIDR format method is likely to be far less error prone.

If you do need to set a netmask, use ipcalc to check what it is (and use it to cut&paste rather than risk typos) :-

✓ mike@pica» ipcalc 10.2.9.21/24 
Address:   10.2.9.21            00001010.00000010.00001001. 00010101
Netmask:   255.255.255.0 = 24   11111111.11111111.11111111. 00000000
Wildcard:  0.0.0.255            00000000.00000000.00000000. 11111111
=>
Network:   10.2.9.0/24          00001010.00000010.00001001. 00000000
HostMin:   10.2.9.1             00001010.00000010.00001001. 00000001
HostMax:   10.2.9.254           00001010.00000010.00001001. 11111110
Broadcast: 10.2.9.255           00001010.00000010.00001001. 11111111
Hosts/Net: 254                   Class A, Private Internet
Through The Gateway
Feb 092019
 

One of the things that irritates me about fancy new service management systems like systemd is that unless you get everything exactly right, you can end up with things interfering with specific configuration files – specifically /etc/resolv.conf.

Now as a DNS administrator, I have a certain fondness for manually controlling /etc/resolv.conf and it does actually come in useful for making temporary changes to test specific DNS servers and the like. The trouble comes when something else wants to control that file.

The ideal fix for this conflict is to have things like systemd control a separate file such as /etc/system/resolv.conf.systemd, and for /etc/resolv.conf be installed as a symbolic link pointing at the real file.

But back in the real world, if you do disable systemd-resolver which can be done with: systemctl disable systemd-resolved.service; systemctl stop systemd-resolved.service

Then you may also want to make the file immutable: chattr +i /etc/resolv.conf. On at least one server, systemd merrily re-created /etc/resolv.conf as a symbolic link to an empty file despite systemd-resolved being disabled.

Corner Of The Pyramid
Oct 032018
 

I have a Python script that over-simplifying, reads very large log files and runs a whole bunch of regular expressions on each line. As it had started running inconveniently slowly, I had a look at improving the performance.

The conventional wisdom is that if you are reading a file (or standard input), then the simplest method is probably almost always the fastest :-

for line in logstream:
    processline(line)

But being stubborn, I looked at possible improvements and came up with :-

from itertools import islice
    
while True:
    buffer = list(islice(logstream, islicecount))
    if buffer != []:
        for line in buffer:
             processline(line)
    else:
        break

This code has been updated twice because the first version added a splat to the output and the second version (which was far more elegant) didn’t work. The final version 

This I benchmarked as being nearly 5% quicker – not bad, but nowhere near enough for my purposes.

The next step was to improve the regular expressions – I read somewhere that .* can be expensive and that [^\s]* was far quicker and often gave the same result. I replaced a number of .* occurrences in the “patterns” file and re-ran the benchmark to find (in a case with lots of regular expressions) the time had dropped nearly 25%.

The last step was to install nuitka to compile the Python script into a binary executable. This showed a further 25% drop – a script that started the day taking 15 minutes to run through one particular run ended the day taking just under 8 minutes.

The funny thing is that the optimisation that took the longest and had the biggest effect on the code showed the smallest improvement!

Four Posts
Aug 092018
 

Well that was a weird error; I recently discovered that ntpd had mysteriously stopped working; specifically it was not able to resolve NTP “pool” names :-

ntpd: error resolving pool europe.pool.ntp.org: Name or service not known (-2)

After some time spent blundering around down dead ends with the help of an appropriate search engine, I ended up resorting to strace. This is a tool most commonly used by developers but can be surprisingly useful for diagnosing system problems too.

As long as you can look past all the inscrutable output!

The strace tool runs a command and records every system call that the command calls together with the results. And of course most commands make zillions of system calls so you’re likely to end up with a huge output file.

To generate the output file, I ran the modern equivalent of ntpdate (ntpd -d) which tries to do the same thing using the actual NTP daemon. Usefully in this case because the command starts, configures itself (which is where the error occurs), and then exits (unlike the normal dæmon). It is important to redirect the output to have a file to trawl through later :-

strace ntpd -d > /var/tmp/ntpd.strace 2>&1

Once the output was generated, it was necessary to trawl through it to look for clues. The first thing was to search for “europe” (as I use europe.pool.ntp.org as one of my NTP servers). The first occurrence was the error claiming that the name didn’t exist :-

write(2, "error resolving pool europe.pool"..., 73error resolving pool europe.pool.ntp.org: Name or service not known (-2)

Which was somewhat odd because you would expect the string “europe” to occur within an instructable attempt resolve the name. Yet it appears as though the error occurs without any attempt to resolve the name!

As a bit of a guess I searched for “resolv.conf” which revealed :-

stat("/etc/resolv.conf", {st_mode=S_IFREG|0644, st_size=362, ...}) = 0
openat(AT_FDCWD, "/etc/resolv.conf", O_RDONLY|O_CLOEXEC) = -1 EACCES (Permission denied)

Apparently ntpd is unable to open the file due to a permissions problem!

Looking at my /etc/resolv.conf revealed an oddity dating back to when I tried configuring /etc/resolv.conf as a symbolic link to a file on a separate file system. The file itself was a symbolic link to /etc/resolv.conf.file.

For some reason ntpd didn’t like the symbolic link, which is a bit odd but changing it to an ordinary file fixed the problem.