Sep 25 2013
 

If you suspect a networking problem, how do you go about diagnosing that problem?

As in all problem solving, the process involves gathering information and performing tests. To adequately perform some of the tests, you need to prepare in advance – by obtaining copies of tools, creating a USB stick with the tools on, finding out how to use the tools, etc. You cannot expect to be able to perform anything useful without investing in that preparation time.

As an alternative to preparing a USB stick full of tools, it may be preferable to prepare a netbook with the tools on – at the very least swapping a network connection to a known working netbook will tell you whether the problem is in the computer or in the network!

Get The MAC Man!

The MAC address of the network connection is probably the single most important bit of information to get your hands on, because it is the key for obtaining lots of other information – whether DHCP requests are being seen, whether the Ethernet switch can see that MAC address on any of its ports (or on the expected port), etc. If you report a network issue without the MAC address of the machine in question, someone will bang their head on the desk. If you are locked out of the machine because the network “isn’t working”, and so are unable to run the usual tools to get at the MAC address, report that as a fault.

Obtaining the MAC address varies according to the operating system you want to get it from, and the method you choose to use to get it. I have chosen to document a command-line method; if this makes you unhappy, please feel free to document the graphical way, and I’ll add a link to it. In some cases, you will be choosing which MAC address is relevant to the active network card. If in any doubt, get all the MAC addresses, and suggest which one you think is the active network card; if it turns out you have guessed wrong, at least the right one will be in the list somewhere!

Windows

Start a command line, and run ipconfig /all :-

C:\Users\msm>ipconfig/all

Windows IP Configuration

   Host Name . . . . . . . . . . . . : w7
   Primary Dns Suffix  . . . . . . . :
   Node Type . . . . . . . . . . . . : Peer-Peer
   IP Routing Enabled. . . . . . . . : No
   WINS Proxy Enabled. . . . . . . . : No
   DNS Suffix Search List. . . . . . : single-names.port.ac.uk
                                       iso.port.ac.uk
                                       eps.is.port.ac.uk

Ethernet adapter Local Area Connection:

   Connection-specific DNS Suffix  . : inside.zonky.org
   Description . . . . . . . . . . . : Intel(R) PRO/1000 MT Desktop Adapter
   Physical Address. . . . . . . . . : 08-00-27-84-0A-B4
   DHCP Enabled. . . . . . . . . . . : Yes
   Autoconfiguration Enabled . . . . : Yes
   IPv4 Address. . . . . . . . . . . : 10.0.2.15(Preferred)
   Subnet Mask . . . . . . . . . . . : 255.255.255.0
   Lease Obtained. . . . . . . . . . : 15 September 2013 12:02:40
   Lease Expires . . . . . . . . . . : 16 September 2013 12:02:40
   Default Gateway . . . . . . . . . : 10.0.2.2
   DHCP Server . . . . . . . . . . . : 10.0.2.2
   DNS Servers . . . . . . . . . . . : 10.0.0.26
   NetBIOS over Tcpip. . . . . . . . : Enabled

Tunnel adapter Local Area Connection* 9:

   Connection-specific DNS Suffix  . :
   Description . . . . . . . . . . . : Teredo Tunneling Pseudo-Interface
   Physical Address. . . . . . . . . : 00-00-00-00-00-00-00-E0
   DHCP Enabled. . . . . . . . . . . : No
   Autoconfiguration Enabled . . . . : Yes
   IPv6 Address. . . . . . . . . . . : 2001:0:9d38:953c:2c67:1675:f5ff:fdf0(Preferred)
   Link-local IPv6 Address . . . . . : fe80::2c67:1675:f5ff:fdf0%11(Preferred)
   Default Gateway . . . . . . . . . : ::
   NetBIOS over Tcpip. . . . . . . . : Disabled

Tunnel adapter isatap.inside.zonky.org:

   Media State . . . . . . . . . . . : Media disconnected
   Connection-specific DNS Suffix  . : inside.zonky.org
   Description . . . . . . . . . . . : Microsoft ISATAP Adapter #2
   Physical Address. . . . . . . . . : 00-00-00-00-00-00-00-E0
   DHCP Enabled. . . . . . . . . . . : No
   Autoconfiguration Enabled . . . . : Yes

Windows is being “helpful” here and listing all of the network adapters it knows of, including the ones that are not plugged in. To find the address we want, we look for the “Ethernet adapter Local Area Connection” section, and within that look for the “Physical Address”, which is given here as 08-00-27-84-0A-B4.

Linux and OSX

Linux and OSX are pretty similar at this level – with the exception that Linux calls Ethernet devices ethN (usually) and OSX calls ’em enN, the command and output are pretty much the same.

Again, start a command-line interface and run the command ifconfig :-

% ifconfig
eth0      Link encap:Ethernet  HWaddr 60:a4:4c:62:84:71  
          inet addr:10.0.0.28  Bcast:10.0.255.255  Mask:255.255.0.0
          inet6 addr: fe80::62a4:4cff:fe62:8471/64 Scope:Link
          inet6 addr: 2001:8b0:ca2c:dead::babe/64 Scope:Global
          UP BROADCAST RUNNING MULTICAST  MTU:1492  Metric:1
          RX packets:170663945 errors:0 dropped:0 overruns:0 frame:0
          TX packets:183200664 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:122771869945 (114.3 GiB)  TX bytes:170314898179 (158.6 GiB)
          Interrupt:73 Base address:0x2000 

ib0       Link encap:UNSPEC  HWaddr 80-00-00-48-FE-80-00-00-00-00-00-00-00-00-00-00  
          inet addr:10.255.0.1  Bcast:10.255.0.255  Mask:255.255.255.0
          inet6 addr: fe80::21a:4bff:ff0c:e1c5/64 Scope:Link
          inet6 addr: 2001:8b0:ca2c:d00d::1/64 Scope:Global
          UP BROADCAST RUNNING MULTICAST  MTU:4096  Metric:1
          RX packets:6037892 errors:0 dropped:0 overruns:0 frame:0
          TX packets:12155324 errors:0 dropped:3079 overruns:0 carrier:0
          collisions:0 txqueuelen:256 
          RX bytes:314594872 (300.0 MiB)  TX bytes:21890697854 (20.3 GiB)

ib1       Link encap:UNSPEC  HWaddr 80-00-00-49-FE-80-00-00-00-00-00-00-00-00-00-00  
          inet addr:10.255.1.1  Bcast:10.255.1.255  Mask:255.255.255.0
          inet6 addr: fe80::21a:4bff:ff0c:e1c6/64 Scope:Link
          inet6 addr: 2001:8b0:ca2c:d00f::1/64 Scope:Global
          UP BROADCAST RUNNING MULTICAST  MTU:2044  Metric:1
          RX packets:4466937 errors:0 dropped:0 overruns:0 frame:0
          TX packets:429108 errors:0 dropped:47 overruns:0 carrier:0
          collisions:0 txqueuelen:256 
          RX bytes:232871358 (222.0 MiB)  TX bytes:17179018366 (15.9 GiB)

lo        Link encap:Local Loopback  
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:65309 errors:0 dropped:0 overruns:0 frame:0
          TX packets:65309 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:4625183 (4.4 MiB)  TX bytes:4625183 (4.4 MiB)

This is an unusually complex configuration, but the MAC address can be picked out relatively easily. Just look for the ethN (here it’s “eth0”) and pick out the “HWaddr” which is 60:a4:4c:62:84:71 in this example.
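If you end up collecting this output from a lot of machines, picking the addresses out by eye gets tedious. The following is a minimal sketch (the function name is my own invention) that pulls six-octet MAC addresses out of either style of output and normalises them to one format. It deliberately ignores longer hardware addresses such as the tunnel adapter and InfiniBand ones shown above.

```python
import re

# Exactly six hex octets separated by ':' or '-'. The lookarounds reject
# longer runs, such as the 8-octet tunnel adapter address and the 20-octet
# InfiniBand addresses in the listings above.
MAC_RE = re.compile(
    r"(?<![0-9A-Fa-f:-])"
    r"([0-9A-Fa-f]{2}(?:[:-][0-9A-Fa-f]{2}){5})"
    r"(?![0-9A-Fa-f:-])"
)


def find_mac_addresses(text):
    """Return every six-octet MAC address in text as lower-case aa:bb:cc:dd:ee:ff."""
    return [m.group(1).replace("-", ":").lower() for m in MAC_RE.finditer(text)]
```

Feed it the saved output of ipconfig /all or ifconfig; if more than one address comes back, you still have to suggest which one belongs to the active network card.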

Network Sniffing with Wireshark

Wireshark is a premium grade packet sniffer and packet analysis tool; it’s a tool that really justifies a complete book. But you can get quite a bit done with far less knowledge.

The absolute basics are how to capture packets. This should be fairly easy to accomplish from the graphical interface – it’s pretty much a case of picking a network interface to capture from, and clicking “Start”. Once you have captured 30 seconds or so of traffic, click the red cross, and save the result. All done.


Warning: Contains an enthusiastic American! 

Detailing exactly what you might see in a packet capture is definitely beyond the scope of this blog entry, but there are basically three different kinds of packets you should see :-

  1. Packets sent by your machine. Until you get to more advanced levels, these contain very little in the way of useful information.
  2. Packets sent to your machine. That is they are addressed specifically with your machine in mind. These are also to be ignored at this level.
  3. Finally packets sent out in broadcast mode – to every machine on the network.

The final category can tell you on which network you are … if you are connected to some kind of “special” private network, it is to be expected that an ordinary PC (or Mac) won’t work properly. If you look at enough examples of packet captures, it should eventually become evident what packets are broadcast ones, and what the contents of those packets mean :-

# tshark -i eth0.24 arp 
tshark: Lua: Error during loading:
 [string "/usr/share/wireshark/init.lua"]:45: dofile has been disabled
Running as user "root" and group "root". This could be dangerous.
Capturing on eth0.24
  0.000000 84:78:ac:19:64:41 -> Broadcast    ARP 60 Who has 148.197.24.2?  Tell 148.197.24.252

The packet in question is on the last line. It’s an ARP packet where a machine is asking if anyone knows the Ethernet address of 148.197.24.2 … which is a pretty good indication you are connected to that network.
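The three kinds of packets described above can be told apart from nothing more than the source and destination Ethernet addresses. A minimal sketch, assuming you already know your own MAC address; the function names and the category labels are mine, and anything that fits none of the three (multicast, for instance) is simply lumped in as “other”.

```python
BROADCAST = "ff:ff:ff:ff:ff:ff"


def normalise(mac):
    """Lower-case a MAC address and use ':' separators; map tshark's 'Broadcast' name."""
    if mac.lower() == "broadcast":
        return BROADCAST
    return mac.replace("-", ":").lower()


def classify_frame(src, dst, my_mac):
    """Place a captured frame into one of the three categories listed above."""
    me = normalise(my_mac)
    if normalise(src) == me:
        return "sent by this machine"
    if normalise(dst) == me:
        return "sent to this machine"
    if normalise(dst) == BROADCAST:
        return "broadcast"
    return "other"
```

For the ARP packet in the capture above, classify_frame("84:78:ac:19:64:41", "Broadcast", "60:a4:4c:62:84:71") lands in the broadcast category, which is the one worth reading at this level.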

A Better Ping

The standard ping tool is very useful, but it has a couple of big disadvantages :-

  1. Machines with an aggressive firewall may not permit ICMP (i.e. ping) packets through. In which case they do not respond to standard pings.
  2. Because ping uses ICMP packets, it is subject to the lower priority that ICMP packets have … in the event of an overloaded network, routers and switches will prefer to drop ICMP to keep TCP and UDP packets flowing. This can result in a false impression of the network reliability.

Because of this, there have been a variety of different tools that accomplish the same sort of thing as ping by using either TCP or UDP (or even ICMP) packets. The latest and greatest of these tools is nping which is part of the nmap series of tools, and is available for just about every platform (including Windows). The default for nping is to send just 5 packets :-

# nping --tcp -p 22 10.0.0.28

Starting Nping 0.6.25 ( http://nmap.org/nping ) at 2013-09-23 20:56 BST
SENT (0.0058s) TCP 10.0.0.26:18384 > 10.0.0.28:22 S ttl=64 id=46091 iplen=40  seq=3907311816 win=1480 
RCVD (0.0062s) TCP 10.0.0.28:22 > 10.0.0.26:18384 SA ttl=64 id=0 iplen=44  seq=4059830626 win=14520 
SENT (1.0060s) TCP 10.0.0.26:18384 > 10.0.0.28:22 S ttl=64 id=46091 iplen=40  seq=3907311816 win=1480 
RCVD (1.0066s) TCP 10.0.0.28:22 > 10.0.0.26:18384 SA ttl=64 id=0 iplen=44  seq=4075461628 win=14520 
SENT (2.0070s) TCP 10.0.0.26:18384 > 10.0.0.28:22 S ttl=64 id=46091 iplen=40  seq=3907311816 win=1480 
RCVD (2.0075s) TCP 10.0.0.28:22 > 10.0.0.26:18384 SA ttl=64 id=0 iplen=44  seq=4091100198 win=14520 
SENT (3.0080s) TCP 10.0.0.26:18384 > 10.0.0.28:22 S ttl=64 id=46091 iplen=40  seq=3907311816 win=1480 
RCVD (3.0084s) TCP 10.0.0.28:22 > 10.0.0.26:18384 SA ttl=64 id=0 iplen=44  seq=4106740813 win=14520 
SENT (4.0090s) TCP 10.0.0.26:18384 > 10.0.0.28:22 S ttl=64 id=46091 iplen=40  seq=3907311816 win=1480 
RCVD (4.0094s) TCP 10.0.0.28:22 > 10.0.0.26:18384 SA ttl=64 id=0 iplen=44  seq=4122381613 win=14520 

Max rtt: 0.451ms | Min rtt: 0.259ms | Avg rtt: 0.342ms
Raw packets sent: 5 (200B) | Rcvd: 5 (230B) | Lost: 0 (0.00%)
Tx time: 4.00449s | Tx bytes/s: 49.94 | Tx pkts/s: 1.25
Rx time: 4.00476s | Rx bytes/s: 57.43 | Rx pkts/s: 1.25
Nping done: 1 IP address pinged in 4.01 seconds

The key information is displayed at the end … specifically the Max rtt (round trip time) which tells you how long it took for the slowest “conversation” to take place, and the “Lost” count of the number of packets lost. There are zillions of options to nping, but some of the most important include :-

Option        Description
--tcp         Use TCP probe mode, which is probably the preferred mode for testing
-p N          Specify the destination port to probe. This can be either open (i.e. a service is running) or closed, but not firewalled.
--count N     Tell nping how many packets to send. Increasing this can make the test much longer.
--delay Nms   How long to wait between each packet. Always specify “ms” as a suffix to give you milliseconds. A delay of about 50ms is reasonable.

There’s a great deal more to nping than just this, but it’s a start.
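If nping is not to hand, a rough equivalent can be knocked together with nothing but the standard library. This is a sketch of the general idea and not a replacement for nping: it times a full TCP connection rather than sending a raw SYN, so it only gives sensible answers against an open port (a closed port is counted as lost here). The function names and defaults are my own choices.

```python
import socket
import time


def tcp_connect_times(host, port, count=5, delay=1.0, timeout=2.0):
    """Time `count` TCP connections to host:port.

    Returns one entry per attempt: the connect time in milliseconds,
    or None if the attempt failed or timed out.
    """
    results = []
    for attempt in range(count):
        start = time.perf_counter()
        try:
            with socket.create_connection((host, port), timeout=timeout):
                results.append((time.perf_counter() - start) * 1000.0)
        except OSError:
            results.append(None)
        if attempt < count - 1:
            time.sleep(delay)
    return results


def summarise(results):
    """Return (max_ms, min_ms, avg_ms, lost), in the spirit of nping's summary."""
    ok = [r for r in results if r is not None]
    lost = len(results) - len(ok)
    if not ok:
        return None, None, None, lost
    return max(ok), min(ok), sum(ok) / len(ok), lost
```

Something like summarise(tcp_connect_times("10.0.0.28", 22)) gives the same sort of numbers as the last few lines of the nping output, although the timings include a little more overhead.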

How Fast? How Slow?

Does the network connection feel slow? Just how slow does it feel? Measure it.

It is not uncommon to find that a network performance issue is actually a performance issue of some other kind. Measuring the network performance can tell you whether it really is the network, or something else. To do so, you need the right tool; measuring with the wrong tool can result in very inaccurate measurements.

Often people resort to using ftp to transfer large files back and forth, which works well enough in normal circumstances, but at higher network speeds you can find yourself measuring the speed of a slow hard disk rather than the network performance. So use the right tool – such as iperf which is available for all major platforms including Windows.

There is an additional tool available for Windows which offers a graphical interface, but I am describing the command-line interface. Partially because that’s the way I am, but partially because it is dead simple.

To run iperf you need to have the software installed on a client machine and a server machine. To run on a server, simply :-

$ iperf -p 32765 -s

Specifying the port number isn’t normally necessary, but I suggest choosing a random port around 32,000 to avoid conflicts. Just remember the port number you use! And on the client :-

% iperf -p 32765 -c polio
------------------------------------------------------------
Client connecting to polio, TCP port 5001
TCP window size: 23.4 KByte (default)
------------------------------------------------------------
[  3] local 10.0.0.28 port 36114 connected with 10.0.0.26 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-10.0 sec   113 MBytes  95.0 Mbits/sec
% iperf -p 32765 -c 10.255.0.2
------------------------------------------------------------
Client connecting to 10.255.0.2, TCP port 5001
TCP window size: 28.8 KByte (default)
------------------------------------------------------------
[  3] local 10.255.0.1 port 43978 connected with 10.255.0.2 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-10.0 sec  2.73 GBytes  2.35 Gbits/sec

I have admittedly cheated here by running two tests … to show what a normal 100Mbps ethernet speed should look like, and what something a bit quicker would look like. In the latter case, I have used a slow InfiniBand connection that is only about 2.5 times quicker than 1000Mbps ethernet. Bear in mind that :-

  1. Ethernet signals work at 100Mbps or 1000Mbps (or faster for more esoteric Ethernet types), but you won’t get to that speed.

  2. You need to baseline a performance test to find out what normal speeds look like!
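The “Bandwidth” column in the iperf output is just the “Transfer” column divided by the interval. A small sketch of the arithmetic, assuming (as the figures above suggest) that iperf counts bytes in binary multiples and bits in decimal ones; the function name is mine.

```python
def iperf_bandwidth(transfer, unit, seconds):
    """Convert an iperf 'Transfer' figure into bits per second.

    Assumes binary byte units (1 MByte = 2**20 bytes), which is what makes
    the numbers in the output above agree with each other.
    """
    multipliers = {"KBytes": 2**10, "MBytes": 2**20, "GBytes": 2**30}
    return transfer * multipliers[unit] * 8 / seconds
```

With the figures from the two runs above, 113 MBytes in 10 seconds works out at roughly 94.8 Mbits/sec, and 2.73 GBytes in 10 seconds at roughly 2.35 Gbits/sec, which matches what iperf reported to within rounding.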
Aug 20 2013
 

Every so often I come across an old Linux box that doesn’t take kindly to being rebooted. Without console access, it is hard to see what is going on, but the Linux kernel gets stuck trying to mount the root file system. There are many possible fixes for this, but they all have one thing in common … a work-around has to be performed to get the box up and running.

The console gets stuck in a “mini-root” environment loaded when the initrd image is loaded and before the real root file system is mounted which means a lot of commands are not available, but lvm should be available. First of all, run lvm lvscan to get a list of the logical volumes that need activating :-

(initramfs) lvm lvscan
  inactive          '/dev/sys/root' [332.00 MiB] inherit
  inactive          '/dev/sys/usr' [8.38 GiB] inherit
  inactive          '/dev/sys/var' [2.79 GiB] inherit
  inactive          '/dev/sys/swap_1' [7.05 GiB] inherit
  inactive          '/dev/sys/tmp' [380.00 MiB] inherit
  inactive          '/dev/sys/home' [16.00 GiB] inherit
  inactive          '/dev/sys/opt' [24.00 GiB] inherit

For each volume group (the second column, middle word), run: lvm vgchange -ay ${volume-group-name}. In the case of my example :-

(initramfs) lvm vgchange -ay /dev/sys
  7 logical volume(s) in volume group "sys" now active

At which point you should be able to press ^D (or enter exit) to continue the boot process.
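Picking out the volume group names is easy enough by eye, but if you have saved the lvscan output somewhere (a console log, say), the rule can be written down as a short sketch. The function name is my own, and it assumes the output looks like the example above, with the state in the first column and the device path in quotes.

```python
import re

# Matches lines such as:   inactive          '/dev/sys/root' [332.00 MiB] inherit
LV_RE = re.compile(r"^\s*(ACTIVE|inactive)\s+'/dev/([^/']+)/([^/']+)'")


def inactive_volume_groups(lvscan_output):
    """Return the volume groups that contain at least one inactive logical volume."""
    groups = []
    for line in lvscan_output.splitlines():
        match = LV_RE.match(line)
        if match and match.group(1) == "inactive" and match.group(2) not in groups:
            groups.append(match.group(2))
    return groups
```

For the listing above this returns just one name, sys, which is the volume group to hand to lvm vgchange -ay.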

A slightly better work-around involves changing the Grub configuration to add a delay to the kernel parameters. This section assumes that you are not using Grub Legacy!

Start by editing /etc/default/grub and changing the variable GRUB_CMDLINE_LINUX to include “rootdelay=20” :-

GRUB_CMDLINE_LINUX='console=tty0 console=ttyS0,19200n8 rootdelay=20'

Finalise by running update-grub. This adds a 20s delay to the boot process so is hardly an ideal solution.
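If you have more than a handful of boxes to fix, the edit can be scripted. The following is a rough sketch of the string handling only (the function name is invented, and it does not read or write /etc/default/grub itself); it adds a parameter to a GRUB_CMDLINE_LINUX line unless it is already there.

```python
import re


def add_kernel_parameter(line, parameter):
    """Append parameter to a GRUB_CMDLINE_LINUX='...' line if it is missing.

    Lines that do not look like a quoted GRUB_CMDLINE_LINUX assignment are
    returned unchanged.
    """
    match = re.match(r"""^(GRUB_CMDLINE_LINUX=)(['"])(.*)\2\s*$""", line)
    if match is None:
        return line
    prefix, quote, value = match.groups()
    words = value.split()
    if parameter not in words:
        words.append(parameter)
    return f"{prefix}{quote}{' '.join(words)}{quote}"
```

Running it twice over the same line is harmless, which matters if the script gets run again after the next upgrade. You still need to run update-grub afterwards.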

Aug 19 2013
 

No.

Anyone who thinks so needs to read a bit of history on what life was like in real police states.

But on a day when news has broken of an incident in which a journalist was detained for 9 hours and had his electronic media confiscated, we do have to ask ourselves whether we are headed in that direction. And whether we really want to go in that direction.

David Miranda was held under anti-terrorist legislation – specifically schedule 7 – in what was clearly an attempt at harassment for publishing stories embarrassing the UK and US governments. Now the victim here is clearly a journalist, and whilst it is possible for a journalist to be involved in terrorism, I really rather doubt this one has time to be particularly active at this time. This is a high profile case, but how many of the 61,145 other suspects detained under schedule 7 last year were detained for non-terrorism purposes?

Anti-terrorism legislation is very powerful, and whilst it may be justified to tackle terrorism, it certainly must not be used for other purposes. And in this case it was.

And undoubtedly we will have some sort of review of the case, a lot of noise, and very little action. It’s almost certain that the police who detained David Miranda will escape scot free, or with a notional slap on the wrist, and not with a prison sentence that they deserve.

Jul 29 2013
 

… which is of course massive overkill. But fun. It should increase the raw bandwidth available between the two machines from 1Gbps to 20Gbps (with one link) and 40Gbps with both links bonded.

It was a bit of a surprise to me when I looked around at prices of second-hand kit to realise that InfiniBand was so much cheaper to acquire than Fibre Channel; the kit I acquired cost less than £100 all in whereas FC kit would be in the region of £1,000, and InfiniBand is generally quicker. There is of course 16Gb FC and 10Gb InfiniBand, but that is hardly comparing like with like.

So what is this overkill for? Networking of course. I’ve acquired two HP InfiniBand dual link cards which means I can connect my workstation to my server :-

[Figure: InfiniBand Network]

Using dual links is of course overkill on top of overkill, but given that these cards have dual links, why not use them? And it does give a couple of experiments to try later.

To prepare in advance, the following network addresses will be used :-

Server  Link Number  IPv4 Address   IPv6 Address
A       1            10.255.0.1     AAISP:d00d::1
A       2            10.255.1.1     AAISP:d00f::1
B       1            10.255.0.254   AAISP:d00d:2
B       2            10.255.1.254   AAISP:d00f:2

Yes I have cheated for the IPv6 addresses! The first step is to configure each “server” … one is running Debian Linux, and the other is running FreeBSD.
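Before touching any configuration files, it is worth a quick sanity check that the two ends of each link really are in the same network. A minimal sketch using Python’s ipaddress module; the /24 prefix is my assumption (it is not stated in the address plan itself) and the function name is invented.

```python
import ipaddress

# IPv4 addresses from the plan above: link number -> (server A, server B)
LINKS = {
    1: ("10.255.0.1", "10.255.0.254"),
    2: ("10.255.1.1", "10.255.1.254"),
}


def same_subnet(a, b, prefix=24):
    """True if addresses a and b fall inside the same network of the given prefix length."""
    network = ipaddress.ip_network(f"{a}/{prefix}", strict=False)
    return ipaddress.ip_address(b) in network
```

Each pair in LINKS should pass, and an address from link 1 paired with one from link 2 should fail; if it does not, the two links would end up sharing a network, which makes later experiments harder to reason about.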

Configuring Linux

This was subject to much delay whilst I believed that I had a problem with the InfiniBand card, but putting the card into a new desktop machine caused it to spring back to life. Either some sort of incompatibility with my old desktop (which was quite old), or some sort of problem with the BIOS settings.

Inserting the card should load the core module (mlx4_core) automatically, and spit out messages similar to the following :-

[    3.678189] mlx4_core 0000:07:00.0: irq 108 for MSI/MSI-X
[    3.678195] mlx4_core 0000:07:00.0: irq 109 for MSI/MSI-X
[    3.678199] mlx4_core 0000:07:00.0: irq 110 for MSI/MSI-X
[    3.678204] mlx4_core 0000:07:00.0: irq 111 for MSI/MSI-X
[    3.678208] mlx4_core 0000:07:00.0: irq 112 for MSI/MSI-X
[    3.678212] mlx4_core 0000:07:00.0: irq 113 for MSI/MSI-X
[    3.678216] mlx4_core 0000:07:00.0: irq 114 for MSI/MSI-X
[    3.678220] mlx4_core 0000:07:00.0: irq 115 for MSI/MSI-X
[    3.678223] mlx4_core 0000:07:00.0: irq 116 for MSI/MSI-X
[    3.678228] mlx4_core 0000:07:00.0: irq 117 for MSI/MSI-X
[    3.678232] mlx4_core 0000:07:00.0: irq 118 for MSI/MSI-X
[    3.678236] mlx4_core 0000:07:00.0: irq 119 for MSI/MSI-X
[    3.678239] mlx4_core 0000:07:00.0: irq 120 for MSI/MSI-X
[    3.678243] mlx4_core 0000:07:00.0: irq 121 for MSI/MSI-X
[    3.678247] mlx4_core 0000:07:00.0: irq 122 for MSI/MSI-X
[    3.678250] mlx4_core 0000:07:00.0: irq 123 for MSI/MSI-X
[    3.678254] mlx4_core 0000:07:00.0: irq 124 for MSI/MSI-X
[    3.678259] mlx4_core 0000:07:00.0: irq 125 for MSI/MSI-X
[    3.678263] mlx4_core 0000:07:00.0: irq 126 for MSI/MSI-X
[    3.678267] mlx4_core 0000:07:00.0: irq 127 for MSI/MSI-X
[    3.678271] mlx4_core 0000:07:00.0: irq 128 for MSI/MSI-X
[    3.678275] mlx4_core 0000:07:00.0: irq 129 for MSI/MSI-X

This is just the core driver; at this point additional modules are needed to do anything useful. You can manually load the modules with modprobe but sooner or later it is better to make sure they’re loaded automatically by adding their names to /etc/modules. The modules you want to load are :-

  1. mlx4_ib
  2. ib_umad
  3. ib_uverbs
  4. ib_ipoib

This is a minimal set necessary for networking (“IP”) rather than additional features such as SCSI. It’s generally better to start with a minimal set of features initially. At this point, it is generally a good idea to reboot to verify that things are getting closer. After a reboot, you should have one or more new network interfaces listed by ifconfig :-

ib0       Link encap:UNSPEC  HWaddr 80-00-00-48-FE-80-00-00-00-00-00-00-00-00-00-00  
          UP BROADCAST RUNNING MULTICAST  MTU:2044  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:256 
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

ib1       Link encap:UNSPEC  HWaddr 80-00-00-49-FE-80-00-00-00-00-00-00-00-00-00-00  
          UP BROADCAST RUNNING MULTICAST  MTU:2044  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:256 
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)
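If the ib interfaces fail to appear, the first thing to check is whether the four modules listed above actually loaded. A small sketch that checks the contents of /proc/modules; the function name is mine, and it assumes the drivers are built as loadable modules (anything compiled into the kernel will not be listed there).

```python
REQUIRED = ("mlx4_ib", "ib_umad", "ib_uverbs", "ib_ipoib")


def missing_modules(proc_modules_text, required=REQUIRED):
    """Return the required modules that are not listed in /proc/modules text."""
    loaded = {
        line.split()[0]
        for line in proc_modules_text.splitlines()
        if line.strip()
    }
    return [name for name in required if name not in loaded]
```

On the Linux box itself, missing_modules(open("/proc/modules").read()) should come back as an empty list once /etc/modules has been set up correctly.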

Despite the appearance, we still have quite a way to go yet. The next step is to install some additional packages: ibutils, infiniband-diags, and opensm. The last package is for a subnet manager which is unnecessary if you have an InfiniBand switch (but I don’t). The first step is to get opensm up and running. Edit /etc/default/opensm and change the PORTS variable to “ALL” (unless you want to restrict the managed ports, and make things more complicated). And start opensm: /etc/init.d/opensm start; update-rc.d opensm defaults.

At this point, you can configure the network addresses by editing /etc/network/interfaces. If you need help doing this, then you’re in the tech pool beyond your depth! Without something at the other end, these interfaces won’t work (obviously), so it’s time to start work on the other end …

Configuring FreeBSD

See: https://wiki.freebsd.org/InfiniBand

I hadn’t had cause to build a custom kernel before, so the very first task was to use subversion to checkout a copy of the FreeBSD source code :-

svn co svn://svn0.us-east.FreeBSD.org/base/stable/9 /usr/src

Updating will of course require just: cd /usr/src && svn update. Once installed, create a symlink from /sys to /usr/src/sys if the link does not already exist: ln -s /usr/src/sys /sys

Go to the kernel configuration directory (/usr/src/sys/amd64/conf), copy the GENERIC configuration file to a new file, and edit the new file to add in certain options :-

# Infiniband stuff (locally added)
options         OFED
options         IPOIB_CM
device          ipoib
device          mlx4ib

Again, this is a minimal set that will not offer full functionality … but should be enough to get IP networking up and running. The next step is to build and install the kernel :-

make buildkernel KERNCONF=${NAME-OF-YOUR-CONFIG}; make installkernel KERNCONF=${NAME-OF-YOUR-CONFIG}

The next step is to build the “world”  :-

  1. Edit /etc/src.conf and add “WITH_OFED='yes'” to that file.
  2. Change to /usr/src and run: make buildworld
  3. Finalise with make installworld

As it happens I had to build the user-land first, as the kernel compilation needed a new user-land feature.

After a reboot, the new network interface(s) should show up as ib0 upwards. And these can be configured with an address in exactly the same way as any other network interface.

Testing The Network

A tip for making sure the interfaces you think are connected together really are: configure one of the machines, send a broadcast ping to the relevant network address of each interface in turn, and run tcpdump on the other machine to verify that the packets coming down the wire match what you expect.

Below the level of IP, it is possible to run an InfiniBand ping to verify connectivity. First you need a GUID on “the server”, which can be obtained by running ibstat and looking for the “Port GUID”, which will be something like “0x0002c90200273985”. Next run ibping -S on the server.

Now on the other machine (“the client”), run ibping :-

# ibping -G 0x0002c90200273985
Pong from polio.inside.zonky.org (Lid 3): time 0.242 ms
Pong from polio.inside.zonky.org (Lid 3): time 0.153 ms
Pong from polio.inside.zonky.org (Lid 3): time 0.160 ms

The next step is to run an IP ping to one of the hosts. If that works, it is time to start looking at something that will do a reasonable attempt at a speed test.

This can be done in a variety of different ways, but I chose to use nttcp which is widely available. On one of the hosts, run nttcp -i to act as the “partner” (or server). On the sending server, run nttcp -T ${ip-address-to-test} which will give output something like :-

# nttcp -T 10.0.0.26
     Bytes  Real s   CPU s Real-MBit/s  CPU-MBit/s   Calls  Real-C/s   CPU-C/s
l  8388608    0.70    0.01     95.7975   5592.4053    2048   2923.51  170666.7
1  8388608    0.71    0.04     94.0667   1878.6950    5444   7630.87  152403.3

According to the documentation, the second line should begin with ‘r’, but for a simple speed test we can simply average the numbers in the “Real-MBit/s” to get an approximate speed. Oddly my gigabit ethernet seems to have mysteriously degraded to 100Mbps! At least it makes the InfiniBand speed slightly more impressive :-

# nttcp -T 10.255.0.2
     Bytes  Real s   CPU s Real-MBit/s  CPU-MBit/s   Calls  Real-C/s   CPU-C/s
l  8388608    0.03    0.00   2521.9415  16777.2160    2048  76963.55  512000.0
1  8388608    0.03    0.03   2206.6574   2568.6620    4032 132579.25  154329.0
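Averaging the “Real-MBit/s” column by hand is fine for one or two runs, but gets old quickly. A minimal sketch that does it from the saved output; the function name is my own, and it assumes the column layout shown in the two runs above.

```python
def average_real_mbit(nttcp_output):
    """Average the Real-MBit/s column over the data lines of nttcp output.

    Returns None if no data lines are found.
    """
    rates = []
    for line in nttcp_output.splitlines():
        fields = line.split()
        if len(fields) < 9:
            continue  # command line, blank line, or something unexpected
        try:
            float(fields[1])  # the Bytes column; fails on the header line
            rates.append(float(fields[4]))  # the Real-MBit/s column
        except ValueError:
            continue
    return sum(rates) / len(rates) if rates else None
```

For the Ethernet run this gives about 94.9 Mbit/s, and for the InfiniBand run about 2364 Mbit/s.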

Before getting into a panic over what appears to be a pretty poor result, it is worth bearing in mind that IP over InfiniBand isn’t especially efficient, and InfiniBand seems to suffer from marketing exaggeration. From what I understand, DDR’s 20Gbps signalling rate becomes 16Gbps, which in turn becomes 8.5Gbps when looking at the output of ibstatus (not ibstat) – why the halving here is a bit of a mystery, but that may become apparent later.
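The first step in that chain (20Gbps becoming 16Gbps) is what you would expect from 8b/10b line encoding, where every 8 bits of data are sent as 10 bits on the wire. A one-line sketch of the arithmetic, with an invented function name; it says nothing about where the further drop to 8.5Gbps comes from.

```python
def data_rate_gbps(signalling_gbps, payload_bits=8, line_bits=10):
    """Usable data rate once the line-encoding overhead has been removed."""
    return signalling_gbps * payload_bits / line_bits
```

So data_rate_gbps(20) comes out at 16.0, and the 2.2 to 2.5 Gbit/s measured above is a long way short of even that.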

There has also been a hint that FreeBSD is due for a significant improvement in InfiniBand performance sometime after the release of 9.2.

As a late addition, it would appear that running OpenSM (the subnet manager) on both hosts means that when one or other is rebooting, the other can take over the duties of the subnet manager. To enable on FreeBSD, simply add opensm_enable="YES" to the file /etc/rc.conf and reboot.

Jul 28 2013
 

Having recently assisted with getting my sister’s business web site online (the domain and DNS side of things), it occurred to me that many people assume that it is all the same thing. Which is most definitely not the case, and believing so leaves you open to being ripped off. It is not unknown for hosting companies to “make it easy” for you to perform the whole job of registering a domain, making the relevant DNS changes, and setting up your web site. Often by hiding as much of the detail as possible.

That would be fine if that were all they did, but it isn’t. Sometimes they go out of their way to imply that if you want to change hosting companies, then you have to get a new domain, and it can take quite a bit of digging to find out how to transfer the domain elsewhere. Now it is easy to think that this doesn’t matter too much, but the longer you use a domain, the more you want to stick with it. Especially if you have a blog site that is not entirely unpopular … an older domain with lots of content has value. And the more successful you are, the more likely you are to want to change hosting companies.

Other hosting companies may offer better value on web sites with lots of visitors, or perhaps you are blogging in a controversial area and need some sort of added protection against hackers, or you just “grow out” of a simple web site editing tool and want to get down and dirty with the HTML.

So it may well be worth your time finding out a little bit about this stuff in advance, and registering a domain separately to the web hosting. Or get a friendly geek to do so.

The Web Site

At their very simplest, web servers are nothing more than simple file servers. A web browser asks for an object (“give me http://zonky.org/index.html”), and the web server responds with the object (“It’s an HTML object, and here it is.”). Even on the most sophisticated web sites, the overwhelming majority of objects that make up the web page that you see are simple files. And when you graduate to advanced features such as server-side languages (PHP, Java, etc.), the conversation between the web browser and the web server is still a relatively simple one consisting mostly of simple requests for objects.

The key part of asking a web server for an object is that you need some sort of identifier for that object. This is known as the URL … Uniform Resource Locator, which could be called a “web address” (although URL is also used for non-web things). The key part (as far as we’re concerned here) is the host part of the URL. This performs two functions :-

  1. If it is not already a network address (IP address), then the web browser uses the DNS to get a network address. This is used to determine what web server to talk to.
  2. It is also included in the request to the web server. This allows the web server to distinguish between different “virtual servers”.

When it is not a network address, the host part of the URL is also known as the domain. Which brings us to the next topic.
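The two jobs that the host part does can be seen by pulling a URL apart. A minimal sketch using Python’s standard library; the function name is mine, and the request it builds is only the bare minimum of what a real browser would send.

```python
import ipaddress
from urllib.parse import urlsplit


def describe_url(url):
    """Return (host, needs_dns, request) for a URL.

    needs_dns is False when the host part is already a network address.
    request is a bare-bones HTTP request showing the host part being
    passed on to the web server in the Host header.
    """
    parts = urlsplit(url)
    host = parts.hostname
    try:
        ipaddress.ip_address(host)
        needs_dns = False
    except ValueError:
        needs_dns = True
    path = parts.path or "/"
    request = f"GET {path} HTTP/1.1\r\nHost: {parts.netloc}\r\n\r\n"
    return host, needs_dns, request
```

For http://zonky.org/index.html the host is zonky.org, a DNS lookup is needed, and the same name turns up again in the Host line of the request, which is how one web server can tell its “virtual servers” apart.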

The Domain

Domains and the DNS are very tightly associated, but in theory you could register a domain without having a DNS service. In practice it is rarely done … and almost always when you are running your own DNS service, in which case you will probably not be reading this!

But domains are distinct from the DNS service. The process of registering a domain consists of picking a name, and a top-level domain in which to add that name. It is most common to use .com for an international business (or one that wants to become one), a local business domain (such as .co.uk) for a more local business, .org for a generic organisation, etc. You can be creative with your choice of top-level domain, where the top-level domain becomes part of the entire name – such as http://bit.ly/, but that makes the registration process trickier.

Some top-level domains restrict who can register domains – the .ac.uk domain for example is really just for Academic organisations, and JANET is quite restrictive about who can register a domain with them.

When you register a domain, some of the information that you provide (such as name, address, phone number, etc.) is made public by default! Whilst you can often hide this, you may want to consider whether that is wise … domains with hidden registration information are often used by those with nefarious purposes. That is, a domain with public information has a higher reputation than one without.

The Domain Name System (DNS)

The DNS is a service that allows you to look up names. It is usually used to turn names (such as zonky.org) into network addresses (81.2.106.111 or 2001:8b0:ca2c:dead::d00d), but there are also other kinds of records. A hosting provider often hides all this extra detail from you, but not always. It’s easy to overlook that with one domain, it is perfectly possible to have as many names as you have the imagination for – for example, my own domain (zonky.org) has a web site that is very rarely visited, there is also a different web site at www.zonky.org, and a separate blog at really.zonky.org (plus a few others). This can be handy if you want different web sites for different purposes – a normal web site for a business, an additional web site for a blog (to publicise the business), a forum site for customer support, etc.

Each of which could be hosted with a different hosting company!

Most of these “other” DNS record types are not of interest if all you are interested in are web sites, but one – the CNAME – may be useful. It allows you to give an “alias” to another name – i.e. make www.zonky.org point to zonky.org. If you have a web site with multiple names – for example a web site that responds at your domain name (zonky.org), and your domain name with “www” added to the front (www.zonky.org) – then it may be better to use a CNAME for the “www” to point to your domain. This is simply so that you only have to enter the network address of your web site in one location, and only update it once.

However “aliases” can only exist as aliases … there can be no other additional records associated with that name. Your domain name (zonky.org) has at least one other DNS record associated with it, so you cannot use an alias here.
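The alias rule is easier to see with a toy model. The sketch below is not a real DNS client, just a dictionary standing in for a zone; the record layout and function names are invented for illustration, and the www alias follows the example above rather than how zonky.org is actually set up.

```python
# A toy zone: name -> {record type: value}
RECORDS = {
    "zonky.org": {"A": "81.2.106.111"},
    "www.zonky.org": {"CNAME": "zonky.org"},
}


def check_aliases(records):
    """Return the names that break the rule that an alias has no other records."""
    return [name for name, rr in records.items() if "CNAME" in rr and len(rr) > 1]


def resolve(name, records, max_hops=8):
    """Follow aliases until an address record is found; None if there is none."""
    for _ in range(max_hops):
        rr = records.get(name, {})
        if "CNAME" in rr:
            name = rr["CNAME"]
            continue
        return rr.get("A")
    return None
```

Looking up www.zonky.org follows the alias and returns the address held against zonky.org, so the address only lives in one place. Adding a CNAME to a name that already has other records is exactly what check_aliases complains about.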