If you suspect a networking problem, how do you go about diagnosing that problem?
As in all problem solving, the process involves gathering information and performing tests. To adequately perform some of the tests, you need to prepare in advance – by obtaining copies of tools, creating a USB stick with the tools on, finding out how to use the tools, etc. You cannot expect to be able to perform anything useful without investing in that preparation time.
As an alternative to preparing a USB stick full of tools, it may be preferable to prepare a netbook with the tools on – at the very least swapping a network connection to a known working netbook will tell you whether the problem is in the computer or in the network!
Get The MAC Man!
The MAC address of the network connection is probably the single most important bit of information to get your hands on, because it is the key to obtaining lots of other information – whether DHCP requests are being seen, whether the Ethernet switch can see that MAC address on any of its ports (or on the expected port), and so on. If you report a network issue without the MAC address of the machine in question, someone will bang their head on the desk. If you are locked out of the machine because the network “isn’t working”, and so are unable to run the usual tools to get at the MAC address, report that as a fault.
Obtaining the MAC address varies according to the operating system you want to get it from, and the method you choose to use to get it. I have chosen to document a command-line method; if this makes you unhappy, please feel free to document the graphical way, and I’ll add a link to it. In some cases, you will have to choose which of several MAC addresses belongs to the active network card. If in any doubt, get all the MAC addresses, and suggest which one you think is the active network card; if it turns out you have guessed wrong, at least the right one will be in the list somewhere!
Windows
Start a command line, and run ipconfig /all :-
    C:\Users\msm>ipconfig/all

    Windows IP Configuration

       Host Name . . . . . . . . . . . . : w7
       Primary Dns Suffix  . . . . . . . :
       Node Type . . . . . . . . . . . . : Peer-Peer
       IP Routing Enabled. . . . . . . . : No
       WINS Proxy Enabled. . . . . . . . : No
       DNS Suffix Search List. . . . . . : single-names.port.ac.uk
                                           iso.port.ac.uk
                                           eps.is.port.ac.uk

    Ethernet adapter Local Area Connection:

       Connection-specific DNS Suffix  . : inside.zonky.org
       Description . . . . . . . . . . . : Intel(R) PRO/1000 MT Desktop Adapter
       Physical Address. . . . . . . . . : 08-00-27-84-0A-B4
       DHCP Enabled. . . . . . . . . . . : Yes
       Autoconfiguration Enabled . . . . : Yes
       IPv4 Address. . . . . . . . . . . : 10.0.2.15(Preferred)
       Subnet Mask . . . . . . . . . . . : 255.255.255.0
       Lease Obtained. . . . . . . . . . : 15 September 2013 12:02:40
       Lease Expires . . . . . . . . . . : 16 September 2013 12:02:40
       Default Gateway . . . . . . . . . : 10.0.2.2
       DHCP Server . . . . . . . . . . . : 10.0.2.2
       DNS Servers . . . . . . . . . . . : 10.0.0.26
       NetBIOS over Tcpip. . . . . . . . : Enabled

    Tunnel adapter Local Area Connection* 9:

       Connection-specific DNS Suffix  . :
       Description . . . . . . . . . . . : Teredo Tunneling Pseudo-Interface
       Physical Address. . . . . . . . . : 00-00-00-00-00-00-00-E0
       DHCP Enabled. . . . . . . . . . . : No
       Autoconfiguration Enabled . . . . : Yes
       IPv6 Address. . . . . . . . . . . : 2001:0:9d38:953c:2c67:1675:f5ff:fdf0(Preferred)
       Link-local IPv6 Address . . . . . : fe80::2c67:1675:f5ff:fdf0%11(Preferred)
       Default Gateway . . . . . . . . . : ::
       NetBIOS over Tcpip. . . . . . . . : Disabled

    Tunnel adapter isatap.inside.zonky.org:

       Media State . . . . . . . . . . . : Media disconnected
       Connection-specific DNS Suffix  . : inside.zonky.org
       Description . . . . . . . . . . . : Microsoft ISATAP Adapter #2
       Physical Address. . . . . . . . . : 00-00-00-00-00-00-00-E0
       DHCP Enabled. . . . . . . . . . . : No
       Autoconfiguration Enabled . . . . : Yes
Windows is being “helpful” here and listing all of the network adapters it knows of, including the ones that are not plugged in. To find the address we want, we look for the “Ethernet adapter Local Area Connection” section, and within that look for the “Physical Address”, which is given here as 08-00-27-84-0A-B4.
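If you have to do this for more than a handful of machines, picking the address out by eye gets old quickly. Here is a rough Python sketch of the same "find the adapter section, then find the Physical Address" approach. The function names and the cut-down sample text are my own inventions for illustration, and the sketch assumes English-language output laid out like the example above; other Windows versions and languages may well differ.

```python
import re

def physical_addresses(ipconfig_text):
    """Return (adapter heading, MAC) pairs found in `ipconfig /all` output."""
    pairs = []
    adapter = None
    for line in ipconfig_text.splitlines():
        stripped = line.rstrip()
        # Adapter headings start in the first column and end with a colon.
        if stripped and not line[0].isspace() and stripped.endswith(":"):
            adapter = stripped[:-1]
            continue
        match = re.search(r"Physical Address[ .]*:\s*([0-9A-Fa-f-]+)", line)
        if match and adapter is not None:
            pairs.append((adapter, match.group(1)))
    return pairs

def looks_like_ethernet(mac):
    """True for a six-octet address; the tunnel pseudo-interfaces above show eight."""
    return len(mac.split("-")) == 6

# Cut-down sample (hypothetical input, trimmed from the output shown earlier).
sample = """\
Ethernet adapter Local Area Connection:
   Description . . . . . . . . . . . : Intel(R) PRO/1000 MT Desktop Adapter
   Physical Address. . . . . . . . . : 08-00-27-84-0A-B4
Tunnel adapter Local Area Connection* 9:
   Description . . . . . . . . . . . : Teredo Tunneling Pseudo-Interface
   Physical Address. . . . . . . . . : 00-00-00-00-00-00-00-E0
"""

for adapter, mac in physical_addresses(sample):
    if looks_like_ethernet(mac):
        print(adapter, mac)
```

The six-octet filter is only a convenience for spotting the likely candidate; when reporting a fault it is still safer to pass on the whole list.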
Linux and OSX
Linux and OSX are pretty similar at this level – with the exception that Linux calls Ethernet devices ethN (usually) and OSX calls ’em enN, the command and output are pretty much the same.
Again, start a command-line interface and run the command ifconfig :-
    % ifconfig
    eth0      Link encap:Ethernet  HWaddr 60:a4:4c:62:84:71
              inet addr:10.0.0.28  Bcast:10.0.255.255  Mask:255.255.0.0
              inet6 addr: fe80::62a4:4cff:fe62:8471/64 Scope:Link
              inet6 addr: 2001:8b0:ca2c:dead::babe/64 Scope:Global
              UP BROADCAST RUNNING MULTICAST  MTU:1492  Metric:1
              RX packets:170663945 errors:0 dropped:0 overruns:0 frame:0
              TX packets:183200664 errors:0 dropped:0 overruns:0 carrier:0
              collisions:0 txqueuelen:1000
              RX bytes:122771869945 (114.3 GiB)  TX bytes:170314898179 (158.6 GiB)
              Interrupt:73 Base address:0x2000

    ib0       Link encap:UNSPEC  HWaddr 80-00-00-48-FE-80-00-00-00-00-00-00-00-00-00-00
              inet addr:10.255.0.1  Bcast:10.255.0.255  Mask:255.255.255.0
              inet6 addr: fe80::21a:4bff:ff0c:e1c5/64 Scope:Link
              inet6 addr: 2001:8b0:ca2c:d00d::1/64 Scope:Global
              UP BROADCAST RUNNING MULTICAST  MTU:4096  Metric:1
              RX packets:6037892 errors:0 dropped:0 overruns:0 frame:0
              TX packets:12155324 errors:0 dropped:3079 overruns:0 carrier:0
              collisions:0 txqueuelen:256
              RX bytes:314594872 (300.0 MiB)  TX bytes:21890697854 (20.3 GiB)

    ib1       Link encap:UNSPEC  HWaddr 80-00-00-49-FE-80-00-00-00-00-00-00-00-00-00-00
              inet addr:10.255.1.1  Bcast:10.255.1.255  Mask:255.255.255.0
              inet6 addr: fe80::21a:4bff:ff0c:e1c6/64 Scope:Link
              inet6 addr: 2001:8b0:ca2c:d00f::1/64 Scope:Global
              UP BROADCAST RUNNING MULTICAST  MTU:2044  Metric:1
              RX packets:4466937 errors:0 dropped:0 overruns:0 frame:0
              TX packets:429108 errors:0 dropped:47 overruns:0 carrier:0
              collisions:0 txqueuelen:256
              RX bytes:232871358 (222.0 MiB)  TX bytes:17179018366 (15.9 GiB)

    lo        Link encap:Local Loopback
              inet addr:127.0.0.1  Mask:255.0.0.0
              inet6 addr: ::1/128 Scope:Host
              UP LOOPBACK RUNNING  MTU:16436  Metric:1
              RX packets:65309 errors:0 dropped:0 overruns:0 frame:0
              TX packets:65309 errors:0 dropped:0 overruns:0 carrier:0
              collisions:0 txqueuelen:0
              RX bytes:4625183 (4.4 MiB)  TX bytes:4625183 (4.4 MiB)
This is an unusually complex configuration, but the MAC address can be picked out relatively easily. Just look for the ethN (here it’s “eth0”) and pick out the “HWaddr” which is 60:a4:4c:62:84:71 in this example.
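Windows writes MAC addresses with hyphens and upper case, Linux with colons and lower case, so it can help to normalise them before comparing notes with whoever looks after the switches. The following is a small sketch in Python; the function names are made up for illustration, and the parser only understands the "Link encap:Ethernet HWaddr" layout shown above. Newer Linux ifconfig versions and OSX label the address "ether" instead, which this sketch does not handle.

```python
import re

def normalise_mac(mac):
    """Turn 08-00-27-84-0A-B4 (or similar) into 08:00:27:84:0a:b4."""
    digits = re.sub(r"[^0-9A-Fa-f]", "", mac).lower()
    if len(digits) != 12:
        raise ValueError(f"not a 48-bit MAC address: {mac!r}")
    return ":".join(digits[i:i + 2] for i in range(0, 12, 2))

def ethernet_hwaddrs(ifconfig_text):
    """Map interface name to MAC for every 'Link encap:Ethernet' interface."""
    found = {}
    for line in ifconfig_text.splitlines():
        match = re.match(r"(\S+)\s+Link encap:Ethernet\s+HWaddr\s+(\S+)", line)
        if match:
            found[match.group(1)] = normalise_mac(match.group(2))
    return found

# Trimmed sample taken from the output above; ib0 is InfiniBand, not Ethernet.
sample = """\
eth0      Link encap:Ethernet  HWaddr 60:a4:4c:62:84:71
          inet addr:10.0.0.28  Bcast:10.0.255.255  Mask:255.255.0.0
ib0       Link encap:UNSPEC  HWaddr 80-00-00-48-FE-80-00-00-00-00-00-00-00-00-00-00
          inet addr:10.255.0.1  Bcast:10.255.0.255  Mask:255.255.255.0
"""

print(ethernet_hwaddrs(sample))
print(normalise_mac("08-00-27-84-0A-B4"))
```

Matching on "Link encap:Ethernet" rather than on the interface name means the InfiniBand and loopback entries are skipped without having to know how a particular system names its devices.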
Network Sniffing with Wireshark
Wireshark is a premium-grade packet sniffer and packet analysis tool; it’s a tool that really justifies a complete book. But you can get quite a bit done with far less knowledge.
The absolute basics come down to capturing packets. This should be fairly easy to accomplish from the graphical interface – it’s pretty much a case of picking a network interface to capture from, and clicking “Start”. Once you have captured 30 seconds or so of traffic, click the red cross, and save the result. All done.
Warning: Contains an enthusiastic American!
Detailing exactly what you might see in a packet capture is definitely beyond the scope of this blog entry, but there are basically three different kinds of packets you should see :-
- Packets sent by your machine. Until you get to more advanced levels, these contain very little in the way of useful information.
- Packets sent to your machine. That is, they are addressed specifically with your machine in mind. These are also to be ignored at this level.
- Finally packets sent out in broadcast mode – to every machine on the network.
The final category can tell you on which network you are … if you are connected to some kind of “special” private network, it is to be expected that an ordinary PC (or Mac) won’t work properly. If you look at enough examples of packet captures, it should eventually become evident what packets are broadcast ones, and what the contents of those packets mean :-
    # tshark -i eth0.24 arp
    tshark: Lua: Error during loading:
     [string "/usr/share/wireshark/init.lua"]:45: dofile has been disabled
    Running as user "root" and group "root". This could be dangerous.
    Capturing on eth0.24
      0.000000 84:78:ac:19:64:41 -> Broadcast    ARP 60 Who has 148.197.24.2?  Tell 148.197.24.252
The packet in question is on the last line. It’s an ARP packet where a machine is asking if anyone knows the Ethernet address of 148.197.24.2 … which is a pretty good indication you are connected to that network.
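The same reasoning can be written down as a few lines of Python: pull the two addresses out of the ARP request and see which of the networks you know about contains both. This is only a sketch. The network names and prefix lengths in the example dictionary are assumptions made up for illustration (the capture alone does not tell you the netmask), and the regular expression only matches the "Who has X? Tell Y" text shown above.

```python
import ipaddress
import re

ARP_REQUEST = re.compile(
    r"Who has (\d+\.\d+\.\d+\.\d+)\?\s+Tell (\d+\.\d+\.\d+\.\d+)"
)

def arp_hint(line, candidate_networks):
    """Return the names of candidate networks holding both ARP addresses."""
    match = ARP_REQUEST.search(line)
    if match is None:
        return []
    target, sender = (ipaddress.ip_address(a) for a in match.groups())
    hits = []
    for name, cidr in candidate_networks.items():
        network = ipaddress.ip_network(cidr)
        if target in network and sender in network:
            hits.append(name)
    return hits

# Hypothetical list of networks; the /24 and /16 masks are guesses.
known_networks = {
    "vlan24": "148.197.24.0/24",
    "home": "10.0.0.0/16",
}

line = ("0.000000 84:78:ac:19:64:41 -> Broadcast ARP 60 "
        "Who has 148.197.24.2? Tell 148.197.24.252")
print(arp_hint(line, known_networks))
```

A single ARP request is a hint rather than proof, so it is worth checking several broadcast packets before drawing conclusions.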
A Better Ping
The standard ping tool is very useful, but it has a couple of big disadvantages :-
- Machines with an aggressive firewall may not permit ICMP (i.e. ping) packets through, in which case they do not respond to standard pings.
- Because ping uses ICMP packets, it is subject to the lower priority that ICMP packets have … in the event of an overloaded network, routers and switches will prefer to drop ICMP to keep TCP and UDP packets flowing. This can result in a false impression of the network reliability.
Because of this, there have been a variety of different tools that accomplish the same sort of thing as ping by using either TCP or UDP (or even ICMP) packets. The latest and greatest of these tools is nping which is part of the nmap series of tools, and is available for just about every platform (including Windows). The default for nping is to send just 5 packets :-
    # nping --tcp -p 22 10.0.0.28
    Starting Nping 0.6.25 ( http://nmap.org/nping ) at 2013-09-23 20:56 BST
    SENT (0.0058s) TCP 10.0.0.26:18384 > 10.0.0.28:22 S ttl=64 id=46091 iplen=40 seq=3907311816 win=1480
    RCVD (0.0062s) TCP 10.0.0.28:22 > 10.0.0.26:18384 SA ttl=64 id=0 iplen=44 seq=4059830626 win=14520
    SENT (1.0060s) TCP 10.0.0.26:18384 > 10.0.0.28:22 S ttl=64 id=46091 iplen=40 seq=3907311816 win=1480
    RCVD (1.0066s) TCP 10.0.0.28:22 > 10.0.0.26:18384 SA ttl=64 id=0 iplen=44 seq=4075461628 win=14520
    SENT (2.0070s) TCP 10.0.0.26:18384 > 10.0.0.28:22 S ttl=64 id=46091 iplen=40 seq=3907311816 win=1480
    RCVD (2.0075s) TCP 10.0.0.28:22 > 10.0.0.26:18384 SA ttl=64 id=0 iplen=44 seq=4091100198 win=14520
    SENT (3.0080s) TCP 10.0.0.26:18384 > 10.0.0.28:22 S ttl=64 id=46091 iplen=40 seq=3907311816 win=1480
    RCVD (3.0084s) TCP 10.0.0.28:22 > 10.0.0.26:18384 SA ttl=64 id=0 iplen=44 seq=4106740813 win=14520
    SENT (4.0090s) TCP 10.0.0.26:18384 > 10.0.0.28:22 S ttl=64 id=46091 iplen=40 seq=3907311816 win=1480
    RCVD (4.0094s) TCP 10.0.0.28:22 > 10.0.0.26:18384 SA ttl=64 id=0 iplen=44 seq=4122381613 win=14520

    Max rtt: 0.451ms | Min rtt: 0.259ms | Avg rtt: 0.342ms
    Raw packets sent: 5 (200B) | Rcvd: 5 (230B) | Lost: 0 (0.00%)
    Tx time: 4.00449s | Tx bytes/s: 49.94 | Tx pkts/s: 1.25
    Rx time: 4.00476s | Rx bytes/s: 57.43 | Rx pkts/s: 1.25
    Nping done: 1 IP address pinged in 4.01 seconds
The key information is displayed at the end … specifically the Max rtt (round trip time) which tells you how long it took for the slowest “conversation” to take place, and the “Lost” count of the number of packets lost. There are zillions of options to nping, but some of the most important include :-
Option | Description |
---|---|
--tcp | Use TCP probe mode, which is probably the preferred mode for testing |
-p N | Specify the destination port to probe. This can be either open (i.e. a service is running) or closed, but not firewalled. |
--count N | Tell nping how many packets to send. Increasing this can make the test much longer. |
--delay Nms | How long to wait between each packet. Always specify “ms” as a suffix to give you milliseconds. A delay of about 50ms is reasonable. |
There’s a great deal more to nping than just this, but it’s a start.
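If nping is not to hand, the general idea of a TCP-based ping can be roughly approximated with nothing more than Python's standard socket module. To be clear, this is not how nping works: nping sends raw SYN packets, whereas this sketch times a full TCP connection attempt, so the figures are not directly comparable. The function names and defaults (five probes, 50ms apart) are my own choices for illustration.

```python
import socket
import time

def tcp_ping(host, port, count=5, delay=0.05, timeout=1.0):
    """Time `count` TCP connection attempts; return RTTs in ms (None = lost)."""
    rtts = []
    for attempt in range(count):
        start = time.perf_counter()
        try:
            with socket.create_connection((host, port), timeout=timeout):
                rtts.append((time.perf_counter() - start) * 1000.0)
        except ConnectionRefusedError:
            # A closed port still answers (with a reset), so it counts as a reply.
            rtts.append((time.perf_counter() - start) * 1000.0)
        except OSError:
            # Timeouts, unreachable hosts and the like are treated as lost probes.
            rtts.append(None)
        if attempt < count - 1:
            time.sleep(delay)
    return rtts

def summarise(rtts):
    """Boil a list of RTTs down to the figures worth looking at first."""
    answered = [r for r in rtts if r is not None]
    return {
        "sent": len(rtts),
        "lost": len(rtts) - len(answered),
        "max_rtt_ms": max(answered) if answered else None,
        "avg_rtt_ms": sum(answered) / len(answered) if answered else None,
    }
```

Something like `summarise(tcp_ping("10.0.0.28", 22))` would then give a loss count and a maximum round trip time for the same target as the nping example. As with the -p option, a closed port is fine but a firewalled one simply shows up as lost probes.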
How Fast? How Slow?
Does the network connection feel slow? Just how slow does it feel? Measure it.
It is not uncommon to find that a network performance issue is actually a performance issue of some other kind. Measuring the network performance can tell you whether it really is the network, or something else. To do so, you need the right tool; measuring with the wrong tool can result in very inaccurate measurements.
Often people resort to using ftp to transfer large files back and forth, which works well enough in normal circumstances, but at higher network speeds you can find yourself measuring the speed of a slow hard disk rather than the network performance. So use the right tool – such as iperf which is available for all major platforms including Windows.
There is an additional tool available for Windows which offers a graphical interface, but I am describing the command-line interface. Partially because that’s the way I am, but partially because it is dead simple.
To run iperf you need to have the software installed on a client machine and a server machine. To run on a server, simply :-
$ iperf -p 32765 -s
Specifying the port number isn’t normally necessary, but I suggest choosing a random port around 32,000 to avoid conflicts. Just remember the port number you use! And on the client :-
    % iperf -p 32765 -c polio
    ------------------------------------------------------------
    Client connecting to polio, TCP port 5001
    TCP window size: 23.4 KByte (default)
    ------------------------------------------------------------
    [  3] local 10.0.0.28 port 36114 connected with 10.0.0.26 port 5001
    [ ID] Interval       Transfer     Bandwidth
    [  3]  0.0-10.0 sec   113 MBytes  95.0 Mbits/sec
    % iperf -p 32765 -c 10.255.0.2
    ------------------------------------------------------------
    Client connecting to 10.255.0.2, TCP port 5001
    TCP window size: 28.8 KByte (default)
    ------------------------------------------------------------
    [  3] local 10.255.0.1 port 43978 connected with 10.255.0.2 port 5001
    [ ID] Interval       Transfer     Bandwidth
    [  3]  0.0-10.0 sec  2.73 GBytes  2.35 Gbits/sec
I have admittedly cheated here by running two tests … to show what a normal 100Mbps Ethernet speed should look like, and what something a bit quicker would look like. In the latter case, I have used a slow InfiniBand connection that is only about 2.5 times quicker than 1000Mbps Ethernet. Bear in mind that :-
- Ethernet signals work at 100Mbps or 1000Mbps (or faster for more esoteric Ethernet types), but you won’t get to that speed.
- You need to baseline a performance test to find out what normal speeds look like!
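To make the "you won’t get to that speed" point concrete, here is the arithmetic behind the iperf figures above as a short Python sketch. It assumes that iperf counts MBytes and GBytes in powers of 1024 and Mbits in powers of 1000; the numbers above are consistent with that, but treat it as an assumption. The function names are made up for illustration.

```python
def iperf_rate_mbits(transfer, unit, seconds):
    """Convert an iperf 'Transfer' figure into Mbits/sec."""
    # Assumption: byte units are binary (1024-based), bit rates are decimal.
    bytes_per_unit = {"KBytes": 1024, "MBytes": 1024 ** 2, "GBytes": 1024 ** 3}
    bits = transfer * bytes_per_unit[unit] * 8
    return bits / seconds / 1_000_000

def percent_of_line_rate(measured_mbits, line_rate_mbits):
    """How much of the nominal signalling speed was actually achieved."""
    return 100.0 * measured_mbits / line_rate_mbits

# The 100Mbps Ethernet test: 113 MBytes in 10 seconds.
ethernet = iperf_rate_mbits(113, "MBytes", 10.0)
print(round(ethernet, 1))  # about 94.8, close to the 95.0 iperf reported

# The InfiniBand test: 2.73 GBytes in 10 seconds.
infiniband = iperf_rate_mbits(2.73, "GBytes", 10.0)
print(round(infiniband))  # about 2345, i.e. roughly 2.35 Gbits/sec

print(percent_of_line_rate(95.0, 100.0))  # 95.0
```

The small gap between the recomputed 94.8 and the reported 95.0 comes from iperf rounding the transfer figure it displays. Running the same calculation against your own baseline, rather than against the nominal line rate, gives a fairer idea of whether a result is actually slow.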