Nov 15 2022
 

This is intended as a basic guide to how IP networking works; traditionally such guides are for “dummies”, but a lack of knowledge makes someone ignorant rather than dumb, and being ignorant of one small esoteric part of computing is no crime. On the other hand, fixing that ignorance can help solve certain networking issues, or at the very least make those domestic router settings make some sort of sense.

What is IP?

Whether we are talking about IPv4 (192.0.2.98/24) or IPv6 (2001:db8:1:1::1/64), both are, at a superficial level, “Internet Protocols”. IP requires some kind of physical networking layer underneath it – most commonly Ethernet for wired connections and WiFi for wireless connections, but it can run over many other networks – FDDI (historical), Fibre Channel, or InfiniBand. IP works above that level.

What made IP “special” for the time was the word internet; we think of this today as a world-wide network (“The Internet”) but back in the 1970s, the internet part of IP was about connecting multiple networks together with a gateway. Whilst network gateways existed before IP (and indeed after), they translated one network protocol into another and not infrequently were application specific (e.g. perhaps only allowing email through), whereas IP was built from the ground up to allow network traffic to traverse multiple network gateways.

There’s a whole lot of detail we could get into about the “protocol” bit of IP, but at this early stage all we need to know is that each packet of data contains an IP header which, amongst other things, specifies the source address, the destination address, and a hint of what’s inside.
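For the curious, the fixed part of an IPv4 header is only 20 bytes long and can be picked apart with a few lines of Python. This is a rough sketch rather than a proper packet parser: it ignores header options and checksums, the function name `parse_ipv4_header` is my own invention, and the example header is built by hand rather than captured off a network.

```python
import socket
import struct

# Layout of the fixed 20-byte IPv4 header (no options), in network byte order.
IPV4_HEADER = struct.Struct("!BBHHHBBH4s4s")

# A few well-known values of the protocol field (the "hint of what's inside").
PROTOCOLS = {1: "ICMP", 6: "TCP", 17: "UDP"}


def parse_ipv4_header(raw: bytes) -> dict:
    """Pull the addresses and the protocol hint out of a raw IPv4 header."""
    (ver_ihl, _tos, total_len, _ident, _frag,
     ttl, proto, _checksum, src, dst) = IPV4_HEADER.unpack(raw[:20])
    return {
        "version": ver_ihl >> 4,               # top four bits of the first byte
        "source": socket.inet_ntoa(src),
        "destination": socket.inet_ntoa(dst),
        "protocol": PROTOCOLS.get(proto, str(proto)),
        "ttl": ttl,
        "total_length": total_len,
    }


# A hand-built example header (checksum left as zero for simplicity).
example = IPV4_HEADER.pack(
    0x45, 0, 84, 0, 0, 64, 1, 0,
    socket.inet_aton("192.0.2.98"), socket.inet_aton("192.0.2.73"),
)
print(parse_ipv4_header(example))
```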

IP Addresses

An IP address (whether a source address or a destination address) is either a 32-bit integer (for IPv4) or a 128-bit integer (for IPv6). Which is the technical way of saying they’re just numbers; although we’re used to seeing (and using) a standardised representation of that number. Some tools will convert the representation of the address or will use the actual number :-

» ping 3221226082   
PING 3221226082 (192.0.2.98) 56(84) bytes of data.

So the usual representation of an IPv4 address is a “dotted quad” – four numbers between 0-255 which when converted to binary and concatenated make up the real network address.
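If you want to see the conversion between the “real” number and the dotted quad for yourself, Python’s standard `ipaddress` module will do it, and the by-hand version is just a matter of shifting each of the four numbers into place. A small sketch using the address from the ping example:

```python
import ipaddress

# The integer from the ping example, turned back into a dotted quad.
addr = ipaddress.ip_address(3221226082)
print(addr)        # 192.0.2.98
print(int(addr))   # 3221226082

# The same conversion by hand: each number in the quad is eight bits.
octets = [192, 0, 2, 98]
number = 0
for octet in octets:
    number = (number << 8) | octet
print(number)      # 3221226082
```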

On the other hand, IPv6 is somewhat longer and uses groups of four hexadecimal digits separated with colons (“:”) as in:

2001:0db8:0001:0001:0000:0000:0000:0001

Although it is usual to compress that with two simple rules – firstly, one run of consecutive all-zero groups can be replaced (once only) with “::” :-

2001:0db8:0001:0001::0001

Secondly, leading zeros within each ‘group’ can be dropped :-

2001:db8:1:1::1
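You don’t have to apply those rules by hand; as a quick illustration, Python’s standard `ipaddress` module will produce both the long and the short forms of the same address (and the underlying 128-bit number, should you want it):

```python
import ipaddress

full = ipaddress.ip_address("2001:0db8:0001:0001:0000:0000:0000:0001")

print(full.compressed)  # 2001:db8:1:1::1
print(full.exploded)    # 2001:0db8:0001:0001:0000:0000:0000:0001
print(int(full))        # the same address as one very large integer
```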

Even shortened, IPv6 addresses are somewhat more complex than IPv4 addresses, but that’s why we have the DNS; in fact those used to using IPv4 addresses without the DNS should bear in mind that IPv4 addresses are also too complex for normal people to get right (and are subject to typos).

Netmasks

Whether you are assigned a public IPv6 address range (2001:db8:cafe::/48) or pick a private address range from RFC 1918 (10.0.0.0/8) you have a “choice” – you can either use the entire network range for one huge network, or you can sub-divide it into smaller subnets. Most home users will go with the first option but for most organisations of a reasonable size, the latter is not just preferred but essential.

Whether you further divide a network into subnets or not, your computer still needs to know whether to send packets directly to the destination or whether to route those packets via a gateway or a router. This is done with a netmask that defines the network part of the address (and the host part of the address). If the network address of the destination matches the network address of the source, then the packets can be sent directly. Otherwise they’re sent via the gateway.

|                | Address       | Local? |
|----------------|---------------|--------|
| Netmask        | 255.255.255.0 |        |
| Source address | 192.0.2.98    | Yes    |
| Destination 1  | 192.0.2.73    | Yes    |
| Destination 2  | 172.16.1.13   | No     |
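The comparison in the table above is nothing more than a bitwise AND of each address with the netmask. Here is a minimal Python sketch of the decision; the function name `is_local` is my own, and a real operating system makes this choice by consulting its routing table rather than with code like this.

```python
import ipaddress

# The source address and netmask from the table above.
source = ipaddress.ip_interface("192.0.2.98/255.255.255.0")
netmask = int(source.netmask)


def is_local(destination: str) -> bool:
    """True if the destination shares the source's network part."""
    dest = int(ipaddress.ip_address(destination))
    # AND-ing with the netmask keeps the network bits and zeroes the host bits.
    return (dest & netmask) == (int(source.ip) & netmask)


for destination in ("192.0.2.73", "172.16.1.13"):
    verdict = "send directly" if is_local(destination) else "send via the gateway"
    print(f"{destination}: {verdict}")
```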

IPv6 addresses work in the same way except the addresses are longer.

Although netmasks are historically given as dotted quads making them look a bit like IPv4 addresses, it is becoming increasingly common to use a more compact method which is less error prone. The netmask is instead specified as the number of bits that the netmask covers – 192.0.2.98/24 rather than 192.0.2.98/255.255.255.0. As for IPv6, the same applies although “/64” is very often assumed – the default size for an IPv6 network is very much more strongly encouraged than for an IPv4 network (although it isn’t compulsory).
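The two notations are interchangeable, as this short sketch with Python’s `ipaddress` module shows; the prefix length is simply the number of one bits in the netmask.

```python
import ipaddress

net = ipaddress.ip_network("192.0.2.0/255.255.255.0")
print(net.prefixlen)   # 24
print(net.netmask)     # 255.255.255.0

# Going the other way, from the compact form back to a dotted-quad netmask.
print(ipaddress.ip_interface("192.0.2.98/24").with_netmask)  # 192.0.2.98/255.255.255.0

# Counting the one bits in the netmask by hand gives the same answer.
print(bin(int(net.netmask)).count("1"))  # 24

# The usual IPv6 "/64" leaves 64 bits for the host part.
v6 = ipaddress.ip_network("2001:db8:1:1::/64")
print(v6.num_addresses == 2 ** 64)  # True
```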

The Gateway Or The Router

Gateway or router? Well both – from the perspective of an ordinary host it’s a gateway to other networks; from the perspective of the gateway itself, it is a router connected to multiple networks (domestically often just two) and forwards packets on behalf of other computers.

In essence there is very little difference between a router and an ordinary machine except that the ordinary machine isn’t configured to forward packets, and it is usually configured with just a default gateway (sometimes called a gateway of last resort). Well, that and the route for the network it is connected to.

Both contain a routing table (or more than one) in the operating system kernel which basically consists of a set of network addresses and destinations (where to forward the packets to). In the case of your usual domestic router that usually consists of a route to your home network, a route to the ISP’s network, and a default route pointing at the ISP’s router.

When a machine wants to send (or forward) a packet to a destination, it picks the most specific match in the routing table (the matching network with the longest netmask), and uses that as an intermediate destination to forward the packet to. Your machine operates this way, as do the core Internet routers (although they have slightly larger routing tables).
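To make the lookup a little more concrete, here is a toy routing table in Python. The networks and next hops are made up for illustration (a home network, a link to the ISP, and a default route), and real kernels use far cleverer data structures than a list, but the rule is the one described above: of all the routes that match, the most specific wins.

```python
import ipaddress

# Hypothetical routing table for a domestic router: (network, where to send it).
ROUTES = [
    (ipaddress.ip_network("192.168.1.0/24"), "direct (home network)"),
    (ipaddress.ip_network("198.51.100.0/24"), "direct (ISP network)"),
    (ipaddress.ip_network("0.0.0.0/0"), "198.51.100.1 (ISP router)"),
]


def lookup(destination: str) -> str:
    """Return the next hop of the most specific route matching an IPv4 address."""
    addr = ipaddress.ip_address(destination)
    matches = [(net, hop) for net, hop in ROUTES if addr in net]
    # The default route (0.0.0.0/0) matches everything, so there is always
    # at least one candidate; the longest prefix is the most specific.
    _net, hop = max(matches, key=lambda match: match[0].prefixlen)
    return hop


for destination in ("192.168.1.20", "198.51.100.7", "203.0.113.9"):
    print(f"{destination} -> {lookup(destination)}")
```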

Some routers (probably a minority) are rather more complex of course. If you have heard of routing protocols such as BGP, OSPF, or IS-IS, then you have heard of protocols that distribute routing information. The larger Internet uses BGP to distribute routing information to add to routing tables around the world.

The description of routing so far has been rather hierarchical – your computer forwards to a default gateway, and it in turn forwards to your ISP’s default gateway. Which is a bit unfortunate as Internet routing doesn’t really work this way – there are alternate routes so if one router goes “bang” traffic can still reach the destination.

Dover Castle Gateway
Oct 19 2016
 

This is a bit of a thought experiment, so it may not be entirely correct (especially the maths – my probability theory is very rusty).

One of the lesser reasons for using the DNS rather than IPv4 addresses is that typing mistakes are more easily caught – if you intend to type 192.168.67.52, but accidentally enter 192.168.67.53 instead, you still have a valid IPv4 address. Whereas entering the domain name wombar.example.com instead of wombat.example.com will most likely get you an error instead of sending your secrets off to an unknown location on your network – unless you have a rather silly server naming convention of course!

But how likely are you to make a mistake typing in an IPv4 address? According to a random web site “out there”, the average accuracy of a typist is 92%, or an average of 8 typos per 100 characters. If we convert this into a probability, we get a probability of typing each character correctly as 0.92.

Given that typing IPv4 addresses is something that some of us have a lot of practice at, and in many cases we will notice typos before they become a problem, I’m going to arbitrarily declare that the probability of getting any character within an IPv4 address correct is 0.999. But to type in an IPv4 address correctly we have to get a maximum of 15 characters correct :-

1  2  3  4  5  6  7  8  9  10 11 12 13 14 15
1  9  2  .  1  6  8  .  1  2  8  .  1  2  8

So the probability of getting all those characters right is 0.999 (first character) x 0.999 (second character) … Or 0.999^15.

And once you work that out, subtract it from 1 (to get the probability of making a mistake) and convert it into a percentage, there is roughly a 1.5% chance of making a typo in an IPv4 address.

For an IPv6 address such as 2001:db8:ca2c:dead:44f0:c3e9:28be:c903, which has 38 characters (no I’m not doing that silly table for IPv6) – 100 * (1 – 0.999 ^ 38) – roughly 3.7%.

Now whilst my calculations may be a bit off, the likelihood of entering an IPv6 address incorrectly is about two and a half times higher than the risk of entering an IPv4 address incorrectly.

In other words, with IPv6 you really need a good working DNS solution just to keep the errors to manageable levels.
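Since the per-character accuracy is an arbitrary guess on my part, it is worth being able to rerun the sum with your own figure. A small Python sketch of the calculation; the function name `typo_chance` is mine, and it assumes (rather unrealistically) that every character is equally likely to be mistyped, independently of the others.

```python
def typo_chance(characters: int, per_char_accuracy: float) -> float:
    """Percentage chance of at least one typo in a string of the given length."""
    return 100 * (1 - per_char_accuracy ** characters)


ipv4 = "192.168.128.128"                         # 15 characters
ipv6 = "2001:db8:ca2c:dead:44f0:c3e9:28be:c903"  # 38 characters

# Try the "random web site" figure, something in between, and my arbitrary guess.
for accuracy in (0.92, 0.99, 0.999):
    v4 = typo_chance(len(ipv4), accuracy)
    v6 = typo_chance(len(ipv6), accuracy)
    print(f"accuracy {accuracy}: IPv4 {v4:.1f}%, IPv6 {v6:.1f}%")
```

Whatever accuracy you plug in, the longer address always comes out worse, which is the point of the exercise.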

dam-ip6

Mar 24 2013
 

The above links to an interesting browser which allows zooming and selection of different data sets. It’s worth a look if you’re into that sort of thing. Although it’s rather surprising that it doesn’t like IPv6 addresses!

The most controversial thing about this map of the Internet gathered during 2012 is that it was produced with the aid of a botnet, or in other words, this researcher stole the resources they needed. Which is obviously wrong – no matter how good the cause – but now that it has been done, there is no reason not to look at the results (whilst wrong this isn’t really evil).

The first interesting discovery here is that this anonymous researcher managed to write a simple virus that would load the Internet scanner onto many devices with default passwords set – admin accounts with “admin” as the password, root accounts with “root” as the password, etc. You would have thought that such insecure devices would have been driven off the Internet by now, but it turns out not to be the case – there are at least 420,000 of them!

You could even argue that the owners of such machines are asking to have their devices controlled by anyone who wants to. Perhaps a little extreme, but certainly some people think so or this Internet survey wouldn’t exist.

But now the results. If you look at the default settings in the browser above, you will encounter large swathes of black squares where apparently nothing is in use. The trouble is that whilst an IP address that is pingable or has ports open is certainly “in use”, an address that is merely registered in the DNS may or may not be in use, and an unregistered address that does not appear to do anything may very well still be in use.

Essentially the whole exercise hasn’t really said much about how much of the Internet address space is in use, although that is not to say that the results are not useful.

One special point to make is that many of the large black squares that appear unused are allocated to organisations that may very well want to have proper IP addresses that are not connected to the global Internet. That is not wrong in any way – before the widespread adoption of NAT, it was common and indeed recommended that organisations obtain public IP addresses before they were connected to the Internet to avoid duplicate network addresses appearing. And an organisation that legitimately obtained an old “class A” has no obligation to return the “unused” network addresses to the unallocated pool. And even if they did, it would not make a big difference; we would still run out of addresses.

The answer to the shortage of IPv4 addresses is IPv6.