The Overlooked Treasure: IPX Protocol
The Trouble with DHCP
The story begins with an upgrade of the network access management equipment at USTC.
Public internet access areas moved to unified IP address allocation because the address blocks that had been parcelled out building by building were no longer enough. A few years ago most devices on the network were desktops and laptops, which were not left on all the time; now everyone carries a smart device, often more than one, and connects to Wi-Fi wherever they go. Many places where a /24 block (256 addresses) used to be more than enough now run out of addresses at peak times. USTC’s IP address pool is limited, and centralized allocation solved the shortage.
This should have been good news for everyone, but the new equipment brought new problems. The library query machines, which boot over the network, froze after running for a while. The cause was that the IP address allocated during the startup phase of the parent system differed from the one allocated during the startup phase of the subsystem, and that difference was due to a bug in the BRAS network access management device.
Many problems within a LAN are related to DHCP: people setting up private DHCP servers, attackers forging large numbers of DHCP clients to exhaust the address pool, and the recent bash vulnerability that lets a DHCP server execute arbitrary code on clients. The DHCP server itself is also fairly complex, because it must keep the state of the addresses it has allocated. If the server crashes, restarts, and loses that state, the entire LAN falls into chaos (which is why mechanisms such as ping detection have to be introduced).
Most people take it for granted that DHCP assigns IP addresses. But think about it carefully: why must a LAN have a dedicated server to hand out IP addresses to everyone? Every network card has a unique hardware MAC address, so why not use that directly? Networking textbooks will tell us that this is for routing: IP addresses are allocated from the top down on a global scale, like postal addresses, whereas MAC addresses are scattered like mobile phone numbers, and we obviously cannot easily find a person’s exact location from a phone number.
IPv6 = Network Number + Hardware Address
The textbook is right. But why not combine the two approaches: hierarchically allocated addresses across the global wide area network, and the hardware MAC address directly within the LAN? This is how IPv6 works. IPv6 does more than extend the 32-bit address to 128 bits; it also makes many detailed improvements that clear out much of IPv4’s historical baggage. IP address allocation is one of them.
When an IPv6 client joins a network, it generates a link-local address for use within the LAN from its hardware MAC address, so that it can communicate within the LAN right away. For example, if the hardware address is 00:d0:b7:27:2b:92, the generated link-local address is fe80::2d0:b7ff:fe27:2b92. In other words, communication within the LAN does not depend on any “address allocation server”. In fact, in the IPv4 world many operating systems also assign themselves a “link-local” address of the form 169.254.x.x, following RFC 3927, when they detect that no DHCP service is available (for example, when two computers are connected directly). That process requires negotiation, however, unlike IPv6, which derives the address directly from the hardware address.
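The derivation in the example above can be sketched in a few lines of Python. This is a minimal illustration rather than how any particular operating system implements it: the function names `mac_to_interface_id` and `link_local_address` are made up for this sketch, and it assumes the common modified EUI-64 rule (flip the universal/local bit of the first octet and insert ff:fe in the middle of the MAC address).

```python
import ipaddress


def mac_to_interface_id(mac: str) -> bytes:
    """Return an 8-byte interface identifier built from a 48-bit MAC address.

    Assumes the modified EUI-64 rule; the function name is hypothetical.
    """
    octets = bytes(int(part, 16) for part in mac.split(":"))
    if len(octets) != 6:
        raise ValueError("expected a 48-bit MAC address")
    # Flip the universal/local bit (0x02) of the first octet, then insert
    # ff:fe between the two halves of the MAC address.
    first = octets[0] ^ 0x02
    return bytes([first]) + octets[1:3] + b"\xff\xfe" + octets[3:6]


def link_local_address(mac: str) -> ipaddress.IPv6Address:
    """Prepend the link-local prefix fe80::/64 to the interface identifier."""
    prefix = bytes.fromhex("fe80") + bytes(6)  # fe80 followed by 48 zero bits
    return ipaddress.IPv6Address(prefix + mac_to_interface_id(mac))


print(link_local_address("00:d0:b7:27:2b:92"))  # fe80::2d0:b7ff:fe27:2b92
```

No server is involved at any point: the address is a pure function of the hardware address.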
The next thing an IPv6 client does is send out a multicast message (IPv6 has no broadcast) to find out which neighbor is willing to act as its gateway and provide access beyond the LAN. This is Router Solicitation. A gateway on the LAN replies with a Router Advertisement, which says “I am willing to provide you with access outside the LAN” and tells the client the globally valid “network number” of this LAN (for example, when accessing the Internet in the Youth Class College, the network number is 2001:da8:d800:701). The client then remembers the gateway’s address and concatenates the “network number” with the 64-bit identifier derived from the network card’s hardware MAC address to form a globally unique IPv6 address.
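As a rough sketch of that last step, the snippet below combines an advertised network number with an identifier derived from the MAC address. The function name `slaac_address` is hypothetical, the /64 prefix length is an assumption (the article gives only the four groups 2001:da8:d800:701), and the MAC address is simply reused from the earlier link-local example; real systems may choose the identifier differently.

```python
import ipaddress


def slaac_address(prefix: str, mac: str) -> ipaddress.IPv6Address:
    """Join a /64 network number and a MAC-derived interface identifier.

    Hypothetical helper for illustration; assumes the modified EUI-64 rule.
    """
    network = ipaddress.IPv6Network(prefix)
    if network.prefixlen != 64:
        raise ValueError("this sketch assumes a /64 prefix")
    octets = bytes(int(part, 16) for part in mac.split(":"))
    if len(octets) != 6:
        raise ValueError("expected a 48-bit MAC address")
    # Interface identifier: flip the universal/local bit, insert ff:fe.
    iid = bytes([octets[0] ^ 0x02]) + octets[1:3] + b"\xff\xfe" + octets[3:6]
    # Upper 64 bits come from the router, lower 64 bits from the client.
    return ipaddress.IPv6Address(
        int(network.network_address) | int.from_bytes(iid, "big")
    )


print(slaac_address("2001:da8:d800:701::/64", "00:d0:b7:27:2b:92"))
# 2001:da8:d800:701:2d0:b7ff:fe27:2b92
```

The router only has to announce the upper half of the address; it keeps no per-client allocation state, which is the contrast with DHCP that the article is drawing.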
IPv6 network numbers, like IPv4 ones, are allocated from the top down on a global scale, so any host in the world can send a packet to such an address, and the routers at each level only need to pass the packet along step by step, like post offices. Once the packet arrives at the destination LAN, it is delivered to the corresponding host according to the hardware address. (In reality the gateway maintains a table mapping IPv6 addresses to hardware addresses, but never mind these details…)
By the way: when an IPv6 client wants to contact a server on the same LAN, it uses another pair of request and response messages, Neighbor Solicitation and Neighbor Advertisement, which play the role that ARP plays in IPv4. All of these IPv6 mechanisms are built on ICMPv6, and ICMPv6 runs on top of IPv6, rather than sitting alongside it the way ARP sits alongside IPv4. Because ICMPv6 carries such important network functions and is not just for ping tests, don’t block it when writing ip6tables firewall rules.
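To make the point concrete, here is a small Python sketch listing the ICMPv6 message types mentioned so far. The type numbers are the standard ones (128 and 129 for echo, 133 to 136 for the solicitation and advertisement messages); the names `ICMPv6Type`, `NEIGHBOR_DISCOVERY`, and `is_neighbor_discovery` are made up for illustration and are not part of any firewall tool.

```python
from enum import IntEnum


class ICMPv6Type(IntEnum):
    """A few ICMPv6 message types (illustrative subset, not exhaustive)."""
    ECHO_REQUEST = 128            # ping
    ECHO_REPLY = 129              # ping reply
    ROUTER_SOLICITATION = 133     # client asks: who is my gateway?
    ROUTER_ADVERTISEMENT = 134    # gateway answers with the network number
    NEIGHBOR_SOLICITATION = 135   # ARP-like request
    NEIGHBOR_ADVERTISEMENT = 136  # ARP-like reply


# Messages that address configuration and on-link delivery depend on.
NEIGHBOR_DISCOVERY = frozenset({
    ICMPv6Type.ROUTER_SOLICITATION,
    ICMPv6Type.ROUTER_ADVERTISEMENT,
    ICMPv6Type.NEIGHBOR_SOLICITATION,
    ICMPv6Type.NEIGHBOR_ADVERTISEMENT,
})


def is_neighbor_discovery(icmp_type: int) -> bool:
    """True if dropping this ICMPv6 type would break basic connectivity."""
    return icmp_type in NEIGHBOR_DISCOVERY


for t in ICMPv6Type:
    print(t.value, t.name, "essential" if is_neighbor_discovery(t) else "diagnostic")
```

A firewall rule that drops all ICMPv6 would discard types 133 to 136 along with ping, and the host would lose its gateway and its neighbors.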
IPX = Network Number + Hardware Address + Port Number
Why didn’t anyone think of such a good design as IPv6 back then? This brings us to our protagonist today: the IPX protocol. It has basically become history. If you have heard of it, it is most likely in the game StarCraft or in a computer history book.
An “IP address” in the IPX protocol is 12 bytes long and has three parts: network number (32 bits), hardware address (48 bits), and port number (16 bits). The network number and hardware address correspond to the same concepts in IPv6. The port number identifies an application that needs network access; each such application occupies one port number.
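The 12-byte layout can be sketched with Python’s `struct` module. This is only an illustration of the three fields described above: the class name `IPXAddress`, the field names, the big-endian field order, and the sample values (network number 1, port 0x4000, and the MAC address reused from the IPv6 example) are assumptions chosen for the sketch, not taken from a real IPX implementation.

```python
import struct
from typing import NamedTuple


class IPXAddress(NamedTuple):
    """Hypothetical container for the three parts of an IPX address."""
    network: int  # 32-bit network number
    node: bytes   # 48-bit hardware (MAC) address
    port: int     # 16-bit port number

    # "!" = network byte order without padding: 4 + 6 + 2 = 12 bytes.
    _FORMAT = "!I6sH"

    def pack(self) -> bytes:
        return struct.pack(self._FORMAT, self.network, self.node, self.port)

    @classmethod
    def unpack(cls, data: bytes) -> "IPXAddress":
        network, node, port = struct.unpack(cls._FORMAT, data)
        return cls(network, node, port)


# Sample values are invented for illustration.
addr = IPXAddress(network=1, node=bytes.fromhex("00d0b7272b92"), port=0x4000)
wire = addr.pack()
print(len(wire), wire.hex())  # 12 0000000100d0b7272b924000
print(IPXAddress.unpack(wire) == addr)  # True
```

Because the port travels in the network-layer address itself, a packet can be handed to the right application before any transport protocol has looked at it.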
In the TCP/IP world, port numbers are nothing new: TCP and UDP also use them to identify applications. IPX seems merely to have moved the port number down from the transport layer (where TCP and UDP live) to the IP layer. What’s the big deal?
Stubborn TCP
The role of TCP is to provide reliable transmission over an unreliable network and to share network resources among different applications and hosts. This turns out to be hard to do well, and there is no single “best” way to do it. TCP has evolved over decades and grown more and more complex. Because the port numbers that distinguish applications live at the TCP layer rather than the IP layer, all applications on a host must share one TCP implementation; that is, TCP is implemented in the kernel and cannot be modified by the application itself.
Many networking researchers complain that there are more than a thousand papers on TCP, yet only a handful of their ideas have been adopted by mainstream systems. The reason is that TCP in the kernel is hard to modify, and harder still to customize to an application’s needs. Google, no longer willing to put up with TCP’s problems, recently began promoting QUIC, which is built on UDP and in effect bypasses the kernel’s transport layer. Many download tools in China likewise use UDP with their own congestion control algorithms as their transport protocol (though rogue protocols that use no congestion control at all deserve criticism). Had the port number been placed at the IP layer back then, each application could have customized its own transport layer!
In addition, from the perspective of control theory, a system with a single closed loop is relatively stable. (In the figure below, S represents the remote server and R represents the application program.)
Implementing TCP in the kernel adds a layer of buffering that splits this closed loop into two, which is less stable than a single closed-loop system. (In the figure below, K represents the kernel.)
IPX’s designers could not have foreseen the problems with DHCP and network performance, but the design places each component of the protocol at the right layer and so avoids such problems. In other words, many thorny problems in today’s systems are after-effects of protocol functions having been placed at the wrong layer in the first place.
However, we need not be too pessimistic: gold will always shine. Driven by applications’ demand for high-performance, high-speed networking, more and more high-end network cards now support sockets in hardware. That is, the hardware provides multiple ring buffers, each application accesses its own buffer, and reliable transmission and congestion control are handled entirely by the application. The network protocol stack in the operating system kernel is reduced to a thin layer. IPX’s design idea of “decoupling ports from transport protocols” has been reborn in high-performance networking.
Conclusion
Bad designs may be popular for a while, but good designs do not fade with time. The business world often considers only the current market and has no time to plan for the long term, and pure technologists can do little about that. As a protocol designer, if you don’t want a hasty design of yours to become a burden on thousands of coders in the future, be sure to read up on the history of computer systems and networks, and ask yourself “why” a few more times. Then again, without these patchwork designs, what would most coders do for a living?
References
- Hanhai Xingyun BBS
- Wikipedia: IPX/SPX
- RFC 3927
- RFC 4861
- Van Jacobson. Speeding up Networking.