The first step to understanding programming for networks is, of course, understanding networks. Defining what they are, clarifying the aspects of networks we are concerned with, addressing how network architecture impacts the programs we write, and what kinds of software solutions networks need to be effective.
At its most basic, a network is nothing more than a physical implementation of an undirected graph; a series of nodes and edges, or connections, between those nodes, as demonstrated in the following diagram:
A basic, undirected graph
However, the preceding diagram doesn't quite capture the full picture. What constitutes a node, and what is sufficient for a connection are all very pertinent details to clarify. An individual node should probably be able to meaningfully interact with other nodes on the network, or else you might have to concern yourself with programming for a potato connected by two wires to a network of routers and servers. It's safe enough to say that potatoes very obviously aren't nodes, and that an active and stable Azure server very obviously is, so the line delineating nodes from non-nodes on a network falls somewhere between those two poles. Likewise, we can easily identify that the cables from the power supply of a computer to the outlet in a wall don't constitute a network connection, but that a CAT-5 cable from our computer to a router obviously does. The dividing line probably falls somewhere between those two, and it is important that we take care to draw that line accurately.
We'll start with a workable definition of networks for the purposes of this book, unpack the definition, and examine why we chose to make the specific distinctions we have, and finally, consider what each essential property of a network means to us as programmers. So, without further ado, the definition of a computer network is as follows:
A computer network is, for our purposes, an arbitrarily large set of computational or navigational devices, connected by channels of communication across which computational resources can be reliably sent, received, forwarded, or processed.
On the surface, that might seem basic, but there is a lot of nuance in that definition that deserves our consideration. So, let's take a deeper dive.
What do we mean when we say arbitrarily large? Well when you're writing software for a router (accepting that you would realistically be bound by the maximum size of physically-addressable space), you would not (and should not) care about how many devices are actually connected to your hardware, or how many routes you need to reliably pass resources or requests along. Suppose you are writing the system software for a wireless router. While doing so, you tell your product owner that their marketing copy should specify that this router can only connect a maximum of four computers to the internet. Can you imagine any product owner would take that news kindly? You would be looking for a new job in no time! Networks must be able to scale with the needs of their users.
A basic property of almost all computer networks is device-agnosticism, which is to say that any device on a network should assume no knowledge of the number or kind of other devices on that network at any given moment. Indeed, a program or device might need to discern whether or not a specific device or piece of software exists on the network, but nothing about the network connection it obtains will convey that information. Instead, it should be equipped to send and receive messages in a format that is typically standardized for the communication protocol over which the messages are sent. Then, using these standardized messages, a device can request information about the availability, configuration, or capabilities of other devices on the network, without actually knowing whether or not the devices it expects to be available on the network are, in fact, available on the network.
Ensuring that the receiving end of any given outgoing connection from a device is properly connected, or that the receiving devices are configured accordingly, is the concern of the network engineers who support your software. Supporting and responding to requests sent by your software is the responsibility of the authors of the receiving software. Obviously, if you're working in a sufficiently small software shop, both of those roles may well also be filled by you; but in a sufficiently mature working environment, you can likely rely on others to handle these tasks for you. However, when the time comes to deploy your software to a networked device, no information about whether or not those responsibilities were handled properly is available to you simply by virtue of being connected to a network.
Device-agnosticism means that a network has no idea what has connected to it, and, accordingly, cannot tell you as much. A corollary attribute of networks is that other devices on the network cannot and will not be notified that your device or software has connected and been made a resource.
Ultimately, this is what is meant by an arbitrarily large set of devices. Technically, a single computer constitutes a network of one node, and zero connections (though, for the purposes of this book, we'll only be considering networks with at least two nodes, and at least one connection between any given node and any other node on the network), but there is no fixed maximum value of nodes beyond which a network ceases to be a network. Any arbitrary number of nodes, from one to infinity (or whatever the maximum number of physically possible nodes may be), constitutes a valid network, so long as those nodes have some valid connection between themselves and the rest of the network.
Now that we know we can have an arbitrarily large number of computational devices as nodes in our network, it bears further scrutiny to discern what exactly a computational device is. While this may seem obvious, at least initially, we can quickly identify where it becomes unclear by way of an example.
As per our definition of a network, the device I'm using to draft this book right now might well qualify as a self-contained network. I have a keyboard, mouse, monitors, and computer, all connected by standardized channels of communication. This looks awfully network-like at a conceptual level, but intuitively, we would be inclined to say that it is a network of one, and so, not really a network at all. However, while the what of the non-network status of my computer seems obvious, the why might be less clear.
This is where we benefit from clearly explicating what constitutes a computational device for the purposes of our definition of a network. Simply being able to perform computation is insufficient for a network node. In my example, I can tell you that my mouse (a relatively high-end gaming mouse) certainly performs a number of complex calculations transforming a laser-sensor signal into directional inputs to pass to my computer. My monitors certainly have to do a fair amount of computation to transform the raw binary data of pixel color values into the rendered screens I see 60 or 120 times per second. Both of these devices are connected by way of reliable, standardized communication protocols to my machine, but I wouldn't necessarily be inclined to consider them nodes on a network. My computer, when connected to the internet, or my local home network, surely constitutes a node, but its individual peripherals? I'm inclined to say no.
So, if peripherals aren't netwo...