Early in the history of computing, the collections of physical components which make up a computer began to be called ‘hardware’. The term ‘software’ was then coined to refer to the non-physical aspects of the computer, especially the programs, or sets of instructions ‘which cause a computer to perform a desired operation or series of operations’.1 Such a definition is useful to a certain extent, but it overlooks the complex relationship between hardware and software, particularly when an externally stored program2 is loaded into the memory of a computer. It also encourages a focus on the end product of the software development process (executable code) at the expense of other artefacts which exist at intermediary stages, and which in some sense determine the final product.
So to truly understand the nature of software, it is necessary to understand how the development of software has evolved since the early days, as much of this evolution has determined the way in which software developers currently operate. It is then possible to see how software developers take an abstract idea and develop it into working software through the paradigm known as top-down programming. It will also be possible to make some general observations about what makes software unique.
2.1 The evolution of modern software
Table 1.1 Example machine code instruction
A Machine code
Machine code, or native code, is the medium in which all software was originally written, and the form in which software ultimately executes on a computer. Machine code is a string of 1s and 0s (called bits) of a set length which tells the computer’s central processing unit (CPU) which instruction to perform, where to get the data and where to put the result. Table 1.1 gives an example of a 32-bit instruction.3 This instruction tells the CPU to add the contents of memory location 2 to memory location 3 and put the result in memory location 4. The available instruction set varies between processors, but typically includes such operations as:
- arithmetic operations, such as add and subtract;
- logic instructions, such as and, or and not;4
- data instructions, such as move, input, output, load, store;5 and
- control flow instructions, such as goto, if ... goto, call, and return.
Each instruction has its own corresponding binary number. This can make writing programs in machine code tedious and time-consuming. Errors are easy to make and hard to find. Further, the mappings of instruction to number can vary from processor to processor. This means there is little portability between machines.6
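The idea that each instruction is ultimately a number can be illustrated with a minimal sketch. The field layout used here (an 8-bit operation code followed by three 8-bit memory locations) and the opcode numbers in both tables are assumptions made for illustration only; they are not the format shown in Table 1.1 or that of any real processor.

```python
# Hypothetical opcode tables for two imaginary processors.
# The numbers are illustrative assumptions, not real instruction sets.
PROCESSOR_A = {"ADD": 0b00000001, "SUB": 0b00000010}
PROCESSOR_B = {"ADD": 0b00010000, "SUB": 0b00100000}


def encode(mnemonic, src1, src2, dest, table=PROCESSOR_A):
    """Pack an operation and three memory locations into one 32-bit word.

    Assumed layout: 8-bit opcode, then three 8-bit operand fields.
    """
    for operand in (src1, src2, dest):
        if not 0 <= operand < 256:
            raise ValueError("operand does not fit in 8 bits")
    return (table[mnemonic] << 24) | (src1 << 16) | (src2 << 8) | dest


def to_bits(word):
    """Render a 32-bit word as a string of 1s and 0s."""
    return format(word, "032b")


# "Add location 2 to location 3 and put the result in location 4"
print(to_bits(encode("ADD", 2, 3, 4)))
# The same operation on the second imaginary processor has different bits.
print(to_bits(encode("ADD", 2, 3, 4, table=PROCESSOR_B)))
```

The two printed bit strings differ even though the operation is the same, which is the portability problem described above: a program written as raw numbers for one processor is meaningless on another.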
So early on, programmers began to look for ways of moving towards more human-readable representations of computer instructions.
B Assembly language
The next development was to write a program, called an assembler, which could take care of converting human-readable instructions to number strings (a disassembler performs the reverse conversion). The assembly language version of the machine code instruction above looks like this:
‘ADD’ is obviously more memorable than 100000, but there is still a one-to-one correspondence between assembler instructions and machine code instructions. Seemingly simple operations require a number of machine-level instructions,7 and assembly programmers still have to work within the constraints of hardware-specific instruction sets.8 So they set to work on hiding the low-level details of the computer in another way.
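The one-to-one correspondence between mnemonics and numbers can be sketched as a pair of lookup tables. This is a toy illustration, not a real assembler: the instruction syntax (`ADD 2, 3, 4`) and the opcode numbers are hypothetical and are not taken from the example in the text.

```python
# Hypothetical mnemonic table; the opcode numbers are illustrative assumptions.
MNEMONIC_TO_OPCODE = {"ADD": 1, "SUB": 2, "LOAD": 3, "STORE": 4}
OPCODE_TO_MNEMONIC = {code: name for name, code in MNEMONIC_TO_OPCODE.items()}


def assemble(line):
    """Translate one line such as 'ADD 2, 3, 4' into a list of numbers."""
    mnemonic, _, rest = line.strip().partition(" ")
    operands = [int(field) for field in rest.split(",") if field.strip()]
    return [MNEMONIC_TO_OPCODE[mnemonic.upper()]] + operands


def disassemble(numbers):
    """Translate a list of numbers back into a human-readable line."""
    opcode, *operands = numbers
    return OPCODE_TO_MNEMONIC[opcode] + " " + ", ".join(str(n) for n in operands)


numbers = assemble("ADD 2, 3, 4")
print(numbers)
print(disassemble(numbers))
```

Because each mnemonic maps to exactly one number, the translation loses nothing and can be reversed, but the programmer is still thinking in terms of the processor's own instruction set.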
C High-level languages
Most programming these days is done using high-level languages, which are another step closer to natural language. High-level languages also hide the machine architecture from the programmer. A high-level representation of the assembly code above would be:
When using high-level languages, programmers no longer have to worry about memory locations, architecture-specific instructions, or other low-level hardware details. In order to be executed by the computer, however, high-level language commands need to be translated into machine code. As such, high-level languages are often classified according to how this translation is achieved.
D Compilers
Programs written in compiled languages cannot be run directly on a computer in their original (source code) form. Before they can be executed, a translation program (a compiler) translates the high-level language instructions into machine-specific instructions; the result of compilation is known as object code. Common compiled languages include C and Java, although Java is usually compiled to bytecode for a virtual machine rather than directly to machine-specific instructions.
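The separation between translating a program and running it can be sketched with a toy compiler. Everything here is an assumption made for illustration: the source syntax (`dest = left + right`), the three-instruction accumulator machine and the function names are all hypothetical, and a real compiler emits binary object code rather than Python tuples.

```python
def compile_statement(source):
    """Translate 'dest = left + right' (or '-') into toy machine instructions."""
    dest, expression = (part.strip() for part in source.split("="))
    left, operator, right = expression.split()
    operation = {"+": "ADD", "-": "SUB"}[operator]
    return [("LOAD", left), (operation, right), ("STORE", dest)]


def run(object_code, memory):
    """Execute previously compiled instructions against a dict acting as memory."""
    accumulator = 0
    for operation, name in object_code:
        if operation == "LOAD":
            accumulator = memory[name]
        elif operation == "ADD":
            accumulator += memory[name]
        elif operation == "SUB":
            accumulator -= memory[name]
        elif operation == "STORE":
            memory[name] = accumulator
    return memory


# Translation happens once, ahead of time.
object_code = compile_statement("c = a + b")
print(object_code)

# The translated program can then be run many times without the source.
print(run(object_code, {"a": 2, "b": 3})["c"])
print(run(object_code, {"a": 10, "b": 20})["c"])
```

Note that a single high-level statement becomes three lower-level instructions, and that `run` never sees the original source text.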
E Interpreters
Interpreted languages do not have this preparatory step. A program called an interpreter translates high-level code to machine code as the code is run. Modern examples of interpreted languages include Python, Perl, PHP and Ruby. Since no compilation is involved, the development and testing of software written in these languages is easier. The downside of interpreted languages is that they generally run more slowly and require more memory.
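By contrast, an interpreter reads the source and acts on it in a single pass. The following is a minimal sketch for a made-up language that supports only assignment, addition and subtraction; the syntax and the function names are assumptions for illustration and do not describe how any real interpreter is built.

```python
def value_of(token, memory):
    """Return a number for a literal such as '3' or a variable name such as 'a'."""
    return int(token) if token.lstrip("-").isdigit() else memory[token]


def interpret(program, memory=None):
    """Read and execute each statement in turn, with no separate translation step."""
    memory = {} if memory is None else memory
    for line in program.splitlines():
        line = line.strip()
        if not line:
            continue
        dest, expression = (part.strip() for part in line.split("=", 1))
        tokens = expression.split()
        result = value_of(tokens[0], memory)
        # Apply each "operator operand" pair from left to right.
        for operator, operand in zip(tokens[1::2], tokens[2::2]):
            if operator == "+":
                result += value_of(operand, memory)
            elif operator == "-":
                result -= value_of(operand, memory)
            else:
                raise ValueError("unknown operator: " + operator)
        memory[dest] = result
    return memory


program = """
a = 2
b = 3
c = a + b
"""
print(interpret(program)["c"])
```

Every run repeats the work of reading and analysing the source text, which gives a rough intuition for why interpreted programs tend to run more slowly, while the absence of a separate build step makes it quick to change a line and try again.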
F Software stacks
But the abstraction of the software development process does not end with high-level languages. A different type of abstraction is achieved by breaking software down into a series of layers, called a stack, where each layer offers the layer above access to its services, but only via a set of higher-level functions. The typical layers involved in the operation of a modern computer are discussed below.
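Before turning to the individual layers, the layering idea itself can be sketched in a few lines. The three classes below are hypothetical stand-ins invented for this illustration; they do not correspond to the interface of any real firmware, kernel or application.

```python
class Hardware:
    """Lowest layer: raw storage addressed by block number."""

    def __init__(self):
        self._blocks = {}

    def write_block(self, number, data):
        self._blocks[number] = data

    def read_block(self, number):
        return self._blocks[number]


class Kernel:
    """Middle layer: offers named files and hides block numbers from callers."""

    def __init__(self, hardware):
        self._hardware = hardware
        self._directory = {}  # file name -> block number

    def save_file(self, name, text):
        number = self._directory.setdefault(name, len(self._directory))
        self._hardware.write_block(number, text.encode("utf-8"))

    def load_file(self, name):
        return self._hardware.read_block(self._directory[name]).decode("utf-8")


class Application:
    """Top layer: talks only to the kernel, never to the hardware."""

    def __init__(self, kernel):
        self._kernel = kernel

    def save_note(self, title, body):
        self._kernel.save_file(title + ".txt", body)

    def open_note(self, title):
        return self._kernel.load_file(title + ".txt")


app = Application(Kernel(Hardware()))
app.save_note("todo", "read chapter 2")
print(app.open_note("todo"))
```

Each class calls only the layer directly beneath it, so the storage layer could be replaced without changing the application, provided the functions offered by the middle layer stay the same.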
Firmware, kernels, and operating systems: Firmware exists right at the borderline between hardware and software, and can be thought of as a computer program embedded in a hardware device. The most familiar role for firmware in modern computers is as the Basic Input Output System (or BIOS) which launches a computer’s startup process by detecting the installed hardware elements, then loading the operating system.
Operating just above the level of the firmware is an operating system’s kernel. The kernel handles the lowest level of interactions with the hardware of a computer. The most commonly known kernel is Linux, the kernel of a GNU/Linux operating system. The job of the kernel is to ‘manage the computer’s resources and allow other programs to run and use these resources’.9 These resources include the CPU, memory (RAM) and various input/output devices such as displays, disk drives, mice, and keyboards. Since there is usually more than one program running at any time on a computer, the kernel can be thought of as deciding...