PART 3
Programming advanced algorithms
Chapter 15
Recursion and data sorting
What is recursion? It is [...] nesting, and variations
on nesting. The concept is very general. (Stories
inside stories, movies inside movies, paintings inside
paintings, Russian dolls inside Russian dolls (even
parenthetical comments inside parenthetical
comments!)–these are just a few of the charms of
recursion.)
Douglas R. Hofstadter, Gödel, Escher, Bach (1979).
In this chapter we explore the concept of recursion and show how effective it is when applied to a simple problem, namely data sorting. In its plainest form, a recursive function is a function which calls itself. Using recursive functions is generally a delicate but extremely efficient technique: recursion exploits the features of the computer very well, but small programming errors may have disastrous consequences. Used properly, the recursive technique allows one to write programs which, even for very complex operations, are much shorter than those written without it.
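As a first illustration of the definition above, here is a minimal sketch (our own example, not taken from the rest of this chapter) of a function which calls itself; the name factorial and the choice of problem are purely illustrative.

#include <stdio.h>

/* Recursive factorial: the function calls itself on a smaller input
   until the base case n <= 1 stops the recursion (illustrative sketch). */
unsigned long factorial(unsigned int n)
{
    if (n <= 1)                       /* base case: no further calls    */
        return 1;
    return n * factorial(n - 1);      /* recursive call, smaller input  */
}

int main(void)
{
    printf("5! = %lu\n", factorial(5));   /* prints 5! = 120 */
    return 0;
}

Every recursive function needs such a base case: without it, the chain of calls would never terminate.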
Initially, it might seem less “natural” to use a recursive algorithm rather than a non-recursive one. This is probably because we are essentially brought up with sequential logic. Nevertheless, it is important to learn how to reason recursively. Anybody who has ever tried to solve the problem of the “Tower of Hanoi” (box on page 453) will surely agree.
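As a compact sketch of why recursion is so natural for this kind of problem (the function and peg names below are ours, not those of the box on page 453): to move n disks from one peg to another, first move n − 1 disks onto the spare peg, move the largest disk, then move the n − 1 disks back on top of it.

#include <stdio.h>

/* Print the move sequence for n disks, from peg 'from' to peg 'to',
   using peg 'via' as support (illustrative sketch). */
void hanoi(int n, char from, char to, char via)
{
    if (n == 0)                        /* no disks left: nothing to do    */
        return;
    hanoi(n - 1, from, via, to);       /* park n-1 disks on the spare peg */
    printf("move disk %d: %c -> %c\n", n, from, to);
    hanoi(n - 1, via, to, from);       /* bring them back on top          */
}

int main(void)
{
    hanoi(3, 'A', 'C', 'B');           /* 2^3 - 1 = 7 moves */
    return 0;
}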
Let us think of the computer as a machine that helps us avoid repeating simple operations many times: the advantage lies in writing a program whose length is small compared to the number of operations the computer eventually executes. In this respect, recursion is the program property which maximizes the ratio between the complexity of the operations executed by the machine and the work done by the programmer: by programming a single simple function and calling it recursively, we make it act at different levels of the algorithm to achieve the desired final result. Moreover, recursive algorithms very often perform better than non-recursive ones. We therefore start this chapter by recalling some basic concepts of the analysis of the computational complexity of an algorithm. We also introduce some simple sorting algorithms and classify them according to the computational resources they require.
15.1 Analysis of the performance of an algorithm
An algorithm is a well-defined computational procedure transforming an input into an output. As an example, consider the problem of sorting data (discussed in this chapter). The input is a set of objects (for simplicity, though not necessarily, we presume they are numbers) and the output is the same set of objects, arranged in a different way (for example in increasing or decreasing order).
By definition, an algorithm is a deterministic procedure. Nevertheless, there exist randomized algorithms, which make use of random numbers during their execution. In practice these random numbers are obtained with a pseudorandom number generator (Chapter 11), and the randomized algorithm together with the pseudorandom number generator is again a deterministic algorithm. Therefore, the difference between the two types of algorithm might seem only formal. In any case, the probabilistic analysis of a randomized algorithm assumes the random numbers to be completely uncorrelated (although the pseudorandom numbers used in practice may have some correlations).
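As a minimal sketch of this remark (our own example, not taken from Chapter 11), consider a Monte Carlo estimate of π: once the pseudorandom generator is initialized with a fixed seed, the whole “randomized” computation is reproducible, i.e., deterministic.

#include <stdio.h>
#include <stdlib.h>

/* Randomized algorithm: estimate pi by drawing points uniformly in the
   unit square and counting those inside the quarter circle of radius 1. */
int main(void)
{
    const int trials = 1000000;
    int hits = 0;
    int i;

    srand(12345);                      /* fixed seed: the run is reproducible */
    for (i = 0; i < trials; i++) {
        double x = rand() / (double) RAND_MAX;
        double y = rand() / (double) RAND_MAX;
        if (x * x + y * y <= 1.0)
            hits++;
    }
    printf("pi is approximately %f\n", 4.0 * hits / trials);
    return 0;
}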
Computational complexity theory (box on page 439) classifies computational problems based on the computing resources required to solve them. The most commonly considered resources are the following.
•The computation time (also called CPU time or machine time) is the number of elementary operations which need to be executed to solve the problem. This time does not coincide perfectly with the physical execution time because different elementary operations may have different execution times depending on the processor. Still, it is reasonable to assume that an algorithm requiring a smaller number of elementary operations finishes earlier than one requiring a larger number of operations.
•The memory required to store all data of the problem, which includes both the original data and the data generated during execution.
•The number of processors (nodes) in a parallel computer.
•The bandwidth (i.e., the amount of data that can be transmitted per time unit) of the communication channel between the memory and the processor in a single computer or between different nodes in a parallel computer.
In this textbook, we only consider the first two resources, as the other two depend on the computer architecture. For example, the communication bandwidth might depend on the number of registers, the cache levels and the communication protocol between the nodes of a parallel computer.
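In practice the first resource, the computation time, can also be estimated empirically. A minimal sketch (ours, with a hypothetical routine work() standing for the algorithm being measured) uses the standard clock() function:

#include <stdio.h>
#include <time.h>

/* Hypothetical routine whose cost we want to measure:
   about n*n elementary operations. */
void work(int n)
{
    volatile double s = 0.0;
    int i, j;
    for (i = 0; i < n; i++)
        for (j = 0; j < n; j++)
            s += 0.5 * i + j;
}

int main(void)
{
    clock_t start = clock();
    work(2000);
    clock_t stop = clock();
    printf("CPU time: %f s\n", (double) (stop - start) / CLOCKS_PER_SEC);
    return 0;
}

Repeating the measurement for increasing values of n gives a rough empirical check of how the computation time grows with the input size.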
The complexity of a problem is inevitably connected to the complexity of the best algorithm solving that problem. However, computational complexity theory is a formal theory which classifies problems even before the best algorithm solving them is known. Due to space limitations we do not present computational complexity theory in this textbook. Rather, we limit ourselves to the analysis of the computational complexity of algorithms (i.e., of their performance).
The computational resources required during the execution of an algorithm increase with the size of the input (sorting a million numbers is much more demanding than sorting a thousand numbers). For this reason the computational complexity theory classifies algorithms based on the rate with which the necessary resources increase with the size N of the input.
Let T be the generic computational resource required by the algorithm. An algorithm is said to be linear or quadratic in the size of the input if, in the limit N → ∞, we have T = O(N) or T = O(N²), respectively. In practical applications, the multiplicative coefficient of the power of N is also important. Suppose for example that we have two algorithms available, one with T1 = N², and another with T2 = 1000 N. Then, for small input sizes, N < 1000, the first algorithm is more efficient. Nevertheless, the classification of algorithms is based on the asymptotic behavior when N tends to infinity. Therefore, from the theoretical viewpoint, the multiplicative constant can be ignored and the second algorithm is said to perform better.
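The crossover is easy to check numerically, since N² < 1000 N exactly when N < 1000. A short sketch (with the two hypothetical cost functions T1 and T2 of the example above) prints both costs for a few input sizes:

#include <stdio.h>

/* Compare the example cost functions T1 = N^2 and T2 = 1000*N. */
int main(void)
{
    long long sizes[] = {10, 100, 1000, 10000, 100000};
    int i;
    for (i = 0; i < 5; i++) {
        long long N = sizes[i];
        printf("N = %7lld   N^2 = %12lld   1000*N = %12lld\n",
               N, N * N, 1000 * N);
    }
    return 0;
}

For N below 1000 the quadratic cost is smaller; beyond that point the linear algorithm wins by an ever larger margin.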
In order to apply the theory correctly, it is crucial to define the input size N appropriately. Indeed, depending on the problem under consideration, the input size N should be defined in different ways. For example, in a data sorting algorithm, N is the number of objects to be sorted, while for an algorithm computing the factorization of a number, N is the number of bits required to represent this number. The rule that N is the number of bits required to encode the complete input may be enough to classify a problem as P or NP (see box on page 439). However, in order to compute the exact power determining the growth of an algorithm's complexity, a more precise definition of N is needed. This becomes clearer by considering an example.
Consider a quadratic sorting algorithm, i.e., one whose execution requires a number of operations proportional to M², where M is the number of input data points which need to be sorted. If each data point consists of B bits, a single comparison between two elements takes a time proportional to B. Therefore, the total execution time is proportional to B · M², which cannot be expressed as a function of the number of input bits, i.e., N = B · M, only. Therefore, with this specific algorithm, sorting 100 numbers of...