Merge Sort
Merge Sort is a sorting algorithm that uses a divide-and-conquer approach: it divides the array into two halves, sorts each half recursively, and then merges the two sorted halves back into a single sorted array. It runs in O(n log n) time and is widely used in computer science.
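As a concrete reference for the description above, here is a minimal top-down merge sort in C; the function names and the sample array are illustrative, not taken from any of the excerpts below.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Merge the sorted halves a[lo..mid] and a[mid+1..hi] back into a[lo..hi]. */
static void merge(int a[], int tmp[], int lo, int mid, int hi) {
    int i = lo, j = mid + 1, k = lo;
    while (i <= mid && j <= hi)
        tmp[k++] = (a[i] <= a[j]) ? a[i++] : a[j++];
    while (i <= mid) tmp[k++] = a[i++];
    while (j <= hi)  tmp[k++] = a[j++];
    memcpy(a + lo, tmp + lo, (size_t)(hi - lo + 1) * sizeof(int));
}

/* Divide a[lo..hi] in half, sort each half recursively, then merge. */
static void merge_sort(int a[], int tmp[], int lo, int hi) {
    if (lo >= hi)                  /* zero or one element: already sorted */
        return;
    int mid = lo + (hi - lo) / 2;
    merge_sort(a, tmp, lo, mid);
    merge_sort(a, tmp, mid + 1, hi);
    merge(a, tmp, lo, mid, hi);
}

int main(void) {
    int a[] = {35, 28, 18, 45, 62, 48, 30, 38};
    int n = (int)(sizeof a / sizeof a[0]);
    int *tmp = malloc((size_t)n * sizeof(int));
    merge_sort(a, tmp, 0, n - 1);
    for (int i = 0; i < n; i++) printf("%d ", a[i]);   /* 18 28 30 35 38 45 48 62 */
    printf("\n");
    free(tmp);
    return 0;
}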
Written by Perlego with AI-assistance
12 Key excerpts on "Merge Sort"
- Michael T. Goodrich, Roberto Tamassia(Authors)
- 2014(Publication Date)
- Wiley(Publisher)
Thus, having a fast algorithm for sorting can be very helpful for a search engine, particularly if that algorithm is designed to work quickly for large sets of input data. We study two sorting algorithms in this chapter. The first algorithm, called merge-sort, is ideally suited for very large data sets, which must be accessed on a hard drive or network storage system. The second algorithm, called quick-sort, is very efficient for moderately large data sets that fit in the computer's main memory (RAM).

8.1 Merge-Sort

In this section, we present a sorting technique, called merge-sort, that can be described in a simple and compact way using recursion.

8.1.1 Divide-and-Conquer

Merge-sort is based on an algorithmic paradigm called divide-and-conquer. The divide-and-conquer paradigm can be described in general terms as consisting of the following three steps (see Figure 8.1):

1. Divide: If the input size is smaller than a certain threshold (say, 10 elements), solve the problem directly using a straightforward method and return the solution so obtained. Otherwise, divide the input data into two or more disjoint subsets.
2. Recur: Recursively solve the subproblems associated with the subsets.
3. Conquer: Take the solutions to the subproblems and "merge" them into a solution to the original problem.

[Figure 8.1: A visual schematic of the divide-and-conquer paradigm, applied to a problem involving a list that is divided equally in two in the divide step.]

Merge-sort applies the divide-and-conquer technique to the sorting problem, where, for the sake of generality, let us consider the sorting problem to take a sequence, S, of objects as input, which could be represented with either a list or an array, and returns S in sorted order.
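The Divide step above allows small inputs to be handled directly by a straightforward method. A minimal sketch of that hybrid pattern in C, with an illustrative cutoff of 10 elements and insertion sort as the straightforward method (the names and the cutoff value are ours, not the book's):

#include <stdio.h>
#include <string.h>

#define CUTOFF 10   /* illustrative threshold; the excerpt only says "say, 10 elements" */

/* Straightforward method for small inputs: insertion sort on a[lo..hi]. */
static void insertion_sort(int a[], int lo, int hi) {
    for (int i = lo + 1; i <= hi; i++) {
        int key = a[i], j = i - 1;
        while (j >= lo && a[j] > key) { a[j + 1] = a[j]; j--; }
        a[j + 1] = key;
    }
}

/* Divide-and-conquer with a small-input threshold, as described in the excerpt:
   below the cutoff, solve directly; otherwise divide, recur, and merge. */
static void merge_sort_cutoff(int a[], int tmp[], int lo, int hi) {
    if (hi - lo + 1 <= CUTOFF) {            /* Divide step, threshold case */
        insertion_sort(a, lo, hi);
        return;
    }
    int mid = lo + (hi - lo) / 2;
    merge_sort_cutoff(a, tmp, lo, mid);     /* Recur on the first half  */
    merge_sort_cutoff(a, tmp, mid + 1, hi); /* Recur on the second half */
    int i = lo, j = mid + 1, k = lo;        /* Conquer: merge the two sorted halves */
    while (i <= mid && j <= hi) tmp[k++] = (a[i] <= a[j]) ? a[i++] : a[j++];
    while (i <= mid) tmp[k++] = a[i++];
    while (j <= hi)  tmp[k++] = a[j++];
    memcpy(a + lo, tmp + lo, (size_t)(hi - lo + 1) * sizeof(int));
}

int main(void) {
    int a[16] = {9, 3, 14, 1, 12, 6, 0, 15, 7, 2, 11, 5, 13, 8, 4, 10};
    int tmp[16];
    merge_sort_cutoff(a, tmp, 0, 15);
    for (int i = 0; i < 16; i++) printf("%d ", a[i]);  /* 0 1 2 ... 15 */
    printf("\n");
    return 0;
}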
- Michael T. Goodrich, Roberto Tamassia, David M. Mount(Authors)
- 2011(Publication Date)
- Wiley(Publisher)
11.1 Merge-Sort

In this section, we present a sorting technique, called merge-sort, which can be described in a simple and compact way using recursion.

11.1.1 Divide-and-Conquer

Merge-sort is based on an algorithmic design pattern called divide-and-conquer. The divide-and-conquer pattern consists of the following three steps:

1. Divide: If the input size is smaller than a certain threshold (say, one or two elements), solve the problem directly using a straightforward method and return the solution obtained. Otherwise, divide the input data into two or more disjoint subsets.
2. Recur: Recursively solve the subproblems associated with the subsets.
3. Conquer: Take the solutions to the subproblems and "merge" them into a solution to the original problem.

Using Divide-and-Conquer for Sorting

Recall that in a sorting problem we are given a sequence of n objects, stored in a linked list or an array, together with some comparator defining a total order on these objects, and we are asked to produce an ordered representation of these objects. To allow for sorting of either representation, we describe our sorting algorithm at a high level for sequences and explain the details needed to implement it for linked lists and arrays. To sort a sequence S with n elements using the three divide-and-conquer steps, the merge-sort algorithm proceeds as follows:

1. Divide: If S has zero or one element, return S immediately; it is already sorted. Otherwise (S has at least two elements), remove all the elements from S and put them into two sequences, S1 and S2, each containing about half of the elements of S; that is, S1 contains the first ⌈n/2⌉ elements of S, and S2 contains the remaining ⌊n/2⌋ elements.
2. Recur: Recursively sort sequences S1 and S2.
3. Conquer: Put back the elements into S by merging the sorted sequences S1 and S2 into a sorted sequence.
Explorations in Computing
An Introduction to Computer Science and Python Programming
- John S. Conery(Author)
- 2014(Publication Date)
- Chapman and Hall/CRC(Publisher)
Merge Sort starts by repeatedly merging small groups, each of which is already sorted, into larger groups. The fact that the small groups are sorted makes the merge an efficient operation, as the comparisons only need to be made to the items at the start of each group. By the time Merge Sort has finished, it will have made at most n × log₂ n comparisons to sort a list with n items.

Scalability, a concept introduced in the previous chapter, is the motivation for studying binary search and Merge Sort. For small lists containing only a few dozen items, linear search and insertion sort are perfectly adequate. In fact, because they are simpler to implement, they may even run faster. But as lists become longer, the more sophisticated algorithms will be much more efficient and do far fewer comparisons. Binary search is a natural way to solve the problem of finding an item in an ordered collection, and it is something people have been doing intuitively for many years, probably since the first dictionaries were published. But Merge Sort and quicksort were invented by computer scientists who analyzed the problem of sorting lists of numbers and were motivated to find more efficient algorithms that would work on large lists. Merge Sort was first described in 1945, by John von Neumann, and quicksort in 1960, by C.A.R. Hoare.

This raises an interesting question: if Merge Sort is such an effective algorithm for computers, wouldn't it also be an efficient way for people to sort real objects? If you drop your box of recipes, and you need to sort several hundred cards, would a Merge Sort be more effective than a more intuitive insertion sort as a way to reorganize the cards so you can put them back in the box? If you would like to pursue this question further, Exercises 5.13 and 5.14 below have some suggestions for how to organize a Merge Sort using real cards.
- Akhilesh Kumar Srivastava(Author)
- 2019(Publication Date)
- Arcler Press(Publisher)
6.5. Merge Sort

• Merge Sort works on the principle of divide and conquer.
• It keeps dividing the given array into two parts from the very middle until we get arrays of 1 element.
• An array of 1 element is supposed to be sorted.
• The two sorted arrays are merged to get the larger sorted array.
• Only the parts that were split apart from the same array are merged back together.
• This is a recursive procedure.

To understand the Merge operation, first, try to understand the merging of two sorted arrays.

Merging two sorted arrays

ALGORITHM: MergeArray(A[ ], m, B[ ], n)   // m is the size of A[ ] and n is the size of B[ ]
BEGIN:
    C[m+n]
    i ← 1, j ← 1, k ← 1
    WHILE i <= m AND j <= n DO   // Comparing the elements of A[ ] and B[ ] arrays
        IF A[i] …
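The excerpt's MergeArray pseudocode is cut off above. As an illustration of the same merging idea (our sketch, not the book's code, and using 0-based indices rather than the pseudocode's 1-based ones):

#include <stdio.h>

/* Merge sorted arrays a[0..m-1] and b[0..n-1] into c[0..m+n-1]. */
void merge_arrays(const int a[], int m, const int b[], int n, int c[]) {
    int i = 0, j = 0, k = 0;
    while (i < m && j < n)            /* compare the current front of each array */
        c[k++] = (a[i] <= b[j]) ? a[i++] : b[j++];
    while (i < m) c[k++] = a[i++];    /* copy whatever remains of a */
    while (j < n) c[k++] = b[j++];    /* copy whatever remains of b */
}

int main(void) {
    int a[] = {12, 34, 56, 78};
    int b[] = {3, 23, 45, 99};
    int c[8];
    merge_arrays(a, 4, b, 4, c);
    for (int k = 0; k < 8; k++) printf("%d ", c[k]);  /* 3 12 23 34 45 56 78 99 */
    printf("\n");
    return 0;
}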
- Michael T. Goodrich, Roberto Tamassia, Michael H. Goldwasser(Authors)
- 2013(Publication Date)
- Wiley(Publisher)
With that said, it remains important to have a deep understanding of sorting algorithms. Most immediately, when calling the built-in function, it is good to know what to expect in terms of efficiency and how that may depend upon the initial order of elements or the type of objects that are being sorted. More generally, the ideas and approaches that have led to advances in the development of sorting algorithms carry over to algorithm development in many other areas of computing. We have introduced several sorting algorithms already in this book:

• Insertion-sort (see Sections 5.5.2, 7.5, and 9.4.1)
• Selection-sort (see Section 9.4.1)
• Bubble-sort (see Exercise C-7.38)
• Heap-sort (see Section 9.4.2)

In this chapter, we present four other sorting algorithms, called merge-sort, quick-sort, bucket-sort, and radix-sort, and then discuss the advantages and disadvantages of the various algorithms in Section 12.5. In Section 12.6.1, we will explore another technique used in Python for sorting data according to an order other than the natural order defined by the < operator.

12.2 Merge-Sort

12.2.1 Divide-and-Conquer

The first two algorithms we describe in this chapter, merge-sort and quick-sort, use recursion in an algorithmic design pattern called divide-and-conquer. We have already seen the power of recursion in describing algorithms in an elegant manner (see Chapter 4). The divide-and-conquer pattern consists of the following three steps:

1. Divide: If the input size is smaller than a certain threshold (say, one or two elements), solve the problem directly using a straightforward method and return the solution so obtained. Otherwise, divide the input data into two or more disjoint subsets.
2. Conquer: Recursively solve the subproblems associated with the subsets.
3. Combine: Take the solutions to the subproblems and merge them into a solution to the original problem.
- Michael T. Goodrich, Roberto Tamassia, Michael H. Goldwasser(Authors)
- 2014(Publication Date)
- Wiley(Publisher)
Even so, merge-sort is an excellent algorithm for situations where the input is stratified across various levels of the computer's memory hierarchy (e.g., cache, main memory, external memory). In these contexts, the way that merge-sort processes runs of data in long merge streams makes the best use of all the data brought as a block into a level of memory, thereby reducing the total number of memory transfers. The GNU sorting utility (and most current versions of the Linux operating system) relies on a multiway merge-sort variant. Tim-sort (designed by Tim Peters) is a hybrid approach that is essentially a bottom-up merge-sort that takes advantage of initial runs in the data while using insertion-sort to build additional runs. Tim-sort has been the standard sorting algorithm in Python since 2003, and it has become the default algorithm for sorting arrays of object types, as of Java SE 7.

Bucket-Sort and Radix-Sort

Finally, if an application involves sorting entries with small integer keys, character strings, or d-tuples of keys from a discrete range, then bucket-sort or radix-sort is an excellent choice, for it runs in O(d(n + N)) time, where [0, N − 1] is the range of integer keys (and d = 1 for bucket-sort). Thus, if d(n + N) is significantly "below" the n log n function, then this sorting method should run faster than even quick-sort, heap-sort, or merge-sort.

13.5 Selection

As important as it is, sorting is not the only interesting problem dealing with a total order relation on a set of elements. There are a number of applications in which we are interested in identifying a single element in terms of its rank relative to the sorted order of the entire set. Examples include identifying the minimum and maximum elements, but we may also be interested in, say, identifying the median element, that is, the element such that half of the other elements are smaller and the remaining half are larger.
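Because the excerpt describes Tim-sort as essentially a bottom-up merge-sort, a minimal bottom-up (iterative) merge sort in C may help make that structure concrete. This is an illustrative sketch only, not the Tim-sort or GNU sort implementation:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Bottom-up merge sort: repeatedly merge adjacent runs of width 1, 2, 4, ... */
void merge_sort_bottom_up(int a[], int n) {
    int *tmp = malloc((size_t)n * sizeof(int));
    for (int width = 1; width < n; width *= 2) {
        for (int lo = 0; lo < n; lo += 2 * width) {
            int mid = lo + width;            /* start of the right run       */
            int hi  = lo + 2 * width;        /* one past the end of the pair */
            if (mid > n) mid = n;
            if (hi  > n) hi  = n;
            /* merge runs a[lo..mid-1] and a[mid..hi-1] into tmp[lo..hi-1] */
            int i = lo, j = mid, k = lo;
            while (i < mid && j < hi) tmp[k++] = (a[i] <= a[j]) ? a[i++] : a[j++];
            while (i < mid) tmp[k++] = a[i++];
            while (j < hi)  tmp[k++] = a[j++];
        }
        memcpy(a, tmp, (size_t)n * sizeof(int));  /* runs of length 2*width are now sorted */
    }
    free(tmp);
}

int main(void) {
    int a[] = {34, 56, 78, 12, 45, 3, 99, 23};
    merge_sort_bottom_up(a, 8);
    for (int i = 0; i < 8; i++) printf("%d ", a[i]);  /* 3 12 23 34 45 56 78 99 */
    printf("\n");
    return 0;
}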
- Michael T. Goodrich, Roberto Tamassia, Michael H. Goldwasser(Authors)
- 2014(Publication Date)
- Wiley(Publisher)
Even so, merge-sort is an excellent algorithm for situations where the input is stratified across various levels of the computer's memory hierarchy (e.g., cache, main memory, external memory). In these contexts, the way that merge-sort processes runs of data in long merge streams makes the best use of all the data brought as a block into a level of memory, thereby reducing the total number of memory transfers. The GNU sorting utility (and most current versions of the Linux operating system) relies on a multiway merge-sort variant. Tim-sort (designed by Tim Peters) is a hybrid approach that is essentially a bottom-up merge-sort that takes advantage of initial runs in the data while using insertion-sort to build additional runs. Tim-sort has been the standard sorting algorithm in Python since 2003, and it has become the default algorithm for sorting arrays of object types, as of Java SE 7.

Bucket-Sort and Radix-Sort

Finally, if an application involves sorting entries with small integer keys, character strings, or d-tuples of keys from a discrete range, then bucket-sort or radix-sort is an excellent choice, for it runs in O(d(n + N)) time, where [0, N − 1] is the range of integer keys (and d = 1 for bucket-sort). Thus, if d(n + N) is significantly "below" the n log n function, then this sorting method should run faster than even quick-sort, heap-sort, or merge-sort.

12.5 Selection

As important as it is, sorting is not the only interesting problem dealing with a total order relation on a set of elements. There are a number of applications in which we are interested in identifying a single element in terms of its rank relative to the sorted order of the entire set. Examples include identifying the minimum and maximum elements, but we may also be interested in, say, identifying the median element, that is, the element such that half of the other elements are smaller and the remaining half are larger.

- Pick up one card at a time and insert it so that the cards stay sorted. Insertion sort is an O(n²) algorithm.

12.4 Merge Sort

In this section, you will learn about the Merge Sort algorithm, a much more efficient algorithm than selection sort. The basic idea behind Merge Sort is very simple. Suppose you have an array of 10 integers. Engage in a bit of wishful thinking and hope that the first half of the array is already perfectly sorted, and the second half is too, like this:

5 9 10 12 17   1 8 11 20 32

Now it is an easy matter to merge the two sorted arrays into a sorted array, simply by taking a new element from either the first or the second subarray and choosing the smaller of the elements each time:

1
1 5
1 5 8
1 5 8 9
1 5 8 9 10
1 5 8 9 10 11
1 5 8 9 10 11 12
1 5 8 9 10 11 12 17
1 5 8 9 10 11 12 17 20
1 5 8 9 10 11 12 17 20 32

In fact, you probably performed this merging before when you and a friend had to sort a pile of papers. You and the friend split the pile in half, each of you sorted your half, and then you merged the results together. This is all well and good, but it doesn't seem to solve the problem for the computer. It still has to sort the first and second halves, because it can't very well ask a few buddies to pitch in.
As it turns out, though, if the computer keeps dividing the array into smaller and smaller subarrays, sorting each half and merging them back together, it carries out dramatically fewer steps than the selection sort requires.
C++ Programming
Program Design Including Data Structures
- D. Malik(Author)
- 2017(Publication Date)
- Cengage Learning EMEA(Publisher)
Merge Sort: Linked List–Based Lists

From Figure 18-30, it is clear that in the Merge Sort algorithm, most of the sorting work is done in merging the sorted sublists. The general algorithm for the Merge Sort is as follows:

if the list is of a size greater than 1
{
    a. Divide the list into two sublists.
    b. Merge Sort the first sublist.
    c. Merge Sort the second sublist.
    d. Merge the first sublist and the second sublist.
}

As remarked previously, after dividing the list into two sublists—the first sublist and the second sublist—the two sublists are sorted using the Merge Sort algorithm. In other words, we use recursion to implement the Merge Sort algorithm.

[FIGURE 18-30 Merge Sort algorithm:
divide: 35 28 18 45 62 48 30 38
divide: 35 28 18 45 | 62 48 30 38
divide: 35 28 | 18 45 | 62 48 | 30 38
merge:  28 35 | 18 45 | 48 62 | 30 38
merge:  18 28 35 45 | 30 38 48 62
merge:  18 28 30 35 38 45 48 62]

We next describe the necessary algorithm to:
• Divide the list into two sublists of nearly equal size.
• Merge Sort both sublists.
• Merge the sorted sublists.

Divide

Because data is stored in a linked list, we do not know the length of the list. Furthermore, a linked list is not a random access data structure. Therefore, to divide the list into two sublists, we need to find the middle node of the list. Consider the list in Figure 18-31.

[FIGURE 18-31 Unsorted linked list: head → 65 18 85 95 25 20 45 75 30]

[FIGURE 18-32 middle and current before traversing the list]

To find the middle of the list, we traverse the list with two pointers—say, middle and current.
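The excerpt ends just as the two-pointer traversal is introduced. A hedged sketch of the usual technique in C (not Malik's C++ code; the node type and helper below are illustrative): current advances two links for every one link that middle advances, so middle stops at the midpoint.

#include <stdio.h>
#include <stdlib.h>

/* An illustrative singly linked list node (not Malik's C++ class). */
struct Node {
    int info;
    struct Node *link;
};

/* Return the node at (roughly) the middle of the list. current moves two
   links per step, middle moves one, so when current reaches the end,
   middle points at the last node of the first half. */
struct Node *find_middle(struct Node *head) {
    if (head == NULL) return NULL;
    struct Node *middle = head;
    struct Node *current = head->link;
    while (current != NULL && current->link != NULL) {
        current = current->link->link;
        middle = middle->link;
    }
    return middle;
}

/* Small helper to build a list from an array, for demonstration only. */
struct Node *from_array(const int *a, int n) {
    struct Node *head = NULL;
    for (int i = n - 1; i >= 0; i--) {
        struct Node *node = malloc(sizeof *node);
        node->info = a[i];
        node->link = head;
        head = node;
    }
    return head;
}

int main(void) {
    int a[] = {65, 18, 85, 95, 25, 20, 45, 75, 30};   /* the list from Figure 18-31 */
    struct Node *head = from_array(a, 9);
    struct Node *mid = find_middle(head);
    printf("middle node holds %d\n", mid->info);       /* prints 25 for this 9-node list */
    return 0;
}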
- Cay S. Horstmann, Rance D. Necaise(Authors)
- 2019(Publication Date)
- Wiley(Publisher)
To establish the growth order, you drop the lower order term n and are left with 5n log₂(n). Drop the constant factor 5. It is also customary to drop the base of the logarithm because all logarithms are related by a constant factor. For example,

log₂(x) = log₁₀(x) / log₁₀(2) ≈ log₁₀(x) × 3.32193

Hence we say that Merge Sort is an O(n log(n)) algorithm.

Is the O(n log(n)) Merge Sort algorithm better than an O(n²) selection sort algorithm? You bet it is. Recall that it took 100² = 10,000 times as long to sort one million records as it took to sort 10,000 records with the O(n²) algorithm. With the O(n log(n)) algorithm, the ratio is

(1,000,000 × log(1,000,000)) / (10,000 × log(10,000)) = 100 × (6 / 4) = 150

Suppose for the moment that Merge Sort takes the same time as selection sort to sort a list of 10,000 integers, that is, about 9 seconds on the author's test machine. (Actually, it is much faster than that.) Then it would take about 9 × 150 seconds, or about 23 minutes, to sort a million integers. Contrast that with selection sort, which would take over a day for the same task. As you can see, even if it takes you several hours to learn about a better algorithm, that can be time well spent. In this chapter we have barely begun to scratch the surface of this interesting topic. There are many sorting algorithms, some with even better performance than Merge Sort, and the analysis of these algorithms can be quite challenging. These important issues are often revisited in later computer science courses.

EXAMPLE CODE: See sec05/mergetimer.py in your eText or companion code for a program that times the Merge Sort algorithm.

Special Topic 12.3 The Quicksort Algorithm

Quicksort is a commonly used algorithm that has the advantage over Merge Sort that no temporary lists are required to sort and merge the partial results.
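As a quick sanity check on the ratio above, it can also be computed directly; this tiny C program is ours, not the book's (the book's own timing program is the Python file cited above):

#include <stdio.h>
#include <math.h>

/* Reproduces the excerpt's growth-ratio estimate: how much longer an
   O(n log n) sort should take on 1,000,000 items than on 10,000 items. */
int main(void) {
    double small = 10000.0, large = 1000000.0;
    double ratio = (large * log2(large)) / (small * log2(small));
    printf("n log n ratio: %.0f\n", ratio);   /* prints 150 (the base of the log cancels out) */
    return 0;
}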
- Pat Morin(Author)
- 2013(Publication Date)
- AU Press(Publisher)
two, so that n = 2^(log n), and log n is an integer. Refer to Figure 11.2. Merge-sort turns the problem of sorting n elements into two problems, each of sorting n/2 elements. These two subproblems are then turned into two problems each, for a total of four subproblems, each of size n/4. These four subproblems become eight subproblems, each of size n/8, and so on. At the bottom of this process, n/2 subproblems, each of size two, are converted into n problems, each of size one.

For each subproblem of size n/2^i, the time spent merging and copying data is O(n/2^i). Since there are 2^i subproblems of size n/2^i, the total time spent working on problems of size n/2^i, not counting recursive calls, is

2^i × O(n/2^i) = O(n).

Therefore, the total amount of time taken by merge-sort is

Σ_{i=0}^{log n} O(n) = O(n log n).

The proof of the following theorem is based on the preceding analysis, but has to be a little more careful to deal with the cases where n is not a power of 2.

Theorem 11.1. The mergeSort(a, c) algorithm runs in O(n log n) time and performs at most n log n comparisons.

Proof. The proof is by induction on n. The base case, in which n = 1, is trivial; when presented with an array of length 0 or 1 the algorithm simply returns without performing any comparisons. Merging two sorted lists of total length n requires at most n − 1 comparisons. Let C(n) denote the maximum number of comparisons performed by mergeSort(a, c) on an array a of length n. If n is even, then we apply the inductive hypothesis to the two subproblems and obtain

C(n) ≤ n − 1 + 2C(n/2)
     ≤ n − 1 + 2((n/2) log(n/2))
     = n − 1 + n log(n/2)
     = n − 1 + n log n − n
     < n log n.

The case where n is odd is slightly more complicated. For this case, we use two inequalities that are easy to verify:

log(x + 1) ≤ log(x) + 1,            (11.1)

for all x ≥ 1, and

log(x + 1/2) + log(x − 1/2) ≤ 2 log(x),            (11.2)

for all x ≥ 1/2.
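The same argument is often written as a recurrence; a standard restatement (not verbatim from the excerpt) is T(n) = 2T(n/2) + O(n) with T(1) = O(1), and unrolling the recursion over its log n levels, each of which costs O(n) in total, gives

\[
T(n) \;=\; \sum_{i=0}^{\log_2 n} 2^i \cdot O\!\left(\frac{n}{2^i}\right) \;=\; \sum_{i=0}^{\log_2 n} O(n) \;=\; O(n \log n).
\]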
- Nikhat Raza Khan, Manmohan Singh, Piyush Kumar Shukla(Authors)
- 2023(Publication Date)
- Arcler Press(Publisher)
Whichever is smaller is moved to the single list being assembled. There is then a new first item in the list from which the item was moved, and the process repeats. The process overall is thus:

a. Split the original list into two halves;
b. Sort each half (using Merge Sort);
c. Merge the two sorted halves together into a single sorted list.

Example: The following example performs a Merge Sort on an array of integers:

34 56 78 12 45 3 99 23
34 56 78 12 | 45 3 99 23
34 56 | 78 12 | 45 3 | 99 23
34 | 56 | 78 | 12 | 45 | 3 | 99 | 23
34 56 | 12 78 | 3 45 | 23 99
12 34 56 78 | 3 23 45 99
3 12 23 34 45 56 78 99

A. Algorithm: The steps to be taken for Merge Sort are as follows:

a. [Check the size of the array] If n = 1 then return
b. [Otherwise check the first and last integer] If (first != last) then repeat steps (c) to (f)
c. [Find the middle integer] middle = ((last + first)/2)
d. [Recursively sort the left half array] MergeSort(array, first, middle)
e. [Recursively sort the right half array] MergeSort(array, middle+1, last)
f. [Merge two ordered sub-arrays] Merge(array, first, middle, last)
g. End.

B. Equivalent C Function:

void MergeSort(int array[], int i, int n)
{
    int mid, first, last;
    first = i;
    last = n;
    if (first != last)
    {
        mid = (first + last) / 2;
        MergeSort(array, first, mid);
        MergeSort(array, mid + 1, last);
        Merge(array, first, mid, last);
    }
}

void Merge(int array[], int first, int n, int last)   /* n is the index of the last element of the left half */
{
    int temp[100];
    int f = first;
    int w = n + 1;
    int l = first;   /* write index into temp; starting at first keeps temp[first..last] aligned with array[first..last] */
    int upper;
    while ((f <= n) && (w <= last))
    {
        if (array[f] <= array[w])
        {
            temp[l] = array[f];
            f++;
        }
        else
        {
            temp[l] = array[w];
            w++;
        }
        l++;
    }
    if (f <= n)
    {
        for (; f <= n; f++)
        {
            temp[l] = array[f];
            l++;
        }
    }
    else
    {
        for (; w <= last; w++)
        {
            temp[l] = array[w];
            l++;
        }
    }
    for (upper = first; upper <= last; upper++)   /* copy result back */
        array[upper] = temp[upper];
}
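A small driver for the excerpt's functions, assuming they are compiled as given; the sample array matches the excerpt's example:

#include <stdio.h>

void MergeSort(int array[], int i, int n);   /* defined in the excerpt above */

int main(void) {
    int a[] = {34, 56, 78, 12, 45, 3, 99, 23};
    MergeSort(a, 0, 7);                                /* sort indices 0..7 inclusive */
    for (int k = 0; k < 8; k++) printf("%d ", a[k]);   /* 3 12 23 34 45 56 78 99 */
    printf("\n");
    return 0;
}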
Index pages curate the most relevant extracts from our library of academic textbooks. They’ve been created using an in-house natural language model (NLM), each adding context and meaning to key research topics.











