GIS Algorithms

About this book

Geographic information systems (GIS) have become increasingly important in helping us understand complex social, economic, and natural dynamics where spatial components play a key role. The critical algorithms used in GIS, however, are notoriously difficult to both teach and understand, in part due to the lack of a coherent representation. GIS Algorithms attempts to address this problem by combining rigorous formal language with example case studies and student exercises.

Using Python code throughout, Xiao breaks the subject down into three fundamental areas:
  • Geometric Algorithms
  • Spatial Indexing
  • Spatial Analysis and Modelling
With its comprehensive coverage of the many algorithms involved, GIS Algorithms is a key new textbook in this complex and critical area of geography.

1 Introduction

Algorithms are designed to solve computational problems. In general, an algorithm is a process that contains a set of well-designed steps for calculation. For example, to correctly calculate the sum of 18 and 19, one must know how to deal with the fact that 8 plus 9 is more than 10, though different cultures have different ways of handling that. Even for a simple problem like this, we expect the steps used to help us get the answer quickly and correctly. There are many problems that are more difficult than this simple addition, and solving them, again efficiently and correctly, requires more careful design of the computational steps.
In GIS development and applications, algorithms are important in almost every aspect. When we click on a map, for example, we expect a quick response from the computer system so that we can pull out relevant information about the point or area we just clicked on. Such a fundamental daily routine for almost every GIS application involves a variety of algorithms to ensure a satisfying response. It starts with searching for the object (point, line, polygon, or pixel) underneath the clicked point. An efficient search algorithm will allow us to narrow down to the area of interest quickly. A brute-force approach that physically checks every object in our data may work, but for a large data set it makes the time needed to find the target impractically long. Many spatial indexing and query algorithms are designed to address this issue. While the search is ongoing, we must check whether an object in our data matches the point clicked. For polygons, we must decide whether the clicked point falls within a polygon in our data, which requires a special algorithm that can quickly return a yes or no answer. Geospatial data normally come from many different sources, and it has been common practice to transform them into the same coordinate system so that different data sets can be processed consistently. Another common use of multiple data sources is to overlay them so that the combined information becomes more useful.
There are many aspects of an algorithm to be examined. It is straightforward to require that an algorithm solve the problem correctly. For some algorithms, correctness is easy to prove. For example, we will introduce two search algorithms later in this chapter, and their correctness should be quite straightforward to see. Other algorithms, however, are not so obvious, and proving their correctness requires more formal analysis. A second feature of algorithms is their efficiency or running time. Of course we always want an algorithm to be fast, but there are theoretical limits on how fast or efficient an algorithm can be, as determined by the problem. We will discuss some of those problems at the end of the book under the topic of spatial optimization. Besides correctness and running time, algorithms are often closely related to how the data are organized to enable the processes and how the algorithms are actually implemented.

1.1 Computational concerns for algorithms

Let us assume we have a list of n points and the list is not in any particular order. We want to find a point in the list. How long will it take to find the point? This is a reasonable question, but the actual time depends heavily on many factors such as the programming language, the skill of the person who codes the program, the platform, the speed and number of CPUs, and so on. A more useful way to examine the time issue is to determine how many steps we need to finish the job, and then analyze the total cost of performing the algorithm in terms of the number of steps used. The cost of each step is of course variable and depends on what constitutes a step. Nevertheless, this is still a more reliable way of thinking about computing time because many computational steps, such as simple arithmetic operations, logical expressions, accessing computer memory for information retrieval, and variable value assignment, can be identified, and each only costs a constant amount of time. If we can figure out a way to count how many steps are needed to carry out a procedure, we will have a pretty good idea of how much time the entire procedure will cost, especially when we compare algorithms.
Returning to our list of points, if there is no structure in the list – the points are stored in an arbitrary order – the best we can do to find a point from the list is to test all the points in the list, one by one, until we can conclude it is in or not in the list. Let us assume the name of the list is points and we want to find if the list includes point p0. We can use a simple algorithm to do the search (Listing 1.1).
Listing 1.1 (linear search)
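The code for Listing 1.1 did not survive extraction in this copy. The following is a minimal Python sketch reconstructed from the line-by-line cost analysis below; the function name linear_search is our own, and the book's actual listing may differ in detail.

    def linear_search(points, p0):
        for p in points:        # line 1: loop over up to n points, cost c1 per iteration
            if p == p0:         # line 2: logic comparison between two points, cost c2
                return p0       # line 3: return the point found, constant cost c, runs at most once
        return None             # target is not in the list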
The algorithm in Listing 1.1 is called a linear search; in it we simply go through all the points, if necessary, to search for the information we need. How many steps are necessary in this algorithm? The first line is a loop and, because of the size of the list, it will run as many as n times when the item we are looking for happens to be the last one in the list. The cost of running the loop part in line 1 just once is a constant because the list is stored in the computer memory, and the main operations here are to access the information at a fixed location in memory and then to move on to the next item. Suppose that this cost is c1 and we will run it up to n times in the loop. The second line is a logic comparison between two points. It will run up to n times as well because it is inside the loop. Suppose that the cost of doing a logic comparison is c2 and it is a constant too. Line 3 simply returns the value of the point found; it has a constant cost of c and it will only run once. In the best case scenario, we will find the target at the first iteration of the loop and therefore the total cost is simply c1 + c2 + c, which can be generalized as a constant b + c. In the worst case scenario, however, we will need to go all the way to the last item in the list and therefore the total cost becomes c1n + c2n + c, which can be generalized as bn + c, where b and c are constants and n is the size of the list (also the size of the problem). On average, if the list is a random set of points and we are going to search for a random point many times, we should expect a cost of c1n/2 + c2n/2 + c, which can be generalized as b′n + c, and we know b′ < b, meaning that on average the search will not cost as much as the worst case scenario does.
How much are we interested in the actual values of b, b′, and c in the above analysis? How will these values impact the total computation cost? As it turns out, not much, because they are constants. But adding them up many times will have a real impact and n, the problem size, generally controls how many times these constant costs will be added together. When n reaches a certain level, the impact of the constants will become minimal and it is really the magnitude of n that controls the growth of the total computation cost.
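As a quick illustration (with made-up constants), if b = 5 and c = 100, the cost bn + c is 150 for n = 10 but 5,000,100 for n = 1,000,000; at that scale the exact values of b and c barely matter, while the factor of n dominates the growth.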
Some algorithms will have a cost related to n², which is significantly different from a cost of n. For example, the algorithm in Listing 1.2 is a simple procedure to compute the shortest distance between any two points in a list of n points. Here, the first loop (line 2) will run n times at a cost of t1 each, and the second loop (line 3) will run exactly n² times at the same cost of t1 each. The logic comparison (line 4) will run n² times and we assume each time the cost is t2. The calculation of distance (line 5) will definitely be more costly than the other simple operations such as the logic comparison, but it is still a constant because the input is fixed (two points) and only a small finite set of steps is needed to carry out the calculation. We say the cost of each distance calculation is a constant t3. Since we do not compute the distance between a point and itself, the distance calculation will run n² − n times, as will the comparison in line 6 (with time t4). The assignment in line 7 will cost a constant time of t5 and may run up to n² − n times in the worst case scenario where every next distance is shorter than the previous one. The last line will only run once with a time of c. Overall, the total time for this algorithm will be t1n + t1n² + t2n² + t3(n² − n) + t4(n² − n) + t5(n² − n) + c, which can be generalized as an² + bn + c. Now it should be clear that this algorithm has a running time that is controlled by n².
Listing 1.2 (shortest pairwise distance)
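As with Listing 1.1, the original code image is not preserved here. Below is a minimal Python sketch consistent with the line-by-line cost analysis above; the names shortest_pairwise_distance and distance are our own, and the line numbers in the comments refer to the lines counted in that analysis.

    from math import hypot

    def distance(p, q):
        # Euclidean distance between two points given as (x, y) tuples
        return hypot(p[0] - q[0], p[1] - q[1])

    def shortest_pairwise_distance(points):
        n = len(points)
        d_min = float('inf')                            # line 1: initialization
        for i in range(n):                              # line 2: outer loop, runs n times
            for j in range(n):                          # line 3: inner loop, runs n^2 times in total
                if i != j:                              # line 4: skip pairing a point with itself
                    d = distance(points[i], points[j])  # line 5: distance, runs n^2 - n times
                    if d < d_min:                       # line 6: comparison, runs n^2 - n times
                        d_min = d                       # line 7: assignment, up to n^2 - n times
        return d_min                                    # last line: runs once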
In the two example algorithms we have examined so far, the order of n indicates the total cost, and we say that our linear search algorithm has a computation cost in the order of n and the shortest pairwise distance algorithm in the order of n². For the linear search, we also know that, when n increases, the total cost of the search will always have an upper bound of bn. But is there a lower bound? We know the best case scenario has a running time of a constant, or in the order of n⁰, but that does not apply to the general case. When we can definitely find an upper bound but not a lower bound of the running time, we use the O-notation to denote the order. In our case, we have O(n) for the average case, and for the worst case scenario as well (because again the constants do not control the total cost). In other words, we say that the running time, or time complexity, of the linear search algorithm is O(n). Because the O-notation describes an upper bound, which corresponds to the worst case scenario, stating a time complexity in O-notation also refers to the worst case scenario.
There are algorithms for which we do not have an upper bound on their running time. But we know their lower bound and we use the Ω-notation to indicate that. A running time of Ω(n) would mean we know the algorithm will cost at least an order of n in its running time, though we do not know the upper bound of the running time. For other algorithms, we know both upper and lower bounds of the running time and we use the Θ-notation to indicate that. For example, a running time of Θ(n²) indicates that the algorithm will take an order of n² in running time in all cases, best and worst. This is the case for our shortest distance algorithm because the process will always run n² times, regardless of the outcome of the comparison in line 6. It is more accurate to say the time complexity is Θ(n²) instead of O(n²) because we know the running time of the pairwise shortest distance algorithm is always in the order of n².
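For reference, the standard formal definitions behind these three notations (stated here in their usual textbook form rather than quoted from this book) are: f(n) = O(g(n)) if there exist constants c > 0 and n0 such that f(n) ≤ c·g(n) for all n ≥ n0 (upper bound); f(n) = Ω(g(n)) if there exist constants c > 0 and n0 such that f(n) ≥ c·g(n) for all n ≥ n0 (lower bound); and f(n) = Θ(g(n)) if both hold, that is, c1·g(n) ≤ f(n) ≤ c2·g(n) for all n ≥ n0 (tight bound).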
Now we reorganize our points in the previous list in a particular tree structure as illustrated in Figure 1.1. This is a binary tree because each node on the tree can have at most two branches, starting from the root. Here the root of the tree stores point (6, 7) and we show it at the top. All the points with X coordinates smaller than or equal to that at the root are stored in the left branches of the root and those with X coordinates greater than that of the root point are stored in the right branches of the root. Going down the tree to the second level, we have two points there, (4, 6) a...
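Figure 1.1 is not reproduced in this excerpt and the paragraph is cut off, but the organizing rule it describes can be sketched in Python as follows. The Node class, the insert helper, and the point (9, 9) used for the right branch are illustrative assumptions, not the book's code or data.

    class Node:
        # a tree node holding one point as an (x, y) tuple
        def __init__(self, point):
            self.point = point
            self.left = None     # subtree of points with x <= this node's x
            self.right = None    # subtree of points with x > this node's x

    def insert(root, point):
        # insert a point by comparing X coordinates, as described in the text
        if root is None:
            return Node(point)
        if point[0] <= root.point[0]:
            root.left = insert(root.left, point)
        else:
            root.right = insert(root.right, point)
        return root

    # root (6, 7) and second-level point (4, 6) come from the text; (9, 9) is made up
    root = None
    for p in [(6, 7), (4, 6), (9, 9)]:
        root = insert(root, p)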

Table of contents

  1. Cover
  2. Half Title
  3. Acknowledgements
  4. Title Page
  5. Copyright Page
  6. Acknowledgements
  7. Contents
  8. About the Author
  9. Preface
  10. 1 Introduction
  11. Part I Geometric Algorithms
  12. 2 Basic Geometric Operations
  13. 3 Polygon Overlay
  14. Part II Spatial Indexing
  15. 4 Indexing
  16. 5 k-D Trees
  17. 6 Quadtrees
  18. 7 Indexing Lines and Polygons
  19. Part III Spatial Analysis and Modeling
  20. 8 Interpolation
  21. 9 Spatial Pattern and Analysis
  22. 10 Network Analysis
  23. 11 Spatial Optimization
  24. 12 Heuristic Search Algorithms
  25. Postscript
  26. Appendix A Python: A Primer
  27. Appendix B GDAL/OGR and PySAL
  28. Appendix C Code List
  29. References
  30. Index
