Geospatial Data Science Quick Start Guide
eBook - ePub

Effective techniques for performing smarter geospatial analysis using location intelligence

  1. 170 pages
  2. English
  3. ePUB (mobile friendly)

About this book

Discover the power of location data to build effective, intelligent data models with Geospatial ecosystems

Key Features

  • Manipulate location-based data and create intelligent geospatial data models
  • Build effective location recommendation systems used by popular companies such as Uber
  • A hands-on guide to help you consume spatial data and parallelize GIS operations effectively

Book Description

Data scientists, who have access to vast data streams, are a bit myopic when it comes to intrinsic and extrinsic location-based data and are missing out on the intelligence it can provide to their models. This book demonstrates effective techniques for using the power of data science and geospatial intelligence to build effective, intelligent data models that make use of location-based data to give useful predictions and analyses.

This book begins with a quick overview of the fundamentals of location-based data and how techniques such as Exploratory Data Analysis can be applied to it. We then delve into spatial operations such as computing distances, areas, extents, centroids, buffer polygons, intersecting geometries, geocoding, and more, all of which add context to location data. Moving ahead, you will learn how to quickly build and deploy a geo-fencing system using Python. Lastly, you will learn how to leverage geospatial analysis techniques in popular recommendation systems, such as collaborative filtering and location-based recommendations.

By the end of the book, you will be a rockstar when it comes to performing geospatial analysis with ease.

What you will learn

  • Learn how companies now use location data
  • Set up your Python environment and install Python geospatial packages
  • Visualize spatial data as graphs
  • Extract geometry from spatial data
  • Perform spatial regression from scratch
  • Build web applications that dynamically reference geospatial data

Who this book is for

Data Scientists who would like to leverage location-based data and want to use location-based intelligence in their data models will find this book useful. This book is also for GIS developers who wish to incorporate data analysis in their projects. Knowledge of Python programming and some basic understanding of data analysis are all you need to get the most out of this book.

Information

Year
2019
Print ISBN
9781789809411
Edition
1
eBook ISBN
9781789809336

Let's Build a Routing Engine

"Logic will take you from A to B. Imagination will take you everywhere."
- Albert Einstein
Despite the possibility of flying cars in the near future, right now, you still need to use the road or rail to get from point A to point B on land. I am pretty sure that you have never once deleted the Maps app from your smartphone. So, what makes the Maps app so indispensable that you can't imagine living in the pre-Google Maps era (that is, pre-2005)? Map-based routing saves you thousands of dollars in fuel costs and time spent in traffic (unless you're in the Bay Area, in which case, even Google Maps can't save you from the traffic!).
Good map-based routing depends on an accurate, well-defined, and up-to-date graph network. Graph algorithms are treated as an advanced topic in computer science because some graph problems, such as the traveling salesman problem (TSP), are NP-hard (the decision version of TSP is NP-complete). But who says we have to flirt with the NP-completeness paradigm to build a simple routing engine? With open source road traffic data and public transit feeds, routing engines will no longer be a black box.
The following topics will be covered in this chapter:
  • Fundamentals of graph data structure
  • Shortest path analysis on a simple graph
  • Building a graph based on a road network
  • Shortest path analysis on the road network graph
We will be using Google Colab for this chapter. If you have a cloud machine where you've hosted a Jupyter Notebook, that works fine as well.

Fundamentals of graph data structure

Graphs can be effectively used to model and solve routing problems through road and public transit networks. Graphs can be designed to model and predict financial transactions and even complex social networks (yeah, blame a graph algorithm the next time Facebook or LinkedIn makes an unfamiliar or unsolicited friend suggestion or professional connection). Despite its versatility, the graph universe is made up of just two simple, easily relatable components, namely, nodes and edges. In a road network, a node might represent a road intersection and an edge might very well represent the road segment itself. The convention is that an edge is an entity that always connects two nodes, as is represented in the following diagram:
Simple edge
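Before reaching for a library, the node-and-edge idea can be sketched in plain Python. This is only an illustration, not the approach the chapter goes on to use: the `connect` helper and the intersection names are hypothetical, and the graph is stored as a dictionary that maps each node to the set of nodes it shares an edge with.

```python
# A graph as an adjacency mapping: node -> set of directly connected nodes.
adjacency = {}

def connect(u, v):
    """Hypothetical helper: record one undirected edge between nodes u and v."""
    adjacency.setdefault(u, set()).add(v)
    adjacency.setdefault(v, set()).add(u)

# One road segment (edge) joining two road intersections (nodes).
connect('intersection_1', 'intersection_2')

print(adjacency)
```

An edge always joins exactly two nodes, so a single call to `connect` creates both node entries.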
Let's fire up a new Google Colab Notebook and build our first graph using a Python library known as networkx. By default, the networkx library is installed in the Google Colab environment. If not, be sure to install it using pip or conda or a similar package manager. The following command should work in most Jupyter Notebook coding environments:
!pip install networkx
The networkx library provides a clean and efficient data structure to define and work with graphs. The simplicity of the networkx library is the most appealing factor for adopting this library for this chapter. Let's get started with networkx and create our first graph with just four lines of code:
import networkx as nx
G = nx.Graph()
G.add_edge('A','B')
G.add_edge('B','C')
Plotting this simple graph looks like the following diagram:
An elementary graph with nodes and edges
Just like that, we have defined a graph object, G, and we have added two edges to it. The first edge connects the 'A' and 'B' nodes, and the second edge connects nodes 'B' and 'C'.
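As a quick check of what was just built, the graph can be inspected with a few standard networkx calls. The sketch below rebuilds the same graph so that it runs on its own; the variable names are chosen here for illustration.

```python
import networkx as nx

# Rebuild the graph from the text so this block is self-contained.
G = nx.Graph()
G.add_edge('A', 'B')
G.add_edge('B', 'C')

# Nodes are created implicitly when an edge that mentions them is added.
nodes = sorted(G.nodes)
n_edges = G.number_of_edges()

# In an undirected graph, an edge can be looked up from either end.
both_ways = G.has_edge('A', 'B') and G.has_edge('B', 'A')

# 'B' sits on both edges, so it has two neighbors.
b_neighbors = sorted(G.neighbors('B'))

print(nodes, n_edges, both_ways, b_neighbors)
```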

Directional graphs

The edges in the preceding graph have no direction, but edges can be directed. In the road network analogy, most roads are bidirectional, but there are also one-way roads, along which you can drive in only one direction:
One way roads
This can be modeled in networkx by instantiating a directional graph (digraph). In a digraph, the order of the nodes in the edge definition determines the direction of the edge. By convention, the first node in the edge definition is the from node (or source node), and the second node is the to node (also called the target or destination node). Let's instantiate a directional graph with the following code snippet:
H = nx.DiGraph()
H.add_edge('B', 'A')
H.add_edge('B','C')
Plotting the digraph, H, yields the following plot. Notice the arrows in the diagram indicating the direction of the edge:
A directed graph
The preceding lines of code mean that there are three nodes, 'A', 'B', and 'C', and that there's a connection from B to A, but not the other way around. Likewise, there's a connection from B to C, but not from C to B. Digraphs are very important in modeling real-world networks, especially road networks.
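The one-way nature of these edges can be confirmed directly. The following sketch rebuilds the digraph H and queries it with standard networkx methods; the variable names are illustrative.

```python
import networkx as nx

# Rebuild the digraph from the text so this block is self-contained.
H = nx.DiGraph()
H.add_edge('B', 'A')
H.add_edge('B', 'C')

# The edge exists only in the direction in which it was defined.
b_to_a = H.has_edge('B', 'A')
a_to_b = H.has_edge('A', 'B')

# Successors follow outgoing edges; predecessors follow incoming edges.
b_successors = sorted(H.successors('B'))
a_predecessors = list(H.predecessors('A'))

# 'A' has no outgoing edges, so nothing can be reached from it.
a_reaches_c = nx.has_path(H, 'A', 'C')

print(b_to_a, a_to_b, b_successors, a_predecessors, a_reaches_c)
```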
If a road segment is bidirectional, you might have to add two different edges between the same nodes, as follows:
I = nx.DiGraph()
I.add_edge("A", "B")
I.add_edge("B", "A")
I.add_edge("A", "C")
In the following plot, notice the arrows. Edge AB has two arrows pointing in opposite directions:
A bidirectional graph
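Adding both directions by hand gets repetitive on a real road network. One possible way to wrap the pattern is sketched below; `add_road` and its `one_way` flag are hypothetical names introduced here, not part of networkx. The last lines show the built-in `to_directed()` method, which turns every edge of an undirected graph into a pair of opposite directed edges.

```python
import networkx as nx

def add_road(graph, u, v, one_way=False):
    """Hypothetical helper: add a road segment from u to v to a DiGraph.

    Two-way roads are stored as two directed edges, one per direction.
    """
    graph.add_edge(u, v)
    if not one_way:
        graph.add_edge(v, u)

I = nx.DiGraph()
add_road(I, 'A', 'B')                # two-way road between A and B
add_road(I, 'A', 'C', one_way=True)  # one-way road from A to C
edges = sorted(I.edges)

# Alternative: build an undirected graph, then convert it to a digraph.
J = nx.Graph([('A', 'B')]).to_directed()
j_edges = sorted(J.edges)

print(edges, j_edges)
```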

Weighted graphs

In the previous examples, each edge is considered to have a unit weight. That is, the cost of traveling from one node to another is the same for every edge. This need not always be the case. In a road network, road segments differ from one another in length and in the time taken to traverse them. So, if we are going to represent these road segments as edges, we need to make sure that the edges have different costs or weights.
The networkx library allows us to add weight to an edge, which is demonstrated in the code, as follows:
import networkx as nx

# Create a weighted directed graph
G = nx.DiGraph()
G.add_edge('A', 'B', weight=6)
G.add_edge('A', 'C', weight=2)
G.add_edge('C', 'D', weight=4.5)
G.add_edge('C', 'E', weight=5)
G.add_edge('C', 'F', weight=6)
G.add_edge('A', 'D', weight=3)
Plotting this graph, with the width of each edge drawn in proportion to its weight, gives the following:
A weighted graph with the width of edges representing the weight of the edge
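The stored weights can be read back and summed along a route, which is the basic ingredient of any cost-based routing. The sketch below rebuilds the weighted graph so that it runs on its own; `route_cost` is a hypothetical helper written for this illustration, not a networkx function.

```python
import networkx as nx

# Rebuild the weighted digraph from the text so this block is self-contained.
G = nx.DiGraph()
G.add_edge('A', 'B', weight=6)
G.add_edge('A', 'C', weight=2)
G.add_edge('C', 'D', weight=4.5)
G.add_edge('C', 'E', weight=5)
G.add_edge('C', 'F', weight=6)
G.add_edge('A', 'D', weight=3)

# Edge attributes are stored in nested dictionaries keyed by the two nodes.
ac_weight = G['A']['C']['weight']

def route_cost(graph, route):
    """Hypothetical helper: total weight along a route given as a list of nodes."""
    return sum(graph[u][v]['weight'] for u, v in zip(route, route[1:]))

# Two ways of getting from A to D: directly, or through C.
direct = route_cost(G, ['A', 'D'])
via_c = route_cost(G, ['A', 'C', 'D'])

print(ac_weight, direct, via_c)
```

Here the direct edge from A to D costs 3, while the detour through C costs 2 + 4.5 = 6.5, so the route with fewer edges is also the cheaper one. That will not always hold once weights differ.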
In this section, we had a good introduction to the components and types of graph data structures. In the next section, we will look into the popular analyses commonly performed using graphs.

Shortest path analysis on a simple graph

Suppose you want to connect with Barack Obama through LinkedIn; how many degrees of connection do you have to go through to reach him? A first-degree connection is someone who is connected to you directly on LinkedIn. A second-degree connection is someone who is connected to one of your first-degree connections, and so on. Assuming that each of your connections, and each of theirs in turn, is interested and ready to help you network with Obama, research says that it takes an average of only 5 degrees of connection for you to connect with Obama, or anyone in the world, for that matter. In other words, the shortest path between any of us and Obama is, on average, about five connections long. That soun...

Table of contents

  1. Title Page
  2. Copyright and Credits
  3. Dedication
  4. About Packt
  5. Contributors
  6. Preface
  7. Introducing Location Intelligence
  8. Consuming Location Data Like a Data Scientist
  9. Performing Spatial Operations Like a Pro
  10. Making Sense of Humongous Location Datasets
  11. Nudging Check-Ins with Geofences
  12. Let's Build a Routing Engine
  13. Getting Location Recommender Systems
  14. Other Books You May Enjoy

Yes, you can access Geospatial Data Science Quick Start Guide by Abdishakur Hassan, Jayakrishnan Vijayaraghavan in PDF and/or ePUB format, as well as other popular books in Computer Science & Data Mining. We have over one million books available in our catalogue for you to explore.