eBook - ePub

Visualizing Graph Data

Name: Visualizing Graph Data
ISBN: 9781638352488

Corey Lanum

232 pages
English

About this book

Summary

Visualizing Graph Data teaches you not only how to build graph data structures, but also how to create your own dynamic and interactive visualizations using a variety of tools. This book is loaded with fascinating examples and case studies to show you the real-world value of graph visualizations.

Purchase of the print book includes a free eBook in PDF, Kindle, and ePub formats from Manning Publications.

About the Technology

Assume you are doing a great job collecting data about your customers and products. Are you able to turn your rich data into important insight? Complex relationships in large data sets can be difficult to recognize. Visualizing these connections as graphs makes it possible to see the patterns, so you can find meaning in an otherwise overwhelming sea of facts.

About the Book

Visualizing Graph Data teaches you how to understand graph data, build graph data structures, and create meaningful visualizations. This engaging book gently introduces graph data visualization through fascinating examples and compelling case studies. You'll discover simple, but effective, techniques to model your data, handle big data, and depict temporal and spatial data. By the end, you'll have a conceptual foundation as well as the practical skills to explore your own data with confidence.

What's Inside

Techniques for creating effective visualizations
Examples using the Gephi and KeyLines visualization packages
Real-world case studies

About the Reader

No prior experience with graph data is required.

About the Author

Corey Lanum has decades of experience building visualization and analysis applications for companies and government agencies around the globe.

Table of Contents

PART 1 - GRAPH VISUALIZATION BASICS

Getting to know graph visualization
Case studies
An introduction to Gephi and KeyLines

PART 2 - VISUALIZE YOUR OWN DATA

Data modeling
How to build graph visualizations
Creating interactive visualizations
How to organize a chart
Big data: using graphs when there's too much data
Dynamic graphs: how to show data over time
Graphs on maps: the where of graph visualization

Part 1. Graph visualization basics

In part one of this book, we’ll take a high-level view of graphs. First, I’ll introduce you to what graphs are and how they can be used across a variety of domains, with some detailed case studies. Then, we’ll dive a little deeper into graph models of data, how they might be different from standard relational models of data, and how you can create graph data models from your data. I’ll introduce you to the two tools that we’ll use throughout the book: Gephi and KeyLines. I’ll use both in later chapters to illustrate how you can create graph visualizations yourself, either for your own use, with Gephi, or as part of a visualization application, using KeyLines.

Chapter 1. Getting to know graph visualization

This chapter covers

Getting to know graphs as data models
Why graphs are a useful way to think about data
When to visualize graphs, and the node-link drawing concept
Other visualizations of graph data and when they’re useful

In December 2001, the Enron Corporation filed for what was at the time the largest ever corporate bankruptcy. Its stock had fallen from a high of $90 per share the previous year to $0.61, decimating its employees’ pensions and shareholders’ investments in it. The FBI’s investigation into this collapse became the largest white-collar criminal investigation in history as they seized over 3,000 boxes of documents and 4 terabytes of data. Among the information seized were about 600,000 emails between key executives at the organization. Although the FBI took pains to read every email individually, the investigators recognized that they were unlikely to find a smoking gun—people committing complex financial fraud seldom disclose their actions in written form. And in 2001, emails were only starting to become the primary means of internal communications; lots of information was still exchanged via phone calls.

In addition to looking at the text of individual emails, the FBI also wanted to uncover patterns in the communications, perhaps in an attempt to better understand who the decision makers were within Enron or who had access to a lot of the information internal to the company. To do this, they modeled the Enron emails as a graph.

A graph is a model of data that consists of nodes, which are discrete data elements (such as people), and edges, which are relationships between nodes. The graph model brings to the forefront relationships that may be hidden in tabular views of the same data and illustrates what is most important. By making those relationships between the data elements a core part of the data structure, you can identify patterns in the data that wouldn’t otherwise be apparent. But building graph data structures is only half the solution to pattern recognition. This book will teach you how to visualize graphs using interactive node-link visualization diagrams, and by the end, you’ll be able to create your own dynamic, interactive visualizations using a variety of tools available today.

In this chapter, I’ll go a little deeper into the concept of a graph, its history, and its uses, and talk about various techniques used to visualize graph data. Subsequent chapters build on this framework by introducing concrete examples of graph visualizations and the data they’re based on, and by discussing various techniques for creating useful visualizations.

1.1. Getting to know graphs

Graphs are everywhere. As long as you’re interested in how items can be related to each other, there’s a graph somewhere in your data. In this section, I’ll walk you through what a graph is and what can be gained from visualizing graphs.

1.1.1. What is a graph?

As described previously, a graph—also called a network—is a set of interconnected data elements that’s expressed as a series of nodes and edges.

In the common definition of a graph, edges have exactly two endpoints, no more. In some cases, those two endpoints can be the same node if a node links to itself. An edge (also known as a link) can take one of two forms:

Directed— The relationship has a direction. Stella owns the car, but it doesn’t make sense to say the car owns Stella.
Undirected— The two items are linked without the concept of direction; the relationship inherently goes both ways. If Stella is linked to Roger because they committed a crime together, it means the same thing to say Stella was arrested with Roger as it does to say Roger was arrested with Stella.
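The difference between the two forms can be sketched in a few lines of Python. This is only an illustration of the idea, not the data model used by Gephi or KeyLines; the `Edge` class, its `directed` flag, and the `relates` helper are names made up for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    """A hypothetical edge with exactly two endpoints."""
    source: str
    target: str
    label: str
    directed: bool


def relates(edge: Edge, a: str, b: str) -> bool:
    """Return True if the edge expresses the relationship from a to b."""
    if edge.source == a and edge.target == b:
        return True
    # An undirected edge reads the same in both directions.
    return not edge.directed and edge.source == b and edge.target == a


owns = Edge("Stella", "car", "owns", directed=True)
arrested_with = Edge("Stella", "Roger", "arrested with", directed=False)

print(relates(owns, "Stella", "car"))             # True
print(relates(owns, "car", "Stella"))             # False
print(relates(arrested_with, "Roger", "Stella"))  # True
```

Storing direction as a flag on the edge is one possible design choice; a self-link is simply an edge whose source and target are the same node.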

In figure 1.1, you see an example of a directed link with properties.

Figure 1.1. A property graph of a single email between Enron executives. The two nodes are the sender and recipient of the email, and the directed edge is the email.

Both nodes and edges can have properties, which are key-value pairs (a property name and its value) describing either the data element itself or the relationship. Figure 1.2 is a simple property graph showing that Stella bought a 2008 Volkswagen Jetta in September 2007 and sold it in October 2013. Modeling it as a graph highlights that Stella had a relationship with this car, albeit a temporary one.

Figure 1.2. A simple property graph with two nodes and an edge. Stella (the first node) bought a 2008 Volkswagen Jetta (the second node) in September 2007 and sold it in October 2013. Modeling it as a graph highlights that Stella had a relationship with this car (the edge).
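One way to picture the property graph in figure 1.2 is as plain Python dictionaries. This is a minimal sketch under assumptions: the node ids, the property names (`bought`, `sold`, and so on), and the `months_between` helper are invented for illustration, and only the values themselves come from the example above.

```python
# Nodes: each id maps to a dictionary of key-value properties.
# The ids and property names are assumptions made for this sketch.
nodes = {
    "stella": {"type": "person", "name": "Stella"},
    "jetta": {"type": "car", "make": "Volkswagen", "model": "Jetta", "year": 2008},
}

# Edges: two endpoints plus the properties of the relationship itself.
edges = [
    {
        "source": "stella",
        "target": "jetta",
        "directed": True,
        "properties": {"relationship": "owned", "bought": "2007-09", "sold": "2013-10"},
    }
]


def months_between(start: str, end: str) -> int:
    """Count whole months between two 'YYYY-MM' strings."""
    start_year, start_month = map(int, start.split("-"))
    end_year, end_month = map(int, end.split("-"))
    return (end_year - start_year) * 12 + (end_month - start_month)


ownership = edges[0]["properties"]
owned_months = months_between(ownership["bought"], ownership["sold"])
print(owned_months)  # 73
```

Because the purchase and sale dates live on the edge rather than on either node, a question about the relationship (how long did Stella own the car?) can be answered from the edge alone.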

An email is a relationship, too, between the sender and the recipient. The properties of the nodes are things like email address, name, and title, and the properties of the relationship are the date/time it was sent, its subject line, and the text of the email.

To prove conspiracy, the FBI was interested in all the emails sent among the Enron executives, not just a single one, so let’s add some more nodes to represent a larger number of emails sent during a specified period of time, as shown in figure 1.3.

Figure 1.3. A graph of some of the Enron executives’ email communications. You can easily see that Timothy Belden is a hub of communication in this segment of Enron, sending and receiving email from many other executives.

Figure 1.3 is a directed graph because it matters whether Kevin Presto sent an email to Timothy Belden or received one—there’s a big difference between sending and receiving information when you’re investigating who knew what when. The arrowheads on the edges show that directionality: Kevin Presto sent an email to Timothy Belden, but Timothy Belden didn’t reply, indicating they may not have been close associates or they may have spoken offline. As we start to add more data to the graph, you can see the value of graphs—patterns become apparent. In this example, we can easily see that Timothy Belden is a hub of communication in this segment of Enron, sending and receiving email from many other executives.
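The hub pattern can also be found by counting. The sketch below assumes a tiny, made-up email list: only the message from Kevin Presto to Timothy Belden comes from the text, and the "Executive A/B/C" entries are hypothetical placeholders rather than real Enron data.

```python
from collections import Counter

# Each directed edge is (sender, recipient).
emails = [
    ("Kevin Presto", "Timothy Belden"),  # mentioned in the text
    ("Timothy Belden", "Executive A"),   # hypothetical
    ("Executive A", "Timothy Belden"),   # hypothetical
    ("Executive B", "Timothy Belden"),   # hypothetical
    ("Timothy Belden", "Executive C"),   # hypothetical
]

# Out-degree: emails sent. In-degree: emails received.
sent = Counter(sender for sender, _ in emails)
received = Counter(recipient for _, recipient in emails)

# The node with the most sent plus received emails is a candidate hub.
total = sent + received
hub, message_count = total.most_common(1)[0]
print(hub, message_count)  # Timothy Belden 5

# Direction matters: check whether Belden ever replied to Presto.
belden_replied = ("Timothy Belden", "Kevin Presto") in emails
print(belden_replied)  # False
```

Counting sent and received messages separately keeps the direction of each edge visible, which is the information an undirected model would lose.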

1.1.2. A bit of theory

Graph theory began early in the eighteenth century with the Seven Bridges of Königsberg problem. In Königsberg, Prussia (now Kaliningrad, Russia), it was a common parlor game to try to determine a route that would cross each of the seven bridges over the Pregel River exactly once. (Go ahead and give it a shot using the map of the city, shown in figure 1.4, and see if you can prove three centuries of mathematicians wrong.)

Figure 1.4. The Seven Bridges of Königsberg problem. Using this map of the bridges of Königsberg, Prussia, try to draw a route that crosses every bridge exactly once.

Leonhard Euler proved this problem unsolvable by abstracting the regions of the city into individual points and the bridges into paths between those points, as you can see in figure 1.5.

Figure 1.5. Seven bridges and four land areas of Königsberg as a graph. In this graph, nodes denote the land masses bordering the Pregel River and the two islands in its middle. Edges represent the bridges connecting the two islands and two shorelines.
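The abstraction can be checked with a short Python sketch. It relies on the standard degree argument associated with Euler's result: a connected graph has a route that uses every edge exactly once only if zero or two of its nodes have an odd number of edges. The node labels below are my own, and the bridge list follows the usual formulation of the puzzle (two bridges from the central island to each bank, one between the islands, and one from the eastern island to each bank).

```python
from collections import Counter

# Seven bridges as edges between four land masses (labels are assumptions).
bridges = [
    ("kneiphof", "north_bank"),
    ("kneiphof", "north_bank"),
    ("kneiphof", "south_bank"),
    ("kneiphof", "south_bank"),
    ("kneiphof", "east_island"),
    ("east_island", "north_bank"),
    ("east_island", "south_bank"),
]

# Degree of a node: the number of bridge ends that touch it.
degree = Counter()
for a, b in bridges:
    degree[a] += 1
    degree[b] += 1

odd_nodes = [node for node, d in degree.items() if d % 2 == 1]

# This graph is connected, so the odd-degree count alone decides the answer.
route_exists = len(odd_nodes) in (0, 2)

print(dict(degree))
print(len(odd_nodes), route_exists)  # 4 False
```

All four land masses touch an odd number of bridges, so no route can cross every bridge exactly once.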

E...
