Graph-Powered Machine Learning
eBook - ePub

Graph-Powered Machine Learning

Alessandro Negro

  1. 496 pages
  2. English
  3. ePUB (available in the app)
  4. Available on iOS and Android

About the book

Upgrade your machine learning models with graph-based algorithms, the perfect structure for complex and interlinked data.

Summary
In Graph-Powered Machine Learning, you will learn:
  • The lifecycle of a machine learning project
  • Graphs in big data platforms
  • Data source modeling using graphs
  • Graph-based natural language processing, recommendations, and fraud detection techniques
  • Graph algorithms
  • Working with Neo4j

Graph-Powered Machine Learning teaches you to use graph-based algorithms and data organization strategies to develop superior machine learning applications. You'll dive into the role of graphs in machine learning and big data platforms, and take an in-depth look at data source modeling, algorithm design, recommendations, and fraud detection. Explore end-to-end projects that illustrate architectures and help you optimize with best design practices. Author Alessandro Negro's extensive experience shines through in every chapter, as you learn from examples and concrete scenarios based on his work with real clients!

Purchase of the print book includes a free eBook in PDF, Kindle, and ePub formats from Manning Publications.

About the technology
Identifying relationships is the foundation of machine learning. By recognizing and analyzing the connections in your data, graph-centric algorithms like K-nearest neighbor or PageRank radically improve the effectiveness of ML applications. Graph-based machine learning techniques offer a powerful new perspective for machine learning in social networking, fraud detection, natural language processing, and recommendation systems.

About the book
Graph-Powered Machine Learning teaches you how to exploit the natural relationships in structured and unstructured datasets using graph-oriented machine learning algorithms and tools. In this authoritative book, you'll master the architectures and design practices of graphs, and avoid common pitfalls. Author Alessandro Negro explores examples from real-world applications that connect GraphML concepts to real-world tasks.

What's inside
  • Graphs in big data platforms
  • Recommendations, natural language processing, fraud detection
  • Graph algorithms
  • Working with the Neo4j graph database

About the reader
For readers comfortable with machine learning basics.

About the author
Alessandro Negro is Chief Scientist at GraphAware. He has been a speaker at many conferences, and holds a PhD in Computer Science.

Table of Contents
PART 1 INTRODUCTION
1 Machine learning and graphs: An introduction
2 Graph data engineering
3 Graphs in machine learning applications
PART 2 RECOMMENDATIONS
4 Content-based recommendations
5 Collaborative filtering
6 Session-based recommendations
7 Context-aware and hybrid recommendations
PART 3 FIGHTING FRAUD
8 Basic approaches to graph-powered fraud detection
9 Proximity-based algorithms
10 Social network analysis against fraud
PART 4 TAMING TEXT WITH GRAPHS
11 Graph-based natural language processing
12 Knowledge graphs

Information

Publisher: Manning
Year: 2021
ISBN: 9781638353935

Part 1 Introduction

We are surrounded by graphs. Facebook, LinkedIn, and Twitter are the most famous examples of social networks—that is, graphs of people. Other types of graphs exist even though we don’t think of them as such: electrical or power networks, the tube, and so on.
Graphs are powerful structures, useful not only for representing connected information but also for supporting multiple types of analysis. Their simple data model, built on just two basic concepts, nodes and relationships, is flexible enough to store complex information. If you also store properties on nodes and relationships, you can represent practically anything, at any size.
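To make the data model concrete, here is a minimal in-memory sketch of nodes, relationships, and properties. The class names (Node, Relationship, PropertyGraph) and the sample data are hypothetical and chosen only for illustration; they are not taken from the book or from any graph database API.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: str
    labels: set = field(default_factory=set)
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    start: str          # id of the source node
    end: str            # id of the target node
    rel_type: str
    properties: dict = field(default_factory=dict)


class PropertyGraph:
    """Hypothetical toy container: nodes by id plus a list of relationships."""

    def __init__(self):
        self.nodes = {}
        self.relationships = []

    def add_node(self, node_id, labels=(), **properties):
        self.nodes[node_id] = Node(node_id, set(labels), properties)

    def add_relationship(self, start, end, rel_type, **properties):
        if start not in self.nodes or end not in self.nodes:
            raise KeyError("both endpoints must exist before they are connected")
        self.relationships.append(Relationship(start, end, rel_type, properties))

    def neighbors(self, node_id, rel_type=None):
        # Follow outgoing relationships, optionally filtered by type.
        return [
            r.end for r in self.relationships
            if r.start == node_id and (rel_type is None or r.rel_type == rel_type)
        ]


graph = PropertyGraph()
graph.add_node("alice", labels=["Person"], name="Alice")
graph.add_node("bob", labels=["Person"], name="Bob")
graph.add_node("acme", labels=["Company"], name="Acme")
graph.add_relationship("alice", "bob", "FRIEND_OF", since=2015)
graph.add_relationship("alice", "acme", "WORKS_AT", role="engineer")
```

Both nodes and relationships carry their own properties here, which is what lets such a small model describe people, companies, and the connections between them in one structure.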
Furthermore, in a graph every node and every relationship is an access point for analysis, and from any access point it is possible to navigate to the rest of the graph without limit, which provides multiple access patterns and many possibilities for analysis.
Machine learning, on the other hand, provides tools and techniques for building representations of reality and making predictions. Recommendation is a good example: the algorithm takes what users have interacted with and predicts what they will be interested in. Fraud detection is another: it takes previous transactions (legitimate or not) and creates a model that can recognize, with good approximation, whether a new transaction is fraudulent.
The performance of machine learning algorithms, in terms of both accuracy and speed, is affected almost directly by the way in which we represent our training data and store our prediction model. The quality of an algorithm’s predictions is only as good as the quality of the training dataset. Data cleansing and feature selection, among other tasks, are mandatory if we want to achieve a reasonable level of trust in the predictions. The speed at which the system provides predictions affects the usability of the entire product. Suppose that a recommendation algorithm for an online retailer produced recommendations in 3 minutes. By that time, the user would be on another page or, worse, on a competitor’s website.
Graphs can support machine learning by doing what they do best: representing data in a way that is easily understandable and easily accessible. Graphs can make the necessary processes faster, more accurate, and much more effective. Moreover, graph algorithms are powerful tools for machine learning practitioners. Graph community detection algorithms can help identify groups of people, PageRank can reveal the most relevant keywords in a text, and so on.
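As a rough illustration of the keyword idea, the sketch below builds a word co-occurrence graph and ranks its nodes with a plain power-iteration PageRank. It is a simplified sketch under stated assumptions, not the book's implementation: the function names, the window size, the damping factor of 0.85, and the sample sentence are all assumptions made for this example.

```python
from collections import defaultdict


def cooccurrence_graph(words, window=2):
    """Link each word to the words that follow it within `window` positions.

    With window=2, only directly adjacent words are connected.
    The graph is undirected: every link is stored in both directions.
    """
    graph = defaultdict(set)
    for i, word in enumerate(words):
        for other in words[i + 1 : i + window]:
            if other != word:
                graph[word].add(other)
                graph[other].add(word)
    return graph


def pagerank(graph, damping=0.85, iterations=50):
    """Power-iteration PageRank for a graph in which every node has a neighbor."""
    nodes = list(graph)
    rank = {node: 1.0 / len(nodes) for node in nodes}
    for _ in range(iterations):
        new_rank = {}
        for node in nodes:
            # Each neighbor shares its current rank equally among its own links.
            incoming = sum(rank[n] / len(graph[n]) for n in graph[node])
            new_rank[node] = (1 - damping) / len(nodes) + damping * incoming
        rank = new_rank
    return rank


words = "graph database stores graph data graph algorithms analyze graph structure".split()
ranks = pagerank(cooccurrence_graph(words))
top_keyword = max(ranks, key=ranks.get)
print(top_keyword)
```

The word that co-occurs with the largest number of other words ends up with the highest score, which is the intuition behind using a ranking algorithm for keyword extraction. A real pipeline would also remove stop words and normalize tokens first.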
If you didn’t fully understand some of the terms and concepts presented in this introduction, the first part of the book will provide you with all the knowledge you need to move on. It introduces the basic concepts related to graphs and machine learning, both as independent subjects and as a powerful pairing. Let me wish you good reading!

1 Machine learning and graphs: An introduction

This chapter covers
  • An introduction to machine learning
  • An introduction to graphs
  • The role of graphs in machine learning applications
Machine learning is a core branch of artificial intelligence: it is the field of study in computer science that allows computer programs to learn from data. The term was coined in 1959, when Arthur Samuel, an IBM computer scientist, wrote the first computer program to play checkers [Samuel, 1959]. He had a clear idea in mind:
Programming computers to learn from experience should eventually eliminate the need for much of this detailed programming effort.
Samuel wrote his initial program by assigning a score to each board position based on a fixed formula. This program worked quite well, but in a second approach, he had the program execute thousands of games against itself and used the results to refine the board scoring. Eventually, the program reached the proficiency of a human player, and machine learning took its first steps.
An entity (such as a person, an animal, an algorithm, or a generic computer agent) is learning if, after making observations about the world, it is able to improve its performance on future tasks. In other words, learning is the process of converting experience to expertise or knowledge [Shalev-Shwartz and Ben-David, 2014]. Learning algorithms use training data that represents experience as input and create expertise as output. That output can be a computer program, a complex predictive model, or tuning of internal variables. The definition of performance depends on the specific algorithm or goal to be achieved; in general, we consider it to be the extent to which the prediction matches specific needs.
Let’s describe the learning process with an example. Consider the implementation of a spam filter for emails. A pure programming solution would be to write a program to memorize all the emails labeled as spam by a human user. When a new email arrives, the pseudoagent will search for a similar match in the previous spam emails, and if it finds any matches, the new email will be rerouted to the trash folder. Otherwise, the email will pass through the filter untouched.
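The memorizing approach can be sketched in a few lines. This is an illustrative sketch only: the class name MemorizingFilter is hypothetical, and "similar match" is simplified here to an exact match after lowercasing and collapsing whitespace, which is an assumption rather than something the text specifies.

```python
class MemorizingFilter:
    """Hypothetical filter that only remembers emails a user has flagged."""

    def __init__(self):
        self.known_spam = set()

    @staticmethod
    def _normalize(text):
        # Assumption: two emails "match" if they are equal after
        # lowercasing and collapsing runs of whitespace.
        return " ".join(text.lower().split())

    def mark_as_spam(self, email_text):
        self.known_spam.add(self._normalize(email_text))

    def is_spam(self, email_text):
        return self._normalize(email_text) in self.known_spam


spam_filter = MemorizingFilter()
spam_filter.mark_as_spam("WIN a FREE prize now")

seen_again = spam_filter.is_spam("win a free   prize now")
reworded = spam_filter.is_spam("Claim your free prize today")
```

The filter catches the email it has already seen but lets the reworded one through, because it stores examples without extracting anything from them.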
This approach could work and, in some scenarios, be useful. Yet it is not a learning process because it lacks an important aspect of learning: the ability to generalize, to transform the individual examples into a broader model. In this specific use case, it means the ability to label unseen emails even though they are dissimilar to previously labeled emails. This process is also referred to as inductive reasoning or inductive inference. To generalize, the algorithm should scan the training data and extract a set of words whose appearance in an email message is indicative of spam. Then, for a new email, the agent would check whether one or more of the suspicious words appear and predict its label accordingly.
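A generalizing version might look like the sketch below. The selection rule is an assumption made for illustration (a word counts as suspicious if it appears in at least two spam emails and in no legitimate one), as are the function names and the tiny training set; the text does not prescribe a specific rule.

```python
from collections import Counter


def tokenize(text):
    # Unique lowercase words; a deliberately naive tokenizer.
    return set(text.lower().split())


def learn_suspicious_words(labeled_emails, min_spam_count=2):
    """Scan (text, is_spam) pairs and return words indicative of spam.

    Assumed rule: seen in at least `min_spam_count` spam emails
    and never in a legitimate email.
    """
    spam_counts, ham_counts = Counter(), Counter()
    for text, is_spam in labeled_emails:
        (spam_counts if is_spam else ham_counts).update(tokenize(text))
    return {
        word for word, count in spam_counts.items()
        if count >= min_spam_count and ham_counts[word] == 0
    }


def predict_spam(text, suspicious_words):
    # Label an email as spam if it contains any suspicious word.
    return bool(tokenize(text) & suspicious_words)


training = [
    ("win a free prize now", True),
    ("free prize waiting for you", True),
    ("meeting notes for you", False),
    ("lunch now or later", False),
]
suspicious = learn_suspicious_words(training)

unseen_spam = predict_spam("claim your prize today", suspicious)
unseen_ham = predict_spam("notes from lunch", suspicious)
```

Unlike the memorizing filter, this one labels an email it has never seen, because what it keeps from the training data is a small model (a set of words) rather than the examples themselves.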
If you are an experienced developer, you might be wondering, “Why should I write a program that learns how to program itself, when I can instruct the computer to carry out the task at hand?” Taking the example of the spam filter, it is possible to write a program that checks for the occurrence of some words and classifies an email as spam if those words are present. But this approach has three primary disadvantages:
  • A developer cannot anticipate all possible situations. In the spam-filter use case, all the words that might be used in a spam email cannot be predicted up front.
  • A developer cannot anticipate all changes over time. In spam emails, new words can be used, or techniques can be adopted to avoid easy recognition, such as adding hyphens or spaces between characters.
  • Sometimes, a developer cannot write a program to accomplish the task. Even though recognizing the face of a friend is a simple task for a human, for example, it is practically impossible to write explicit rules that accomplish this task without the use of machine learning.
Therefore, when you face new problems or tasks that you would like to solve with a computer program, the following questions can help you decide whether to use machine learning:
  • Is the specific task too complex to be programmed?
  • Does the task require any sort of adaptivity throughout its life?
A crucial aspect of any machine learning task is the training data on which the knowledge is built. Starting from the wrong data leads to the wrong results, regardless of the potential performance or the quality of the learning algorithm used.
The aim of this book is to help data scientists and data engineers approach the machine learning process from two sides: the learning algorithm and the data. In both perspectives, we will use the graph (let me introduce it now as a set of nodes and relationships connec...
