Machine Learning in Action
Peter Harrington
Machine Learning in Action is a unique book that blends the foundational theories of machine learning with the practical realities of building tools for everyday data analysis. You'll use the flexible Python programming language to build programs that implement algorithms for data classification, forecasting, recommendations, and higher-level features like summarization and simplification.
Part 1. Classification
The first two parts of this book are on supervised learning. Supervised learning asks the machine to learn from our data when we specify a target variable. This reduces the machine’s task to only divining some pattern from the input data to get the target variable.
We address two cases of the target variable. The first case occurs when the target variable can take only nominal values: true or false; reptile, fish, mammal, amphibian, plant, fungi. This case is called classification. The second case occurs when the target variable can take an infinite number of numeric values, such as 0.100, 42.001, 1000.743, and so on. This case is called regression. We’ll study regression in part 2 of this book. The first part of this book focuses on classification.
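The distinction between the two kinds of target variable can be sketched in a few lines of Python. This is only an illustrative heuristic, not something from the book: the function name `task_type` and the sample lists are hypothetical, and real datasets often encode nominal classes as integers, which this sketch would misread as numeric targets.

```python
def task_type(targets):
    """Guess the supervised task from the target values (illustrative heuristic)."""
    # Nominal targets: labels such as booleans or strings from a finite set.
    if all(isinstance(t, (bool, str)) for t in targets):
        return "classification"
    # Numeric targets that can take any value on a continuous scale.
    # bool is a subclass of int in Python, so it is excluded explicitly.
    if all(isinstance(t, (int, float)) and not isinstance(t, bool) for t in targets):
        return "regression"
    raise ValueError("mixed or unsupported target types")


# Hypothetical target columns mirroring the examples in the text.
species = ["reptile", "fish", "mammal", "fish"]
measurements = [0.100, 42.001, 1000.743]

print(task_type(species))
print(task_type(measurements))
```

The point of the sketch is only that the type of the target variable, not the input data, decides which family of algorithms applies.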
Our study of classification algorithms covers the first seven chapters of this book. Chapter 2 introduces one of the simplest classification algorithms called k-Nearest Neighbors, which uses a distance metric to classify items. Chapter 3 introduces an intuitive yet slightly harder to implement algorithm: decision trees. In chapter 4 we address how we can use probability theory to build a classifier. Next, chapter 5 looks at logistic regression, where we find the best parameters to properly classify our data. In the process of finding these best parameters, we encounter some powerful optimization algorithms. Chapter 6 introduces the powerful support vector machines. Finally, in chapter 7 we see a meta-algorithm, AdaBoost, which is a classifier made up of a collection of classifiers. Chapter 7 concludes part 1 on classification with a section on classification imbalance, which is a real-world problem where you have more data from one class than other classes.
Chapter 1. Machine learning basics
This chapter covers
- A brief overview of machine learning
- Key tasks in machine learning
- Why you need to learn about machine learning
- Why Python is so great for machine learning
I was eating dinner with a couple when they asked what I was working on recently. I replied, “Machine learning.” The wife turned to the husband and said, “Honey, what’s machine learning?” The husband replied, “Cyberdyne Systems T-800.” If you aren’t familiar with the Terminator movies, the T-800 is artificial intelligence gone very wrong. My friend was a little bit off. We’re not going to attempt to have conversations with computer programs in this book, nor are we going to ask a computer the meaning of life. With machine learning we can gain insight from a dataset; we’re going to ask the computer to make some sense from data. This is what we mean by learning, not cyborg rote memorization, and not the creation of sentient beings.
Machine learning is actively being used today, perhaps in many more places than you’d expect. Here’s a hypothetical day and the many times you’ll encounter machine learning: You realize it’s your friend’s birthday and want to send her a card via snail mail. You search for funny cards, and the search engine shows you the 10 most relevant links. You click the second link; the search engine learns from this. Next, you check some email, and without your noticing it, the spam filter catches unsolicited ads for pharmaceuticals and places them in the Spam folder. Next, you head to the store to buy the birthday card. When you’re shopping for the card, you pick up some diapers for your friend’s child. When you get to the checkout and purchase the items, the human operating the cash register hands you a coupon for $1 off a six-pack of beer. The cash register’s software generated this coupon for you because people who buy diapers also tend to buy beer. You send the birthday card to your friend, and a machine at the post office recognizes your handwriting to direct the mail to the proper delivery truck. Next, you go to the loan agent and ask whether you are eligible for a loan; they don’t answer but plug some financial information about you into the computer, and a decision is made. Finally, you head to the casino for some late-night entertainment, and as you walk in the door, the person walking in behind you gets approached by security seemingly out of nowhere. They tell him, “Sorry, Mr. Thorp, we’re going to have to ask you to leave the casino. Card counters aren’t welcome here.” Figure 1.1 illustrates where some of these applications are being used.
Figure 1.1. Examples of machine learning in action today, clockwise from top left: face recognition, handwriting digit recognition, spam filtering in email, and product recommendations from Amazon.com
In all of the previously mentioned scenarios, machine learning was present. Companies are using it to improve business decisions, increase productivity, detect disease, forecast weather, and do many more things. With the exponential growth of technology, we not only need better tools to understand the data we currently have, but we also need to prepare ourselves for the data we will have.
Are you ready for machine learning? In this chapter you’ll find out what machine learning is, where it’s already being used around you, and how it might help you in the future. Next, we’ll talk about some common approaches to solving problems with machine learning. Last, you’ll find out why Python is a great language for machine learning. Then we’ll go through a really quick example using a Python module called NumPy, which allows you to abstract matrix calculations.
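As a rough preview of what abstracting matrix calculations looks like, here is a minimal sketch, assuming NumPy is installed. It is not the book's own example; the matrix values are arbitrary and chosen only so that the matrix is invertible.

```python
import numpy as np

# An arbitrary 2x2 matrix (determinant -2, so it is invertible).
a = np.array([[1.0, 2.0],
              [3.0, 4.0]])

# Matrix product with the identity leaves the matrix unchanged.
product = a @ np.eye(2)

# Invert the matrix, then multiply back to recover the identity
# (up to floating-point rounding, hence allclose rather than ==).
inverse = np.linalg.inv(a)
check = a @ inverse

print(np.allclose(product, a))
print(np.allclose(check, np.eye(2)))
```

Operations such as the product and the inverse are written as single expressions, with no explicit loops over rows and columns.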
1.1. What is machine learning?
In all but the most trivial cases, the insight or knowledge you’re trying to get out of the raw data won’t be obvious from looking at the data. For example, in detecting spam email, looking for the occurrence of a single word may not be very helpful. But looking at the occurrence of certain words used together, combined with the length of the email and other factors, you could get a much clearer picture of whether the email is spam or not. Machine learning is turning data into information.
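The spam example can be made concrete with a small sketch that turns raw email text into a few features: message length, word count, and whether several suspicious words occur together. Everything here is an assumption for illustration: the word list `SUSPICIOUS_WORDS`, the function `email_features`, and the sample messages are hypothetical, and a real filter would learn which words matter from labeled email instead of using a fixed list.

```python
import re

# Hypothetical word list; a real filter would learn these from labeled data.
SUSPICIOUS_WORDS = {"free", "winner", "pharmacy", "discount"}


def email_features(text):
    """Turn raw email text into a small dictionary of features."""
    # Lowercase the text and split it into alphabetic words.
    words = re.findall(r"[a-z']+", text.lower())
    hits = SUSPICIOUS_WORDS.intersection(words)
    return {
        "length": len(text),
        "word_count": len(words),
        "suspicious_hits": len(hits),
        # Co-occurrence: two or more distinct suspicious words together.
        "suspicious_pair": len(hits) >= 2,
    }


spam = email_features("WINNER! Claim your free pharmacy discount now")
ham = email_features("Lunch is free on Friday if you join the team meeting")

print(spam)
print(ham)
```

Both messages contain the word "free", so a single-word test cannot separate them; the combined features can. A learning algorithm would then decide how to weigh such features instead of relying on a hand-written rule.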
Machine learning lies at the intersection of computer science, engineering, and statistics and often appears in other disciplines. As you’ll see later, it can be applied to many fields from politics to geosciences. It’s a tool that can be applied to many problems. Any field that needs to interpret and act on data can benefit from machine learning techniques.
Machine learning uses statistics. To most people, statistics is an esoteric subject used by companies to lie about how great their products are. (There’s a great manual on how to do this called How to Lie with Statistics by Darrell Huff. Ironically, this is the best-selling statistics book of all time.) So why do the rest of us need statistics? The practice of engineering is applying science to solve a problem. In engineering we’re used to solving a deterministic problem where our solution solves the problem all the time. If we’re asked to write software to control a vending machine, it had better work all the time, regardless of the money entered or the buttons pressed. There are many problems where the solution isn’t deterministic. That is, we don’t know enough about the problem or don’t have enough computing power to properly model the problem. For these problems we need statistics. For example, the motivation of humans is a problem that is currently too difficult to model.
In the social sciences, being right 60% of the time is considered successful. If we can predict the way people will behave 60% of the time, we’re doing well. How can this be? Shouldn’t we be right all the time? If we’re not right all the time, doesn’t that mean we’re doing something wrong?
Let me give you an example to illustrate the problem of not being able to model the problem fully. Do humans not act to maximize their own happiness? Can’t we just predict the outcome of events involving humans based on this assumption? Perhaps, but it’s difficult to define what makes everyone happy, because this may differ greatly from one person to the next. So even if our assumptions are correct about people maximizing their own happiness, the definition of happiness is too complex to model. There are many other examples outside human behavior that we can’t currently model deterministically. For these problems we need to use some tools from statistics.
1.1.1. Sensors and the data deluge
We have a tremendous amount of human-created data from the World Wide Web, but recently more nonhuman sources of data have been coming online. The technology behind the sensors isn’t new, but connecting them to the web is new. It’s estimated that shortly after this book’s publication physical sensors will create 20 percent of non-video internet traffic.[1]
[1] http://www.gartner.com/it/page.jsp?id=876512, retrieved 7/29/2010 4:36 a.m.
The following is an example of an abundance of free data, a worthy cause, and the need to sort through the data. In 1989, the Loma Prieta earthquake struck northern California, killing 63 people, injuring 3,757, and leaving thousands homeless. A similarly sized earthquake struck Haiti in 2010, killing more than 230,000 people. Shortly after the Loma Prieta earthquake, a study was published using low-frequency magnetic field measurements c...