Machine Learning

An Algorithmic Perspective, Second Edition

Stephen Marsland
About This Book

A Proven, Hands-On Approach for Students without a Strong Statistical Foundation. Since the best-selling first edition was published, there have been several prominent developments in the field of machine learning, including the increasing work on the statistical interpretations of machine learning algorithms. Unfortunately, computer science students…


Information

Year: 2014
ISBN: 9781498759786

CHAPTER 1

Introduction

Suppose that you have a website selling software that you’ve written. You want to make the website more personalised to the user, so you start to collect data about visitors, such as their computer type/operating system, web browser, the country that they live in, and the time of day they visited the website. You can get this data for any visitor, and for people who actually buy something, you know what they bought, and how they paid for it (say PayPal or a credit card). So, for each person who buys something from your website, you have a list of data that looks like (computer type, web browser, country, time, software bought, how paid). For instance, the first three pieces of data you collect could be:
  • Macintosh OS X, Safari, UK, morning, SuperGame1, credit card
  • Windows XP, Internet Explorer, USA, afternoon, SuperGame1, PayPal
  • Windows Vista, Firefox, NZ, evening, SuperGame2, PayPal
Based on this data, you would like to be able to populate a ‘Things You Might Be Interested In’ box within the webpage, so that it shows software that might be relevant to each visitor, based on the data that you can access while the webpage loads, i.e., computer and OS, country, and the time of day. Your hope is that as more people visit your website and you store more data, you will be able to identify trends, such as that Macintosh users from New Zealand (NZ) love your first game, while Firefox users, who are often more knowledgeable about computers, want your automatic download application and virus/internet worm detector, etc.
Once you have collected a large set of such data, you start to examine it and work out what you can do with it. The problem you have is one of prediction: given the data you have, predict what the next person will buy, and the reason that you think that it might work is that people who seem to be similar often act similarly. So how can you actually go about solving the problem? This is one of the fundamental problems that this book tries to solve. It is an example of what is called supervised learning, because we know what the right answers are for some examples (the software that was actually bought) so we can give the learner some examples where we know the right answer. We will talk about supervised learning more in Section 1.3.
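The idea that "people who seem to be similar often act similarly" can be sketched directly in code. The following is a minimal, hypothetical illustration (not from the book): it stores the three example records above and predicts a purchase for a new visitor by counting how many features match each stored example, a crude nearest-neighbour vote. The data values and function names are assumptions for illustration only.

```python
from collections import Counter

# Hypothetical training data mirroring the examples in the text:
# (computer/OS, browser, country, time of day) -> software bought.
purchases = [
    (("Macintosh OS X", "Safari", "UK", "morning"), "SuperGame1"),
    (("Windows XP", "Internet Explorer", "USA", "afternoon"), "SuperGame1"),
    (("Windows Vista", "Firefox", "NZ", "evening"), "SuperGame2"),
]

def predict(visitor):
    """Predict a purchase by counting matching features against each
    stored example and voting among the closest matches."""
    best_score = -1
    votes = Counter()
    for features, bought in purchases:
        score = sum(a == b for a, b in zip(features, visitor))
        if score > best_score:
            best_score = score
            votes = Counter([bought])
        elif score == best_score:
            votes[bought] += 1
    return votes.most_common(1)[0][0]

print(predict(("Windows Vista", "Firefox", "NZ", "evening")))  # prints SuperGame2
```

Real systems replace the raw feature-match count with a proper distance or probability model, but the shape of the problem, examples in, prediction out, is exactly the supervised learning setting described above.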

1.1 If Data Had Mass, The Earth Would Be a Black Hole

Around the world, computers capture and store terabytes of data every day. Even leaving aside your collection of MP3s and holiday photographs, there are computers belonging to shops, banks, hospitals, scientific laboratories, and many more that are storing data incessantly. For example, banks are building up pictures of how people spend their money, hospitals are recording what treatments patients are on for which ailments (and how they respond to them), and engine monitoring systems in cars are recording information about the engine in order to detect when it might fail. The challenge is to do something useful with this data: if the bank’s computers can learn about spending patterns, can they detect credit card fraud quickly? If hospitals share data, then can treatments that don’t work as well as expected be identified quickly? Can an intelligent car give you early warning of problems so that you don’t end up stranded in the worst part of town? These are some of the questions that machine learning methods can be used to answer.
Science has also taken advantage of the ability of computers to store massive amounts of data. Biology has led the way, with the ability to measure gene expression in DNA microarrays producing immense datasets, along with protein transcription data and phylogenetic trees relating species to each other. However, other sciences have not been slow to follow. Astronomy now uses digital telescopes, so that each night the world’s observatories are storing incredibly high-resolution images of the night sky; around a terabyte per night. Equally, medical science stores the outcomes of medical tests from measurements as diverse as magnetic resonance imaging (MRI) scans and simple blood tests. The explosion in stored data is well known; the challenge is to do something useful with that data. The Large Hadron Collider at CERN apparently produces about 25 petabytes of data per year.
The size and complexity of these datasets mean that humans are unable to extract useful information from them. Even the way that the data is stored works against us. Given a file full of numbers, our minds generally turn away from looking at them for long. Take some of the same data and plot it in a graph and we can do something. Compare the table and graph shown in Figure 1.1: the graph is rather easier to look at and deal with. Unfortunately, our three-dimensional world doesn’t let us do much with data in higher dimensions, and even the simple webpage data that we collected above has four different features, so if we plotted it with one dimension for each feature we’d need four dimensions! There are two things that we can do with this: reduce the number of dimensions (until our simple brains can deal with the problem) or use computers, which don’t know that high-dimensional problems are difficult, and don’t get bored with looking at massive data files of numbers. The two pictures in Figure 1.2 demonstrate one problem with reducing the number of dimensions (more technically, projecting it into fewer dimensions), which is that it can hide useful information and make things look rather strange. This is one reason why machine learning is becoming so popular — the problems of our human limitations go away if we can make computers do the dirty work for us. There is one other thing that can help if the number of dimensions is not too much larger than three, which is to use glyphs that use other representations, such as size or colour of the datapoints to represent information about some other dimension, but this does not help if the dataset has 100 dimensions in it.
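Projecting high-dimensional data into fewer dimensions, as described above, is something we will meet formally later in the book; as a quick sketch (an assumption-laden illustration, not the book's code), the snippet below uses principal component analysis via the singular value decomposition to project 4-dimensional datapoints down to two dimensions so they could be scattered on a graph like Figure 1.1.

```python
import numpy as np

# A sketch of projecting 4-dimensional data down to 2 dimensions with PCA.
rng = np.random.default_rng(0)
data = rng.normal(size=(100, 4))          # 100 datapoints, 4 features

centred = data - data.mean(axis=0)        # PCA assumes zero-mean data
# Rows of Vt are the principal directions, ordered by variance explained
_, _, Vt = np.linalg.svd(centred, full_matrices=False)
projected = centred @ Vt[:2].T            # keep the two strongest directions

print(projected.shape)                    # (100, 2): ready for a 2-D scatter plot
```

The projection keeps the directions along which the data varies most, which is exactly why it can also hide information: any structure living in the discarded directions vanishes, just as one wind turbine can hide behind another in a photograph.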
Image
FIGURE 1.1 A set of datapoints as numerical values and as points plotted on a graph. It is easier for us to visualise data than to see it in a table, but if the data has more than three dimensions, we can’t view it all at once.
Image
FIGURE 1.2 Two views of the same two wind turbines (Te Apiti wind farm, Ashhurst, New Zealand) taken at an angle of about 30° to each other. The two-dimensional projection of three-dimensional objects hides information.
In fact, you have probably interacted with machine learning algorithms at some time. They are used in many of the software programs that we use, such as Microsoft’s infamous paperclip in Office (maybe not the most positive example), spam filters, voice recognition software, and lots of computer games. They are also part of automatic number-plate recognition systems for petrol station security cameras and toll roads, are used in some anti-skid braking and vehicle stability systems, and they are even part of the set of algorithms that decide whether a bank will give you a loan.
The attention-grabbing title to this section would only be true if data was very heavy. It is very hard to work out how much data there actually is in all of the world’s computers, but it was estimated in 2012 that there were about 2.8 zettabytes (2.8 × 10^21 bytes), up from about 160 exabytes (160 × 10^18 bytes) of data created and stored in 2006, and the total was projected to reach 40 zettabytes by 2020. However, to make a black hole the size of the earth would take a mass of about 40 × 10^35 grams. So data would have to be so heavy that you couldn’t possibly lift a data pen, let alone a computer, before the section title were true! However, and more interestingly for machine learning, the same report that estimated the figure of 2.8 zettabytes (‘Big Data, Bigger Digital Shadows, and Biggest Growth in the Far East’ by John Gantz and David Reinsel, sponsored by EMC Corporation) also reported that while a quarter of this data could produce useful information, only around 3% of it was tagged, and less than 0.5% of it was actually used for analysis!
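The black-hole figure can be checked with back-of-the-envelope physics (this calculation is mine, not the book's): a black hole whose Schwarzschild radius equals the Earth's radius needs a mass of M = R c² / (2G).

```python
# Back-of-the-envelope check of the black-hole claim:
# a black hole with the Earth's radius needs mass M = R * c^2 / (2 * G).
G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2
c = 2.998e8          # speed of light, m/s
R_earth = 6.371e6    # mean Earth radius, m

M = R_earth * c**2 / (2 * G)              # mass in kilograms
print(f"{M:.1e} kg = {M * 1000:.1e} g")   # roughly 4e33 kg, i.e. ~40e35 grams
```

So even zettabytes of data would need each byte to weigh an absurd amount before the section title held literally.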

1.2 Learning

Before we delve too much further into the topic, let’s step back and think about what learning actually is. The key concept that we will need to think about for our machines is learning from data, since data is what we have; terabytes of it, in some cases. However, it isn’t too large a step to put that into human behavioural terms, and talk about learning from experience. Hopefully, we all agree that humans and other animals can display behaviours that we label as intelligent by learning from experience. Learning is what gives us flexibility in our life; the fact that we can adjust and adapt to new circumstances, and learn new tricks, no matter how old a dog we are! The important parts of animal learning for this book are remembering, adapting, and generalising: recognising that last time we were in this situation (saw this data) we tried out some particular action (gave this output) and it worked (was correct), so we’ll try it again, or it didn’t work, so we’ll try something different. The last word, generalising, is about recognising similarity between different situations, so that things that applied in one place can be used in another. This is what makes learning useful, because we can use our knowledge in lots of different places.
Of course, there are plenty of other bits to intelligence, such as reasoning, and logical deduction, but we won’t worry too much about those. We are interested in the most fundamental parts of intelligence—learning and adapting—and how we can model them in a computer. There has also been a lot of interest in making computers reason and deduce facts. This was the basis of most early Artificial Intelligence, and is sometimes known as symbolic processing because the computer manipulates symbols that reflect the environment. In contrast, machine learning methods are sometimes called subsymbolic because no symbols or symbolic manipulation are involved.

1.2.1 Machine Learning

Machine learning, then, is about making computers modify or adapt their actions (whether these actions are making predictions, or controlling a robot) so that these actions get more accurate, where accuracy is measured by how well the chosen actions reflect the correct ones. Imagin...
