Machine Learning
eBook - ePub

Machine Learning

a Concise Introduction

  1. English
  2. ePUB (mobile friendly)
  3. Available on iOS & Android

About this book

AN INTRODUCTION TO MACHINE LEARNING THAT INCLUDES THE FUNDAMENTAL TECHNIQUES, METHODS, AND APPLICATIONS

PROSE Award Finalist 2019
Association of American Publishers Award for Professional and Scholarly Excellence

Machine Learning: a Concise Introduction offers a comprehensive introduction to the core concepts, approaches, and applications of machine learning. The author, an expert in the field, presents fundamental ideas, terminology, and techniques for solving applied problems in classification, regression, clustering, density estimation, and dimension reduction. The design principles behind the techniques are emphasized, including the bias-variance trade-off and its influence on the design of ensemble methods. Understanding these principles leads to more flexible and successful applications. Machine Learning: a Concise Introduction also includes methods for optimization, risk estimation, and model selection, which are essential elements of most applied projects. This important resource:

  • Illustrates many classification methods with a single, running example, highlighting similarities and differences between methods
  • Presents R source code which shows how to apply and interpret many of the techniques covered
  • Includes many thoughtful exercises as an integral part of the text, with an appendix of selected solutions
  • Contains useful information for effectively communicating with clients

A volume in the popular Wiley Series in Probability and Statistics, Machine Learning: a Concise Introduction offers the practical information needed for an understanding of the methods and application of machine learning.

STEVEN W. KNOX holds a Ph.D. in Mathematics from the University of Illinois and an M.S. in Statistics from Carnegie Mellon University. He has over twenty years' experience in using Machine Learning, Statistics, and Mathematics to solve real-world problems. He currently serves as Technical Director of Mathematics Research and Senior Advocate for Data Science at the National Security Agency.

Information

Publisher
Wiley
Year
2018
Print ISBN
9781119439196
eBook ISBN
9781119438984

1
Introduction—Examples from Real Life

To call in a statistician after the experiment is done may be no more than asking him to perform a postmortem examination: he may be able to say what the experiment died of.
—R. A. Fisher, Presidential Address, 1938
The following examples will be used to illustrate the ideas of the next chapter.
Problem 1 (“Shuttle”). The space shuttle is set to launch. For every previous launch, the air temperature is known and the number of O-rings on the solid rocket boosters which were damaged is known (there are six O-rings, and O-ring damage is a potentially catastrophic event). Based on the current air temperature, estimate the probability that at least one O-ring on a solid rocket booster will be damaged if the shuttle launches now.
This is a regression problem. Poor analysis, and poor communication of some good analysis (Tufte, 2001), resulted in the loss of the shuttle Challenger and its crew on January 28, 1986.
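One way to make Problem 1 concrete, offered as a minimal sketch and not as the analysis the book develops: assume each of the six O-rings is damaged independently with probability p(t) = 1 / (1 + exp(-(a + b t))) at air temperature t, so that the probability of at least one damaged O-ring is 1 - (1 - p(t))^6. The Python below fits a and b by Newton's method on the binomial log-likelihood. The launch records are invented for illustration and are not the historical data; the function names and the independence assumption belong to this sketch only.

```python
import math

N_RINGS = 6  # from the problem statement: six O-rings


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_binomial_logistic(temps, damaged, n=N_RINGS, iters=25):
    """Fit p(t) = sigmoid(a + b*t) by Newton's method, maximizing the
    binomial log-likelihood of damaged[i] out of n rings at temps[i]."""
    mean_t = sum(temps) / len(temps)
    xs = [t - mean_t for t in temps]  # center temperatures for stability
    a, b = 0.0, 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, damaged):
            p = sigmoid(a + b * x)
            r = y - n * p          # contribution to the gradient
            w = n * p * (1.0 - p)  # contribution to the (negated) Hessian
            g0 += r
            g1 += r * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system H * delta = g
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a - b * mean_t, b  # intercept on the original temperature scale


def prob_any_damage(a, b, temp, n=N_RINGS):
    """P(at least one of n rings damaged), assuming independent rings."""
    p = sigmoid(a + b * temp)
    return 1.0 - (1.0 - p) ** n


# Invented records for illustration only (not the historical launch data).
temps_f = [50, 55, 60, 65, 70, 75, 80]
damaged = [3, 2, 1, 1, 0, 1, 0]
a, b = fit_binomial_logistic(temps_f, damaged)
print(f"P(at least one damaged O-ring at 45 F) = {prob_any_damage(a, b, 45):.2f}")
```

Note that asking for the probability at a temperature colder than any in the records is an extrapolation, which is one reason the communication of such an estimate matters as much as the fit itself.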
Problem 2 (“Ballot”). Immediately after the 2000 US presidential election, some voters in Palm Beach County, Florida, claimed that a confusing ballot form caused them to vote for Pat Buchanan, the Reform Party candidate, when they thought they were voting for Al Gore, the Democratic Party candidate. Based on county-by-county demographic information (number of registered members of each political party, number of people with annual income in a certain range, number of people with a certain level of education, etc.) and county-by-county vote counts from the 1996 presidential election, estimate how many people in Palm Beach County voted for Buchanan but thought they were voting for Gore.
This regression problem was studied a great deal in 2000 and 2001, as the outcome of the vote in Palm Beach County could have decided the election.
Problem 3 (“Heart”). A patient who is suffering from acute chest pain has entered a hospital, where several numerical variables (for example, systolic blood pressure, age) and several binary variables (for example, whether tachycardia is present or not) are measured. Identify the patient as “high risk” (probably will die within 30 days) or “low risk” (probably will live 30 days).
This is a classification problem.
Problem 4 (“Postal Code”). An optical scanner has scanned a hand-written ZIP code on a piece of mail. It has approximately separated the digits, and each digit is represented as an 8 × 8 array of pixels, each of which has one of 256 gray-scale values, 0 (white), ..., 255 (black). Identify each pixel array as one of the digits 0 through 9.
This is a classification problem which affects all of us (though not so much now as formerly).
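To show what the data in Problem 4 look like to a classifier, the sketch below flattens an 8 × 8 gray-scale array into a 64-element feature vector and labels it with a one-nearest-neighbor rule. This is only one possible technique, chosen here for brevity; the excerpt does not say which method is used for this problem. The two "training digits" are crude hand-drawn toy patterns, and all function names are hypothetical.

```python
def flatten(image):
    """Turn an 8x8 list of rows into a 64-element feature vector."""
    assert len(image) == 8 and all(len(row) == 8 for row in image)
    return [v for row in image for v in row]


def squared_distance(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))


def classify_1nn(image, training):
    """Return the label of the training image closest to `image`."""
    x = flatten(image)
    label, _ = min(
        ((lab, squared_distance(x, flatten(img))) for img, lab in training),
        key=lambda pair: pair[1],
    )
    return label


def blank():
    return [[0] * 8 for _ in range(8)]  # 0 = white


def toy_one():
    """Toy pattern: a vertical black bar."""
    img = blank()
    for r in range(8):
        img[r][4] = 255  # 255 = black
    return img


def toy_zero():
    """Toy pattern: a rectangular ring."""
    img = blank()
    for r in range(1, 7):
        img[r][2] = img[r][5] = 255
    for c in range(2, 6):
        img[1][c] = img[6][c] = 255
    return img


training = [(toy_zero(), 0), (toy_one(), 1)]
query = toy_one()
query[0][4] = 200  # a slightly faded pixel, standing in for scanner noise
print(classify_1nn(query, training))
```

A realistic training set would contain many scanned examples of every digit 0 through 9; the structure of the computation stays the same.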
Problem 5 (“Spam”). Identify email as “spam” or “not spam,” based only on the subject line. Or based on the full header. Or based on the content of the email.
This is probably the best known and most studied classification problem of all, solutions to which are applied many billions of times per day.1
Problem 6 (“Vault”). Some neolithic tribes built dome-shaped stone burial vaults. Given the location and several internal measurements of some burial vaults, estimate how many distinct vault-building cultures there have been, say which vaults were built by which culture and, for each culture, give the dimensions of a vault which represents that culture’s ideal vault shape (or name the actual vault which best realizes each culture’s ideal).
This is a clustering problem.
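Two of the three tasks in Problem 6 can be illustrated with a small k-means sketch, assuming the number of cultures k is already given: the cluster assignment says which vaults were built by which culture, each cluster center plays the role of that culture's ideal vault, and the vault nearest a center is the actual vault that best realizes the ideal. Estimating k itself is not addressed here. The measurements are invented, the choice of k-means is an assumption of this sketch and not necessarily the book's method, and the function names are hypothetical.

```python
def sq_dist(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))


def kmeans(points, k, iters=50):
    """Plain k-means with a deterministic start (the first k points)."""
    centers = [list(p) for p in points[:k]]
    assign = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        assign = [
            min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points
        ]
        # Update step: each center becomes the mean of its members.
        for j in range(k):
            members = [p for p, a in zip(points, assign) if a == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return assign, centers


def closest_index(points, center):
    """Index of the actual point nearest to a cluster center."""
    return min(range(len(points)), key=lambda i: sq_dist(points[i], center))


# Invented (height, diameter) measurements in metres, for illustration only.
vaults = [(2.0, 3.0), (2.1, 3.2), (1.9, 2.9), (3.5, 5.0), (3.6, 5.2), (3.4, 4.9)]
assign, centers = kmeans(vaults, k=2)
for j, center in enumerate(centers):
    best = closest_index(vaults, center)
    print(f"culture {j}: ideal shape {center}, best realized by vault {best}")
```

Because k-means depends on its starting centers, a practical analysis would try several starts and compare the results.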

Notes

1 In 2013, approximately 182.9 billion emails were sent per day, on average, worldwide (Radicati and Levenstein, 2013).

2
The Problem of Learning

Far better an approximate answer to the right question, which is often vague, than an exact answer to the wrong question, which can always be made precise.
—John Tukey, The Future of Data Analysis, 1962
This book treats The Problem of Learning, which can be stated generally and succinctly as follows.
The Problem of Learning. There are a known set $\mathcal{X}$ and an unknown function $f$ on $\mathcal{X}$. Given data, construct a good approximation $\hat{f}$ of $f$. This is called learning $f$.
The problem of learning has been studied in many guises and in different fields, such as statistics, computer science, mathematics, and the natural and social...

Table of contents

  1. Cover
  2. Title page
  3. Copyright
  4. Preface
  5. Organization—How to Use This Book
  6. Acknowledgments
  7. About the Companion Website
  8. Chapter 1: Introduction—Examples from Real Life
  9. Chapter 2: The Problem of Learning
  10. Chapter 3: Regression
  11. Chapter 4: Survey of Classification Techniques
  12. Chapter 5: Bias–Variance Trade-off
  13. Chapter 6: Combining Classifiers
  14. Chapter 7: Risk Estimation and Model Selection
  15. Chapter 8: Consistency
  16. Chapter 9: Clustering
  17. Chapter 10: Optimization
  18. Chapter 11: High-Dimensional Data
  19. Chapter 12: Communication with Clients
  20. Chapter 13: Current Challenges in Machine Learning
  21. Chapter 14: R Source Code
  22. Appendix A: List of Symbols
  23. Appendix B: Solutions to Selected Exercises
  24. Appendix C: Converting Between Normal Parameters and Level-Curve Ellipsoids
  25. Appendix D: Training Data and Fitted Parameters
  26. References
  27. Index
  28. EULA