Practical Machine Learning
eBook - ePub

Sunila Gollapudi

  1. 468 pages
  2. English
  3. ePUB (mobile friendly)

About This Book

Tackle the real-world complexities of modern machine learning with innovative, cutting-edge techniques

  • Fully-coded working examples using a wide range of machine learning libraries and tools, including Python, R, Julia, and Spark
  • Comprehensive practical solutions taking you into the future of machine learning
  • Go a step further and integrate your machine learning projects with Hadoop

Who This Book Is For

This book has been created for data scientists who want to see machine learning in action and explore its real-world application. With guidance on everything from the fundamentals of machine learning and predictive analytics to the latest innovations set to lead the big data revolution into the future, this is an unmissable resource for anyone dedicated to tackling current big data challenges. Knowledge of programming (Python and R) and mathematics is advisable if you want to get started immediately.

What You Will Learn

  • Implement a wide range of algorithms and techniques for tackling complex data
  • Get to grips with some of the most powerful languages in data science, including R, Python, and Julia
  • Harness the capabilities of Spark and Hadoop to manage and process data successfully
  • Apply the appropriate machine learning technique to address real-world problems
  • Get acquainted with deep learning and find out how neural networks are being used at the cutting edge of machine learning
  • Explore the future of machine learning and dive deeper into polyglot persistence, semantic data, and more

In Detail

Finding meaning in increasingly large and complex datasets is a growing demand of the modern world. Machine learning and predictive analytics have become the most important approaches to uncovering data gold mines. Machine learning uses complex algorithms to make improved predictions of outcomes based on historical patterns and the behaviour of datasets. It can deliver dynamic insights into trends, patterns, and relationships within data, all of which are immensely valuable to business growth and development.

This book explores an extensive range of machine learning techniques, uncovering hidden tricks and tips for several types of data through practical, real-world examples. While machine learning can be highly theoretical, this book offers a refreshing hands-on approach without losing sight of the underlying principles. Inside, a full exploration of the various algorithms gives you high-quality guidance so you can begin to see just how effective machine learning is at tackling the contemporary challenges of big data.

This is the only book you need to implement a whole suite of open source tools, frameworks, and languages in machine learning. We will cover the leading data science languages, Python and R, and the underrated but powerful Julia, as well as a range of big data platforms including Spark, Hadoop, and Mahout. Practical Machine Learning is an essential resource for modern data scientists who want to get to grips with the real-world application of machine learning.

With this book, you will not only learn the fundamentals of machine learning but also dive deep into the complexities of real-world data, before moving on to using Hadoop and its wider ecosystem of tools to process and manage your structured and unstructured data.

You will explore different machine learning techniques for both supervised and unsupervised learning. From decision trees to Naive Bayes classifiers and linear and clustering methods, you will learn strategies for a truly advanced approach to the statistical analysis of data. The book also explores the cutting-edge advancements in machine learning, with worked examples and guidance on deep learning and reinforcement learning, providing you with practical demonstrations and samples that help take the theory (and mystery) out of even the most advanced machine learning methodologies.

Style and approach

A data science tutorial designed to give you an insight into the practical application of machine learning, this book takes you through complex concepts and tasks in an accessible way. Featuring information on a wide range of data science techniques, Practical Machine Learning is a comprehensive data science resource.

Information

Year: 2016
ISBN: 9781784399689
Edition: 1

Table of Contents

Practical Machine Learning
Credits
Foreword
About the Author
Acknowledgments
About the Reviewers
www.PacktPub.com
Support files, eBooks, discount offers, and more
Why subscribe?
Free access for Packt account holders
Preface
What this book covers
What you need for this book
Who this book is for
Conventions
Reader feedback
Customer support
Downloading the example code
Downloading the color images of this book
Errata
Piracy
Questions
1. Introduction to Machine learning
Machine learning
Definition
Core Concepts and Terminology
What is learning?
Data
Labeled and unlabeled data
Tasks
Algorithms
Models
Logical models
Geometric models
Probabilistic models
Data and inconsistencies in Machine learning
Under-fitting
Over-fitting
Data instability
Unpredictable data formats
Practical Machine learning examples
Types of learning problems
Classification
Clustering
Forecasting, prediction or regression
Simulation
Optimization
Supervised learning
Unsupervised learning
Semi-supervised learning
Reinforcement learning
Deep learning
Performance measures
Is the solution good?
Mean squared error (MSE)
Mean absolute error (MAE)
Normalized MSE and MAE (NMSE and NMAE)
Solving the errors: bias and variance
Some complementing fields of Machine learning
Data mining
Artificial intelligence (AI)
Statistical learning
Data science
Machine learning process lifecycle and solution architecture
Machine learning algorithms
Decision tree based algorithms
Bayesian method based algorithms
Kernel method based algorithms
Clustering methods
Artificial neural networks (ANN)
Dimensionality reduction
Ensemble methods
Instance based learning algorithms
Regression analysis based algorithms
Association rule based learning algorithms
Machine learning tools and frameworks
Summary
2. Machine learning and Large-scale datasets
Big data and the context of large-scale Machine learning
Functional versus Structural – A methodological mismatch
Commoditizing information
Theoretical limitations of RDBMS
Scaling-up versus Scaling-out storage
Distributed and parallel computing strategies
Machine learning: Scalability and Performance
Too many data points or instances
Too many attributes or features
Shrinking response time windows – need for real-time responses
Highly complex algorithm
Feed forward, iterative prediction cycles
Model selection process
Potential issues in large-scale Machine learning
Algorithms and Concurrency
Developing concurrent algorithms
Technology and implementation options for scaling-up Machine learning
MapReduce programming paradigm
High Performance Computing (HPC) with Message Passing Interface (MPI)
Language Integrated Queries (LINQ) framework
Manipulating datasets with LINQ
Graphics Processing Unit (GPU)
Field Programmable Gate Array (FPGA)
Multicore or multiprocessor systems
Summary
3. An Introduction to Hadoop's Architecture and Ecosystem
Introduction to Apache Hadoop
Evolution of Hadoop (the platform of choice)
Hadoop and its core elements
Machine learning solution architecture for big data (employing Hadoop)
The Data Source layer
The Ingestion layer
The Hadoop Storage layer
The Hadoop (Physical) Infrastructure layer – supporting appliance
Hadoop platform / Processing layer
The Analytics layer
The Consumption layer
Explaining and exploring data with Visualizations
Security and Monitoring layer
Hadoop core components framework
Hadoop Distributed File System (HDFS)
Secondary Namenode and Checkpoint process
Splitting large data files
Block loading to the cluster and replication
Writing to and reading from HDFS
Handling failures
HDFS command line
RESTful HDFS
MapReduce
MapReduce architecture
What makes MapReduce cater to the needs of large datasets?
MapReduce execution flow and components
Developing MapReduce components
InputFormat
OutputFormat
Mapper implementation
Hadoop 2.x
Hadoop ecosystem components
Hadoop installation and setup
Installing JDK 1.7
Creating a system user for Hadoop (dedicated)
Disable IPv6
Steps for installing Hadoop 2.6.0
Starting Hadoop
Hadoop distributions and vendors
Summary
4. Machine Learning Tools, Libraries, and Frameworks
Machine learning tools – A landscape
Apache Mahout
How does Mahout work?
Installing and setting up Apache Mahout
Setting up Maven
Setting-up Apache Mahout using Eclipse IDE
Setting up Apache Mahout without Eclipse
Mahout Packages
Implementing vectors in Mahout
R
Installing and setting up R
Integrating R with Apache Hadoop
Approach 1 – Using R and Streaming APIs in Hadoop
Approach 2 – Using the Rhipe package of R
Approach 3 ...
