Advanced Machine Learning with Python
eBook - ePub

John Hearty

  1. 278 pages
  2. English
  3. ePUB (mobile friendly)
  4. Available on iOS & Android

About This Book

Solve challenging data science problems by mastering cutting-edge machine learning techniques in Python

  • Resolve complex machine learning problems and explore deep learning
  • Learn to use Python code for implementing a range of machine learning algorithms and techniques
  • A practical tutorial that tackles real-world computing problems through a rigorous and effective approach

Who This Book Is For

This title is for Python developers, analysts, and data scientists who want to extend their existing skills with some of the most powerful recent techniques in data science. If you've ever considered building your own image or text-tagging solution, or entering a Kaggle contest, this book is for you!

Prior experience with Python and a grounding in some of the core concepts of machine learning would be helpful.

What You Will Learn

  • Compete with top data scientists by gaining a practical and theoretical understanding of cutting-edge deep learning algorithms
  • Apply your newfound skills to solve real problems, through clearly explained code for every technique and test
  • Automate the processing of large, complex datasets and overcome time-consuming practical challenges
  • Improve the accuracy of models and your existing input data using powerful feature engineering techniques
  • Use multiple learning techniques together to improve the consistency of results
  • Understand the hidden structure of datasets using a range of unsupervised techniques
  • Gain insight into how the experts solve challenging data problems with an effective, iterative, and validation-focused approach
  • Improve the effectiveness of your deep learning models further by using powerful ensembling techniques to strap multiple models together

In Detail

Designed to take you on a guided tour of the most relevant and powerful machine learning techniques in use today by top data scientists, this book is just what you need to push your Python algorithms to maximum potential. Clear examples and detailed code samples demonstrate deep learning techniques, semi-supervised learning, and more - all whilst working with real-world applications that include image, music, text, and financial data.

The machine learning techniques covered in this book are at the forefront of commercial practice. They are applicable now for the first time in contexts such as image recognition, NLP and web search, computational creativity, and commercial/financial data modeling. Deep Learning algorithms and ensembles of models are in use by data scientists at top tech and digital companies, but the skills needed to apply them successfully, while in high demand, are still scarce.

This book is designed to take the reader on a guided tour of the most relevant and powerful machine learning techniques. Clear descriptions of how techniques work and detailed code examples demonstrate deep learning techniques, semi-supervised learning, and more, in real-world applications. We will also learn about NumPy and Theano.

By the end of this book, you will have learned a set of advanced machine learning techniques and acquired a broad set of powerful skills in the areas of feature selection and feature engineering.

Style and approach

This book focuses on clarifying the theory and code behind complex algorithms to make them practical, useable, and well-understood. Each topic is described with real-world applications, providing both broad contextual coverage and detailed guidance.

Information

Year: 2016
ISBN: 9781784398637

Table of Contents

Advanced Machine Learning with Python
Credits
About the Author
About the Reviewers
www.PacktPub.com
eBooks, discount offers, and more
Why subscribe?
Preface
What is advanced machine learning?
What should you expect from this book?
What this book covers
What you need for this book
Who this book is for
Conventions
Reader feedback
Customer support
Downloading the example code
Downloading the color images of this book
Errata
Piracy
Questions
1. Unsupervised Machine Learning
Principal component analysis
PCA – a primer
Employing PCA
Introducing k-means clustering
Clustering – a primer
Kick-starting clustering analysis
Tuning your clustering configurations
Self-organizing maps
SOM – a primer
Employing SOM
Further reading
Summary
2. Deep Belief Networks
Neural networks – a primer
The composition of a neural network
Network topologies
Restricted Boltzmann Machine
Introducing the RBM
Topology
Training
Applications of the RBM
Further applications of the RBM
Deep belief networks
Training a DBN
Applying the DBN
Validating the DBN
Further reading
Summary
3. Stacked Denoising Autoencoders
Autoencoders
Introducing the autoencoder
Topology
Training
Denoising autoencoders
Applying a dA
Stacked Denoising Autoencoders
Applying the SdA
Assessing SdA performance
Further reading
Summary
4. Convolutional Neural Networks
Introducing the CNN
Understanding the convnet topology
Understanding convolution layers
Understanding pooling layers
Training a convnet
Putting it all together
Applying a CNN
Further Reading
Summary
5. Semi-Supervised Learning
Introduction
Understanding semi-supervised learning
Semi-supervised algorithms in action
Self-training
Implementing self-training
Finessing your self-training implementation
Improving the selection process
Contrastive Pessimistic Likelihood Estimation
Further reading
Summary
6. Text Feature Engineering
Introduction
Text feature engineering
Cleaning text data
Text cleaning with BeautifulSoup
Managing punctuation and tokenizing
Tagging and categorising words
Tagging with NLTK
Sequential tagging
Backoff tagging
Creating features from text data
Stemming
Bagging and random forests
Testing our prepared data
Further reading
Summary
7. Feature Engineering Part II
Introduction
Creating a feature set
Engineering features for ML applications
Using rescaling techniques to improve the learnability of features
Creating effective derived variables
Reinterpreting non-numeric features
Using feature selection techniques
Performing feature selection
Correlation
LASSO
Recursive Feature Elimination
Genetic models
Feature engineering in practice
Acquiring data via RESTful APIs
Testing the performance of our model
Twitter
Translink Twitter
Consumer comments
The Bing Traffic API
Deriving and selecting variables using feature engineering techniques
The weather API
Further reading
Summary
8. Ensemble Methods
Introducing ensembles
Understanding averaging ensembles
Using bagging algorithms
Using random forests
Applying boosting methods
Using XGBoost
Using stacking ensembles
Applying ensembles in practice
Using models in dynamic applications
Understanding model robustness
Identifying modeling risk factors
Strategies to managing model robustness
Further reading
Summary
9. Additional Python Machine Learning Tools
Alternative development tools
Introduction to Lasagne
Getting to know Lasagne
Introduction to TensorFlow
Getting to know TensorFlow
Using TensorFlow to iteratively improve our models
Knowing when to use these libraries
Further reading
Summary
A. Chapter Code Requirements
Index

Advanced Machine Learning with Python

Copyright © 2016 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing, and its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
First published: July 2016
Production reference: 1220716
Published by Packt Publishing Ltd.
Livery Place
35 Livery Street
Birmingham B3 2PB, UK.
ISBN 978-1-78439-863-7
www.packtpub.com

Credits

Author: John Hearty
Reviewers: Jared Huffman, Ashwin Pajankar
Commissioning Editor: Akram Hussain
Acquisition Editor: Sonali Vernekar
Content Development Editor: Mayur Pawanikar
Technical Editor: Suwarna Patil
Copy Editor: Tasneem Fatehi
Project Coordinator: Nidhi Joshi
Proofreader: Safis Editing
Indexer: Mariammal Chettiyar
Graphics: Disha Haria
Production Coordinator: Arvindkumar Gupta
Cover Work: Arvindkumar Gupta

About the Author

John Hearty is a consultant in digital industries with substantial expertise in data science and infrastructure engineering. Having started out in mobile gaming, he was drawn to the challenge of AAA console analytics.
Keen to start putting advanced machine learning techniques into practice, he signed on with Microsoft to develop player modelling capabilities and big data infrastructure at an Xbox studio. His team made significant strides in engineering and data science that were replicated across Microsoft Studios. Some of the more rewarding initiatives he led included player skill modelling in asymmetrical games, and the creation of player segmentation models for individualized game experiences.
Eventually John struck out on his own as a consultant offering comprehensive infrastructure and analytics solutions for international client teams seeking new insights or data-driven capabilities. His favourite current engagement involves creating predictive models and quantifying the importance of user connections for a popular social network.
After years spent working with data, John is largely unable to stop asking questions. In his own time, he routinely builds ML solutions in Python to fulfil a broad set of personal interests. These include a novel variant on the StyleNet computational creativity algorithm and solutions for algo-trading and geolocation-based recommendation. He currently lives in the UK.

About the Reviewers

Jared Huffman is a lifelong gamer and extreme data geek. After completing his bachelor's degree in computer science, he started his career in his hometown of Melbourne, Florida. While there, he honed his software development skills, including work on a credit card-processing system and a variety of web tools. He finished it off with a fun contract working at NASA's Kennedy Space Center before migrating to his current home in the Seattle area.
Diving head first into the world of data, he took up a role working on Microsoft's internal finance tools and reporting systems. Feeling that he could no longer resist his love for video games, he joined the Xbox division to build their Business. To date, Jared has helped ship and support 12 games and presented at several events on various machine learning and other data topics. His latest endeavor has him applying both his software skills and analytics expertise in leading the data science efforts for Minecraft. There he gets to apply machine learning techniques, trying out fun and impactful projects, such as customer segmentation models, churn prediction, and recommendation systems.
Outside of work, Jared spends much of his free time playing board games and video games with his family and friends, as well as dabbling in occasional game development.
Ashwin Pajankar is a software professional and IoT enthusiast with more than 8 years of experience in software design, development, testing, and automation.
He graduated from IIIT Hyderabad, earning an M. Tech in computer science and engineering. He holds multiple professional certifications from Oracle, IBM, Teradata, and ISTQB in development, databases, and testing. He has won several awards in college through outreach initiatives, at work for technical achievements, and community service through corporate social responsibility programs.
He was introduced to Raspberry Pi while organizing a hackathon at his workplace, and has been hooked on Pi ever since. He writes plenty of code in C, Bash, Python, and Java o...
