Python Data Science Essentials - Second Edition
eBook - ePub

Python Data Science Essentials - Second Edition

  1. 378 pages
  2. English
  3. ePUB (mobile friendly)
  4. Available on iOS & Android
eBook - ePub

Python Data Science Essentials - Second Edition

About this book

Become an efficient data science practitioner by understanding Python's key concepts

About This Book

  • Quickly get familiar with data science using Python 3.5
  • Save time (and effort) with all the essential tools explained
  • Create effective data science projects and avoid common pitfalls with the help of examples and hints dictated by experience

Who This Book Is For

If you are an aspiring data scientist and you have at least a working knowledge of data analysis and Python, this book will get you started in data science. Data analysts with experience of R or MATLAB will also find the book to be a comprehensive reference to enhance their data manipulation and machine learning skills.

What You Will Learn

  • Set up your data science toolbox using a Python scientific environment on Windows, Mac, and Linux
  • Get data ready for your data science project
  • Manipulate, fix, and explore data in order to solve data science problems
  • Set up an experimental pipeline to test your data science hypotheses
  • Choose the most effective and scalable learning algorithm for your data science tasks
  • Optimize your machine learning models to get the best performance
  • Explore and cluster graphs, taking advantage of interconnections and links in your data

In Detail

Fully expanded and upgraded, the second edition of Python Data Science Essentials takes you through all you need to know to suceed in data science using Python. Get modern insight into the core of Python data, including the latest versions of Jupyter notebooks, NumPy, pandas and scikit-learn. Look beyond the fundamentals with beautiful data visualizations with Seaborn and ggplot, web development with Bottle, and even the new frontiers of deep learning with Theano and TensorFlow.

Dive into building your essential Python 3.5 data science toolbox, using a single-source approach that will allow to to work with Python 2.7 as well. Get to grips fast with data munging and preprocessing, and all the techniques you need to load, analyse, and process your data. Finally, get a complete overview of principal machine learning algorithms, graph analysis techniques, and all the visualization and deployment instruments that make it easier to present your results to an audience of both data science experts and business users.

Style and approach

The book is structured as a data science project. You will always benefit from clear code and simplified examples to help you understand the underlying mechanics and real-world datasets.

Frequently asked questions

Yes, you can cancel anytime from the Subscription tab in your account settings on the Perlego website. Your subscription will stay active until the end of your current billing period. Learn how to cancel your subscription.
No, books cannot be downloaded as external files, such as PDFs, for use outside of Perlego. However, you can download books within the Perlego app for offline reading on mobile or tablet. Learn more here.
Perlego offers two plans: Essential and Complete
  • Essential is ideal for learners and professionals who enjoy exploring a wide range of subjects. Access the Essential Library with 800,000+ trusted titles and best-sellers across business, personal growth, and the humanities. Includes unlimited reading time and Standard Read Aloud voice.
  • Complete: Perfect for advanced learners and researchers needing full, unrestricted access. Unlock 1.4M+ books across hundreds of subjects, including academic and specialized titles. The Complete Plan also includes advanced features like Premium Read Aloud and Research Assistant.
Both plans are available with monthly, semester, or annual billing cycles.
We are an online textbook subscription service, where you can get access to an entire online library for less than the price of a single book per month. With over 1 million books across 1000+ topics, we’ve got you covered! Learn more here.
Look out for the read-aloud symbol on your next book to see if you can listen to it. The read-aloud tool reads text aloud for you, highlighting the text as it is being read. You can pause it, speed it up and slow it down. Learn more here.
Yes! You can use the Perlego app on both iOS or Android devices to read anytime, anywhere — even offline. Perfect for commutes or when you’re on the go.
Please note we cannot support devices running on iOS 13 and Android 7 or earlier. Learn more about using the app.
Yes, you can access Python Data Science Essentials - Second Edition by Alberto Boschetti, Luca Massaron in PDF and/or ePUB format, as well as other popular books in Computer Science & Data Modelling & Design. We have over one million books available in our catalogue for you to explore.

Python Data Science Essentials - Second Edition


Python Data Science Essentials - Second Edition

Copyright © 2016 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the authors, nor Packt Publishing, and its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
First published: April 2015
Second edition: October 2016
Production reference: 1211016
Published by Packt Publishing Ltd.
Livery Place
35 Livery Street
Birmingham
B3 2PB, UK.
ISBN 978-1-78646-213-8
www.packtpub.com

Credits

Authors
Alberto Boschetti
Luca Massaron
Copy Editor
Vikrant Phadke
Reviewer
Zacharias Voulgaris
Project Coordinator
Nidhi Joshi
Commissioning Editor
Veena Pagare
Proofreader
Safis Editing
Acquisition Editor
Namrata Patil
Indexer
Aishwarya Gangawane
Content Development Editor
Mayur Pawanikar
Graphics
Disha Haria
Technical Editor
Vivek Arora
Production Coordinator
Arvindkumar Gupta

About the Authors

Alberto Boschetti is a data scientist with expertise in signal processing and statistics. He holds a PhD in telecommunication engineering and currently lives and works in London. In his work projects, he faces challenges ranging from natural language processing (NLP), behavioral analysis, and machine learning to distributed processing. He is very passionate about his job and always tries to stay updated about the latest developments in data science technologies, attending meet-ups, conferences, and other events.
I would like to thank my family, my friends, and my colleagues. Also, a big thanks to the open source community.
Luca Massaron is a data scientist and marketing research director specializing in multivariate statistical analysis, machine learning, and customer insight, with over a decade of experience of solving real-world problems and generating value for stakeholders by applying reasoning, statistics, data mining, and algorithms. From being a pioneer of web audience analysis in Italy to achieving the rank of a top ten Kaggler, he has always been very passionate about every aspect of data and its analysis, and also about demonstrating the potential of data-driven knowledge discovery to both experts and non-experts. Favoring simplicity over unnecessary sophistication, Luca believes that a lot can be achieved in data science just by doing the essentials.
To Yukiko and Amelia, for their loving patience. "Roads go ever ever on, under cloud and under star, yet feet that wandering have gone turn at last to home afar".

About the Reviewer

Zacharias Voulgaris is a data scientist and technical author specializing in data science books. He has an engineering and management background, with post-graduate studies in information systems and machine learning. Zacharias has worked as a research fellow at Georgia Tech, investigating and applying machine learning technologies to real-world problems, as an SEO manager in an e-marketing company in Europe, as a program manager in Microsoft, and as a data scientist at US Bank and at G2 Web Services.
Dr. Voulgaris has also authored technical books, the most notable of which is Data Scientist - the definitive guide to becoming a data scientist (Technics Publications), and his newest book, Julia for Data Science (Technics Publications), was released during the summer of 2016. He has also written a number of data-science-related articles on blogs and participates in various data science/machine learning meetup groups. Finally, he has provided technical editorial aid in the book Python Data Science Essentials (Packt), by the same authors as this book.
I would very much like to express my gratitude to the authors of the book for giving me the opportunity to contribute to this project. Also, I'd like to thank Bastiaan Sjardin for introducing me to them and to the world of technical editing. It's been a privilege working with all of you.

www.PacktPub.com

For support files and downloads related to your book, please visit www.PacktPub.com.
Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.PacktPub.com and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at [email protected] for more details.
At www.PacktPub.com, you can also read a collection of free technical articles, sign up for a range of free newsletters and receive exclusive discounts and offers on Packt books and eBooks.
www.PacktPub.com
https://www.packtpub.com/mapt
Get the most in-demand software skills with Mapt. Mapt gives you full access to all Packt books and video courses, as well as industry-leading tools to help you plan your personal development and advance your career.

Why subscribe?

  • Fully searchable across every book published by Packt
  • Copy and paste, print, and bookmark content
  • On demand and accessible via a web browser

Preface

"A journey of a thousand miles begins with a single step."
--Laozi (604 BC - 531 BC)
Data science is a relatively new knowledge domain that requires the successful integration of linear algebra, statistical modeling, visualization, computational linguistics, graph analysis, machine learning, business intelligence, and data storage and retrieval.
The Python programming language, having conquered the scientific community during the last decade, is now an indispensable tool for the data science practitioner and a must-have tool for every aspiring data scientist. Python will offer you a fast, reliable, cross-platform, mature environment for data analysis, machine learning, and algorithmic problem solving. Whatever stopped you before from mastering Python for data science applications will be easily overcome by our easy, step-by-step, and example-oriented approach that will help you apply the most straightforward and effective Python tools to both demonstrative and real-world datasets. As the second edition of Python Data Science Essentials, this book offers updated and expanded content. Based on the recent Jupyter Notebooks (incorporating interchangeable kernels, a truly polyglot data science system), this book incorporates all the main recent improvements in Numpy, Pandas, and Scikit-learn. Additionally, it offers new content in the form of deep learning (by presenting Keras–based on both Theano and Tensorflow), beautiful visualizations (seaborn and ggplot), ...

Table of contents

  1. Python Data Science Essentials - Second Edition