Python Data Science Essentials

Alberto Boschetti, Luca Massaron

  1. 258 pages
  2. English
  3. ePUB

About This Book

  • Quickly get familiar with data science using Python
  • Save time with a reference book that illustrates and explains all the essential tools
  • Create effective data science projects and avoid common pitfalls with the help of examples and hints drawn from experience

Who This Book Is For

If you are an aspiring data scientist with at least a working knowledge of data analysis and Python, this book will get you started in data science. Data analysts with experience in R or MATLAB will also find the book a comprehensive reference for enhancing their data manipulation and machine learning skills.

Table of Contents

Python Data Science Essentials
About the Authors
About the Reviewers
Support files, eBooks, discount offers, and more
Why subscribe?
Free access for Packt account holders
What this book covers
What you need for this book
Who this book is for
Reader feedback
Customer support
Downloading the example code
1. First Steps
Introducing data science and Python
Installing Python
Python 2 or Python 3?
Step-by-step installation
A glance at the essential Python packages
Beautiful Soup
The installation of packages
Package upgrades
Scientific distributions
Enthought Canopy
Introducing IPython
The IPython Notebook
Datasets and code used in the book
Scikit-learn toy datasets
The public repository
LIBSVM data examples
Loading data directly from CSV or text files
Scikit-learn sample generators
2. Data Munging
The data science process
Data loading and preprocessing with pandas
Fast and easy data loading
Dealing with problematic data
Dealing with big datasets
Accessing other data formats
Data preprocessing
Data selection
Working with categorical and textual data
A special type of data – text
Data processing with NumPy
NumPy's n-dimensional array
The basics of NumPy ndarray objects
Creating NumPy arrays
From lists to unidimensional arrays
Controlling the memory size
Heterogeneous lists
From lists to multidimensional arrays
Resizing arrays
Arrays derived from NumPy functions
Getting an array directly from a file
Extracting data from pandas
NumPy fast operation and computations
Matrix operations
Slicing and indexing with NumPy arrays
Stacking NumPy arrays
3. The Data Science Pipeline
Introducing EDA
Feature creation
Dimensionality reduction
The covariance matrix
Principal Component Analysis (PCA)
A variation of PCA for big data – RandomizedPCA
Latent Factor Analysis (LFA)
Linear Discriminant Analysis (LDA)
Latent Semantical Analysis (LSA)
Independent Component Analysis (ICA)
Kernel PCA
Restricted Boltzmann Machine (RBM)
The detection and treatment of outliers
Univariate outlier detection
Scoring functions
Multilabel classification
Binary classification
Testing and validating
Using cross-validation iterators
Sampling and bootstrapping
Hyper-parameters' optimization
Building custom scoring functions
Reducing the grid search runtime
Feature selection
Univariate selection
Recursive elimination
Stability and L1-based selection
4. Machine Learning
Linear and logistic regression
Naive Bayes
The k-Nearest Neighbors
Advanced nonlinear algorithms
SVM for classification
SVM for regression
Tuning SVM
Ensemble strategies
Pasting by random samples
Bagging with weak ensembles
Random Subspaces and Random Patches
Sequences of models – AdaBoost
Gradient tree boosting (GTB)
Dealing with big data
Creating some big datasets as examples
Scalability with volume
Keeping up with velocity
Dealing with variety
A quick overview of Stochastic Gradient Descent (SGD)
A peek into Natural Language Processing (NLP)
Word tokenization
Word Tagging
Named Entity Recognition (NER)
A complete data science example – text classification
An overview of unsupervised learning
5. Social Network Analysis
Introduction to graph theory
Graph algorithms
Graph loading, dumping, and sampling
6. Visualization
Introducing the basics of matplotlib
Curve plotting
Using panels
Bar graphs
Image visualization
Selected graphical examples with pandas
Boxplots and histograms
Parallel coordinates
Advanced data learning representation
Learning curves
Validation curves
Feature importance
GBT partial dependence plot

Python Data Science Essentials

Copyright © 2015 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the authors, nor Packt Publishing, and its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
First published: April 2015
Production reference: 1240415
Published by Packt Publishing Ltd.
Livery Place
35 Livery Street
Birmingham B3 2PB, UK.
ISBN 978-1-78528-042-9


Authors
Alberto Boschetti
Luca Massaron
Reviewers
Robert Dempsey
Daniel Frimer
Kevin Markham
Alberto Gonzalez Paje
Bastiaan Sjardin
Michele Usuelli
Zacharias Voulgaris, PhD
Commissioning Editor
Julian Ursell
Acquisition Editor
Subho Gupta
Content Development Editor
Merwyn D'souza
Technical Editor
Namrata Patil
Copy Editor
Vedangi Narvekar
Project Coordinator
Neha Bhatnagar
Simran Bhogal
Faye Coulman
Safis Editing
Dan McMahon
Priya Sane
Production Coordinator
Komal Ramchandani
Cover Work
Komal Ramchandani

About the Authors

Alberto Boschetti is a data scientist with expertise in signal processing and stat...
