Python Data Analysis Cookbook
eBook - ePub

Ivan Idris

  1. 462 pages
  2. English
  3. ePUB (mobile friendly)
  4. Available on iOS and Android

About this book

Over 140 practical recipes to help you make sense of your data with ease and build production-ready data apps

About This Book

  • Analyze Big Data sets, create attractive visualizations, and manipulate and process various data types
  • Packed with rich recipes to help you learn and explore amazing algorithms for statistics and machine learning
  • Authored by Ivan Idris, an expert in Python programming and proud author of eight highly reviewed books

Who This Book Is For

This book teaches Python data analysis at an intermediate level with the goal of transforming you from journeyman to master. Basic Python and data analysis skills are assumed, along with an affinity for both.

What You Will Learn

  • Set up reproducible data analysis
  • Clean and transform data
  • Apply advanced statistical analysis
  • Create attractive data visualizations
  • Scrape the web and work with databases, Hadoop, and Spark
  • Analyze images and time series data
  • Mine text and analyze social networks
  • Use machine learning and evaluate the results
  • Take advantage of parallelism and concurrency

In Detail

Data analysis is a rapidly evolving field, and Python is a multi-paradigm programming language suitable for object-oriented application development and functional design patterns. Because Python offers a range of tools and libraries for all purposes, it has gradually become the primary language for data science, including data analysis, visualization, and machine learning.

Python Data Analysis Cookbook focuses on reproducibility and creating production-ready systems. You will start with recipes that set the foundation for data analysis with libraries such as matplotlib, NumPy, and pandas. You will learn to create visualizations by choosing color maps and palettes, then dive into statistical data analysis using distribution algorithms and correlations. You will then find your way around different data and numerical problems, get to grips with Spark and HDFS, and set up migration scripts for web mining.
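
To illustrate the reproducibility theme, here is a minimal sketch, not taken from the book, of how seeding a random number generator makes a computation repeatable. It uses only Python's standard `random` module rather than NumPy; the function name `sample_means` and its parameters are hypothetical.

```python
import random


def sample_means(seed, n_samples=3, size=5):
    """Return the means of a few random samples drawn from a seeded generator.

    Hypothetical helper for illustration only.
    """
    # A private generator keeps the result independent of global random state.
    rng = random.Random(seed)
    return [
        sum(rng.gauss(0, 1) for _ in range(size)) / size
        for _ in range(n_samples)
    ]


first = sample_means(seed=42)
second = sample_means(seed=42)
print(first == second)  # True: the same seed reproduces the same numbers
```

Passing the seed explicitly, instead of relying on global state, is one way to keep an analysis repeatable when functions are called in a different order.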

You will also dive deeper into recipes on spectral analysis, smoothing, and bootstrapping methods. Moving on, you will learn to rank stocks and check market efficiency, then work with metrics and clusters. Finally, you will use parallelism, including multiple threads, to speed up your code and improve system performance.
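
As a rough idea of what a bootstrapping method looks like, the following sketch estimates a percentile confidence interval for a mean by resampling with replacement. It is an assumption-laden illustration using only the standard library, not the book's recipe; `bootstrap_mean_ci`, its parameters, and the data values are all made up.

```python
import random
import statistics


def bootstrap_mean_ci(data, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean (illustrative)."""
    rng = random.Random(seed)  # seeded so the interval is reproducible
    # Resample with replacement and record the mean of each resample.
    means = sorted(
        statistics.fmean(rng.choices(data, k=len(data)))
        for _ in range(n_resamples)
    )
    # Take the empirical alpha/2 and 1 - alpha/2 quantiles of the means.
    lo_idx = int((alpha / 2) * n_resamples)
    hi_idx = int((1 - alpha / 2) * n_resamples) - 1
    return means[lo_idx], means[hi_idx]


data = [2.1, 2.5, 1.9, 3.2, 2.8, 2.4, 3.0, 2.2]  # made-up measurements
low, high = bootstrap_mean_ci(data)
print(f"95% CI for the mean: [{low:.2f}, {high:.2f}]")
```

The percentile method is only one of several bootstrap interval constructions; it is used here because it is the simplest to read.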

By the end of the book, you will be capable of handling various data analysis techniques in Python and devising solutions for problem scenarios.

Style and Approach

The book is written in a "cookbook" style, striving for high realism in data analysis. Thanks to the recipe-based format, you can read each recipe separately as required and immediately apply the knowledge gained.

Information

Year: 2016
ISBN: 9781785282287
Edition: 1
Subtopic: Data Processing

Table of Contents

Python Data Analysis Cookbook
Credits
About the Author
About the Reviewers
www.PacktPub.com
eBooks, discount offers, and more
Why subscribe?
Preface
Why do you need this book?
Data analysis, data science, big data – what is the big deal?
A brief history of data analysis with Python
A conjecture about the future
What this book covers
What you need for this book
Who this book is for
Sections
Getting ready
How to do it...
How it works...
There's more...
See also
Conventions
Reader feedback
Customer support
Downloading the example code
Errata
Piracy
Questions
1. Laying the Foundation for Reproducible Data Analysis
Introduction
Setting up Anaconda
Getting ready
How to do it...
There's more...
See also
Installing the Data Science Toolbox
Getting ready
How to do it...
How it works...
See also
Creating a virtual environment with virtualenv and virtualenvwrapper
Getting ready
How to do it...
See also
Sandboxing Python applications with Docker images
Getting ready
How to do it...
How it works...
See also
Keeping track of package versions and history in IPython Notebook
Getting ready
How to do it...
How it works...
See also
Configuring IPython
Getting ready
How to do it...
See also
Learning to log for robust error checking
Getting ready
How to do it...
How it works...
See also
Unit testing your code
Getting ready
How to do it...
How it works...
See also
Configuring pandas
Getting ready
How to do it...
Configuring matplotlib
Getting ready
How to do it...
How it works...
See also
Seeding random number generators and NumPy print options
Getting ready
How to do it...
See also
Standardizing reports, code style, and data access
Getting ready
How to do it...
See also
2. Creating Attractive Data Visualizations
Introduction
Graphing Anscombe's quartet
How to do it...
See also
Choosing seaborn color palettes
How to do it...
See also
Choosing matplotlib color maps
How to do it...
See also
Interacting with IPython Notebook widgets
How to do it...
See also
Viewing a matrix of scatterplots
How to do it...
Visualizing with d3.js via mpld3
Getting ready
How to do it...
Creating heatmaps
Getting ready
How to do it...
See also
Combining box plots and kernel density plots with violin plots
How to do it...
See also
Visualizing network graphs with hive plots
Getting ready
How to do it...
Displaying geographical maps
Getting ready
How to do it...
Using ggplot2-like plots
Getting ready
How to do it...
Highlighting data points with influence plots
How to do it...
See also
3. Statistical Data Analysis and Probability
Introduction
Fitting data to the exponential distribution
How to do it...
How it works...
See also
Fitting aggregated data to the gamma distribution
How to do it...
See also
Fitting aggregated counts to the Poisson distribution
How to do it...
See also
Determining bias
How to do it...
See also
Estimating kernel density
How to do it...
See also
Determining confidence intervals for mean, variance, and standard deviation
How to do it...
See also
Sampling with probability weights
How to do it...
See also
Exploring extreme values
How to do it...
See also
Correlating variables with Pearson's correlation
How to do it...
See also
Correlating variables with the Spearman rank correlation
How to do it...
See also
Correlating a binary and a continuous variable with the point biserial correlation
How to do it...
See also
Evaluating relations between variables with ANOVA
How to do it...
See also
4. Dealing with Data and Numerical Issues
Introduction
Clipping and filtering outliers
How to do it...
See also
Winsorizing data
How to do it...
See also
Measuring central tendency of noisy data
How to do it...
See also
Normalizing with the Box-Cox transformation
How to do it...
How it works...
See also
Transforming data with the power ladder
How to do it...
Transforming data with logarithms
How to do it...
Rebinning data
How to do it...
Applying logit() to transform proportions
How to do it...
Fitting a robust linear model
How to do it...
See also
Taking variance into account with weighted least squares
How to do it...
See also
Using arbitrary precision for optimization
Getting ready
How to do it...
See also
Using arbitrary precision for linear algebra
Getting ready
How to do it...
See also
5. Web Mining, Databases, and Big Data
Introduction
Simulating web browsing
Getting ready
How to do it...
See also
Scraping the Web
Getting ready
How to do it...
Dealing with non-ASCII text and HTML entities
Getting ready
How to do it...
See also
Implementing association tables
Getting ready
How to do it...
Setting up database migration scripts
Getting ready
How to do it...
See also
Adding a table column to an existing table
Getting ready
How to do it...
Adding indices after table creation
Getting ready
How to do it...
How it works...
See also
Setting up a test web server
Getting ready
How to do it...
Implementing a star schema wi...
