Python Data Analysis Cookbook
eBook - ePub

  1. 462 pages
  2. English
  3. ePUB (mobile friendly)
  4. Available on iOS & Android

About this book

Over 140 practical recipes to help you make sense of your data with ease and build production-ready data apps

About This Book

  • Analyze Big Data sets, create attractive visualizations, and manipulate and process various data types
  • Packed with rich recipes to help you learn and explore amazing algorithms for statistics and machine learning
  • Authored by Ivan Idris, expert in Python programming and proud author of eight highly reviewed books

Who This Book Is For

This book teaches Python data analysis at an intermediate level with the goal of transforming you from journeyman to master. Basic Python and data analysis skills and affinity are assumed.

What You Will Learn

  • Set up reproducible data analysis
  • Clean and transform data
  • Apply advanced statistical analysis
  • Create attractive data visualizations
  • Web scrape and work with databases, Hadoop, and Spark
  • Analyze images and time series data
  • Mine text and analyze social networks
  • Use machine learning and evaluate the results
  • Take advantage of parallelism and concurrency

In Detail

Data analysis is a rapidly evolving field, and Python is a multi-paradigm programming language suitable for object-oriented application development and functional design patterns. Because Python offers a wide range of tools and libraries, it has gradually become the primary language for data science, including data analysis, visualization, and machine learning.

Python Data Analysis Cookbook focuses on reproducibility and creating production-ready systems. You will start with recipes that set the foundation for data analysis with libraries such as matplotlib, NumPy, and pandas. You will learn to create visualizations by choosing color maps and palettes, then dive into statistical data analysis using distribution algorithms and correlations. You will then find your way around different data and numerical problems, get to grips with Spark and HDFS, and set up migration scripts for web mining.

In this book, you will dive deeper into recipes on spectral analysis, smoothing, and bootstrapping methods. Moving on, you will learn to rank stocks and check market efficiency, then work with metrics and clusters. You will achieve parallelism to improve system performance by using multiple threads and speeding up your code.

By the end of the book, you will be capable of handling various data analysis techniques in Python and devising solutions for problem scenarios.

Style and Approach

The book is written in "cookbook" style, striving for high realism in data analysis. Through the recipe-based format, you can read each recipe separately as required and immediately apply the knowledge gained.

Table of Contents

Python Data Analysis Cookbook
Credits
About the Author
About the Reviewers
www.PacktPub.com
eBooks, discount offers, and more
Why subscribe?
Preface
Why do you need this book?
Data analysis, data science, big data – what is the big deal?
A brief history of data analysis with Python
A conjecture about the future
What this book covers
What you need for this book
Who this book is for
Sections
Getting ready
How to do it…
How it works…
There's more…
See also
Conventions
Reader feedback
Customer support
Downloading the example code
Errata
Piracy
Questions
1. Laying the Foundation for Reproducible Data Analysis
Introduction
Setting up Anaconda
Getting ready
How to do it...
There's more...
See also
Installing the Data Science Toolbox
Getting ready
How to do it...
How it works...
See also
Creating a virtual environment with virtualenv and virtualenvwrapper
Getting ready
How to do it...
See also
Sandboxing Python applications with Docker images
Getting ready
How to do it...
How it works...
See also
Keeping track of package versions and history in IPython Notebook
Getting ready
How to do it...
How it works...
See also
Configuring IPython
Getting ready
How to do it...
See also
Learning to log for robust error checking
Getting ready
How to do it...
How it works...
See also
Unit testing your code
Getting ready
How to do it...
How it works...
See also
Configuring pandas
Getting ready
How to do it...
Configuring matplotlib
Getting ready
How to do it...
How it works...
See also
Seeding random number generators and NumPy print options
Getting ready
How to do it...
See also
Standardizing reports, code style, and data access
Getting ready
How to do it...
See also
2. Creating Attractive Data Visualizations
Introduction
Graphing Anscombe's quartet
How to do it...
See also
Choosing seaborn color palettes
How to do it...
See also
Choosing matplotlib color maps
How to do it...
See also
Interacting with IPython Notebook widgets
How to do it...
See also
Viewing a matrix of scatterplots
How to do it...
Visualizing with d3.js via mpld3
Getting ready
How to do it...
Creating heatmaps
Getting ready
How to do it...
See also
Combining box plots and kernel density plots with violin plots
How to do it...
See also
Visualizing network graphs with hive plots
Getting ready
How to do it...
Displaying geographical maps
Getting ready
How to do it...
Using ggplot2-like plots
Getting ready
How to do it...
Highlighting data points with influence plots
How to do it...
See also
3. Statistical Data Analysis and Probability
Introduction
Fitting data to the exponential distribution
How to do it...
How it works…
See also
Fitting aggregated data to the gamma distribution
How to do it...
See also
Fitting aggregated counts to the Poisson distribution
How to do it...
See also
Determining bias
How to do it...
See also
Estimating kernel density
How to do it...
See also
Determining confidence intervals for mean, variance, and standard deviation
How to do it...
See also
Sampling with probability weights
How to do it...
See also
Exploring extreme values
How to do it...
See also
Correlating variables with Pearson's correlation
How to do it...
See also
Correlating variables with the Spearman rank correlation
How to do it...
See also
Correlating a binary and a continuous variable with the point biserial correlation
How to do it...
See also
Evaluating relations between variables with ANOVA
How to do it...
See also
4. Dealing with Data and Numerical Issues
Introduction
Clipping and filtering outliers
How to do it...
See also
Winsorizing data
How to do it...
See also
Measuring central tendency of noisy data
How to do it...
See also
Normalizing with the Box-Cox transformation
How to do it...
How it works...
See also
Transforming data with the power ladder
How to do it...
Transforming data with logarithms
How to do it...
Rebinning data
How to do it...
Applying logit() to transform proportions
How to do it...
Fitting a robust linear model
How to do it...
See also
Taking variance into account with weighted least squares
How to do it...
See also
Using arbitrary precision for optimization
Getting ready
How to do it...
See also
Using arbitrary precision for linear algebra
Getting ready
How to do it...
See also
5. Web Mining, Databases, and Big Data
Introduction
Simulating web browsing
Getting ready
How to do it…
See also
Scraping the Web
Getting ready
How to do it…
Dealing with non-ASCII text and HTML entities
Getting ready
How to do it…
See also
Implementing association tables
Getting ready
How to do it…
Setting up database migration scripts
Getting ready
How to do it…
See also
Adding a table column to an existing table
Getting ready
How to do it…
Adding indices after table creation
Getting ready
How to do it…
How it works…
See also
Setting up a test web server
Getting ready
How to do it…
Implementing a star schema wi...

Python Data Analysis Cookbook by Ivan Idris is available in PDF and/or ePUB format and is catalogued under Computer Science & Data Processing.