
Become a Python Data Analyst
Perform exploratory data analysis and gain insight into scientific computing using Python
- 178 pages
- English
- ePUB (mobile friendly)
- Available on iOS & Android
About this book
Enhance your data analysis and predictive modeling skills using popular Python tools
Key Features
- Cover the fundamental Python libraries for operating on and manipulating data
- Use real-world datasets to perform predictive analytics with Python
- Access modern data analysis techniques and detailed code with scikit-learn and SciPy
Book Description
Python is one of the most common and popular languages preferred by leading data analysts and statisticians for working with massive datasets and complex data visualizations.
Become a Python Data Analyst introduces Python's most essential tools and libraries necessary to work with the data analysis process, right from preparing data to performing simple statistical analyses and creating meaningful data visualizations.
In this book, we will cover Python libraries such as NumPy, pandas, matplotlib, seaborn, SciPy, and scikit-learn, and apply them in practical data analysis and statistics examples. As you make your way through the chapters, you will learn to efficiently use the Jupyter Notebook to operate and manipulate data using NumPy and the pandas library. In the concluding chapters, you will gain experience in building simple predictive models and carrying out statistical computation and analysis using rich Python tools and proven data analysis techniques.
By the end of this book, you will have hands-on experience performing data analysis with Python.
What you will learn
- Explore important Python libraries and learn to install the Anaconda distribution
- Understand the basics of NumPy
- Produce informative and useful visualizations for analyzing data
- Perform common statistical calculations
- Build predictive models and understand the principles of predictive analytics
Who this book is for
Become a Python Data Analyst is for entry-level data analysts, data engineers, and BI professionals who want to make complete use of Python tools for performing efficient data analysis. Prior knowledge of Python programming is necessary to understand the concepts covered in this book.
Visualization and Exploratory Data Analysis
- Introducing matplotlib
- Introducing pyplot
- Object-oriented interfaces
- Common customizations
- Exploratory data analysis with seaborn and pandas
- Analyzing the variables individually
- The relationship between variables
Introducing Matplotlib


Terminologies in Matplotlib

- Figure: The figure is the top-level container in the hierarchy. It is the overall window that contains everything that is drawn. We can have multiple independent figures, and each figure can contain multiple axes.
- Axes/Subplot: Most plotting is done with respect to one axes, or subplot. A subplot has many components: a plotting area, an x axis, a y axis, tick marks, and so on. This is the basic hierarchy in matplotlib: the figure sits at the top and, inside the figure, we have subplots. The preceding image has only one subplot, but a figure can hold many.
- Axis: Every subplot contains axis objects, most commonly an x axis and a y axis. Within the x axis, for example, we have the x label, the x tick marks, and the labels for the tick marks.
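The hierarchy described above can be inspected directly from code. The following is a minimal sketch, not taken from the book: the sample numbers, the label strings, and the variable names `left` and `right` are illustrative assumptions, and the `Agg` backend is selected only so that the snippet also runs outside a notebook.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so no display is needed
import matplotlib.pyplot as plt

# One Figure (top-level container) holding two Axes (subplots) side by side
fig, axes = plt.subplots(nrows=1, ncols=2)
left, right = axes

# Plotting happens on an Axes; the numbers are made-up sample data
left.plot([1, 2, 3], [2, 4, 8])

# Each Axes owns an x Axis and a y Axis, which in turn own labels and ticks
left.xaxis.set_label_text("x values")
left.yaxis.set_label_text("y values")

print(type(fig).__name__)          # class of the top-level container
print(len(fig.axes))               # number of subplots inside the figure
print(type(left.xaxis).__name__)   # class of the x axis object
print(left.xaxis.get_label_text())
```

Walking from `fig` to `fig.axes` to `left.xaxis` mirrors the figure, subplot, axis nesting in the list above.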
Introduction to pyplot
%matplotlib inline
- The usual convention is to import matplotlib.pyplot into the current session as plt:
import matplotlib.pyplot as plt
- Our first command calls the plot function from the pyplot module (plt) and passes it a list of numbers. When we execute the line, the command creates a figure. Even though we cannot see the figure itself in the following diagram, inside it there is a subplot, and inside this subplot there is a line plot that is a graphical representation of the numbers in the list:
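As a rough sketch of this step (the list values are invented for illustration and are not the book's data, and the `Agg` backend is an assumption that lets the snippet run without a display), a single call to `plt.plot` with one list implicitly creates the figure and the subplot, then draws the line inside it:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend; in Jupyter, %matplotlib inline is used instead
import matplotlib.pyplot as plt

numbers = [1, 3, 2, 5, 4]  # hypothetical sample data

# plot() returns a list of Line2D objects, one per line drawn
lines = plt.plot(numbers)
line = lines[0]

# The figure and subplot were created implicitly; pyplot can hand them back
fig = plt.gcf()  # current figure
ax = plt.gca()   # current axes (subplot)

# With a single list, the values are used as y and x defaults to the positions 0, 1, 2, ...
print(list(line.get_xdata()))
print(list(line.get_ydata()))
```

Retrieving `fig` and `ax` afterwards shows that the invisible containers exist even though only the line was requested.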


- The most commonly used function in the pyplot interface is the plot function, which can take many arguments. For instance, if we pass two lists of ...
Table of contents
- Title Page
- Copyright and Credits
- Packt Upsell
- Contributor
- Preface
- The Anaconda Distribution and Jupyter Notebook
- Vectorizing Operations with NumPy
- Pandas - Everyone's Favorite Data Analysis Library
- Visualization and Exploratory Data Analysis
- Statistical Computing with Python
- Introduction to Predictive Analytics Models
- Other Books You May Enjoy