Practical Data Science with Python
eBook - ePub

Practical Data Science with Python

Learn tools and techniques from hands-on examples to extract insights from data

Nathan George

  1. 620 pages
  2. English
  3. ePUB (available in the app)
  4. Available on iOS and Android

About the book

Learn to effectively manage data and execute data science projects from start to finish using Python

Key Features

  • Understand and utilize data science tools in Python, such as specialized machine learning algorithms and statistical modeling
  • Build a strong data science foundation with the best data science tools available in Python
  • Add value to yourself, your organization, and society by extracting actionable insights from raw data

Book Description

Practical Data Science with Python teaches you core data science concepts, with real-world and realistic examples, and strengthens your grip on the basic as well as advanced principles of data preparation and storage, statistics, probability theory, machine learning, and Python programming, helping you build a solid foundation to gain proficiency in data science.

The book starts with an overview of basic Python skills and then introduces foundational data science techniques, followed by a thorough explanation of the Python code needed to execute the techniques. You'll understand the code by working through the examples. The code has been broken down into small chunks (a few lines or a function at a time) to enable thorough discussion.

As you progress, you will learn how to perform data analysis while exploring the functionalities of key data science Python packages, including pandas, SciPy, and scikit-learn. Finally, the book covers ethics and privacy concerns in data science and suggests resources for improving data science skills, as well as ways to stay up to date on new data science developments.

By the end of the book, you should be able to comfortably use Python for basic data science projects and should have the skills to execute the data science process on any data source.

What you will learn

  • Use Python data science packages effectively
  • Clean and prepare data for data science work, including feature engineering and feature selection
  • Model data using classic statistical methods (such as t-tests) and essential machine learning algorithms (such as random forests and boosted models)
  • Evaluate model performance
  • Compare and understand different machine learning methods
  • Interact with Excel spreadsheets through Python
  • Create automated data science reports through Python
  • Get to grips with text analytics techniques

Who this book is for

The book is intended for beginners, including students starting or about to start a data science, analytics, or related program (e.g. Bachelor's, Master's, bootcamp, online courses), recent college graduates who want to learn new skills to set them apart in the job market, professionals who want to learn hands-on data science techniques in Python, and those who want to shift their career to data science.

The book requires basic familiarity with Python. A "getting started with Python" section has been included to get complete novices up to speed.

Information

Year
2021
ISBN
9781801076654

2

Getting Started with Python

As we saw in Chapter 1, Introduction to Data Science, Python is the most commonly used language for data science, so we will use it exclusively in this book. In this chapter, we'll go through a crash course in Python. This should get you up to speed with the basics, but to learn Python in more depth, you should seek out additional resources. For example, Fabrizio Romano's Learning Python from Packt is one resource worth checking out.
In this chapter, we'll cover the following topics:
  • Installing Python with a Python distribution (Anaconda)
  • Editing Python code with code text editors and Jupyter Notebooks
  • Running code with Jupyter Notebooks, IPython, and the command line
  • Installing Python packages and creating virtual environments
  • The basics of Python programming, including strings, numbers, loops, data structures, functions, and classes
  • Debugging errors and using documentation
  • Software engineering best practices, such as Git for version control
Let's get started with installing Python!

Installing Python with Anaconda and getting started

There are several ways to install Python, but the one we will use here is the Anaconda Python distribution. A distribution is a way of installing Python along with several Python packages/libraries, and possibly some other software. This saves us time when installing and gives us additional functionality, such as the ability to easily install complex packages with software dependencies. If you are unable to install Anaconda for whatever reason (for example, system administrative permission restrictions), you can instead install Python from other sources, such as the official Python website (www.python.org/downloads/) or the Microsoft Store. In that case, you will need to use the pip package manager exclusively, and not conda.
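To see which packages a given installation already provides, a minimal sketch along the following lines can be run in any Python session. The helper name is_installed is hypothetical, and the list of packages is only an assumption for illustration (it mirrors libraries mentioned in this book); the result depends entirely on how Python was installed.

```python
from importlib.util import find_spec


def is_installed(module_name):
    """Return True if a top-level module can be located without importing it."""
    return find_spec(module_name) is not None


# Import names for pandas, SciPy, and scikit-learn (an assumed list for illustration).
for name in ["pandas", "scipy", "sklearn"]:
    status = "installed" if is_installed(name) else "missing"
    print(f"{name}: {status}")
```

A package reported as missing would need to be installed with conda or pip, depending on which installation route was taken.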

Installing Anaconda

We use Anaconda for several reasons. For one, Anaconda is widely used in the Python community, so the network effects are strong: a large community is available to help us with problems (for example, through Stack Overflow), and more people are contributing to the project. Another advantage of Anaconda is that it makes installing Python packages with complex dependencies much easier. For example, neural network packages such as TensorFlow and PyTorch require CUDA and cuDNN software to be installed in order to run on a GPU, and H2O (a machine learning and AI software package) requires Java to be installed properly. Anaconda takes care of these dependencies for us when it installs these packages, saving us huge headaches and time. Anaconda also comes with a GUI (Anaconda Navigator) and some other bells and whistles, and it allows us to create virtual environments with different versions of Python, which we will get to soon.
Installing Anaconda should be relatively easy. We simply query an internet search engine for "download Anaconda" and install it with the installer (currently, the download page is located at www.anaconda.com/products/individual). When installing Anaconda on Mac, there shouldn't be any options that change things drastically – going with the defaults should be fine. On Linux, be sure to select yes when asked Do you wish the installer to initialize Anaconda3 by running conda init?. The recommended settings from Anaconda's documentation should work well for installation (docs.anaconda.com/anaconda/install/). For Windows, I usually check the box for Add Anaconda3 to my PATH environment variable, even though this is not recommended. This will allow us to run Python and conda from any terminal or shell on our system.
You could also manually add conda and Anaconda Python to your PATH environment variable, but checking the box upon installation is easier (even though Anaconda doesn't recommend doing it). In my experience, I haven't had problems when checking the Add to PATH box on Windows Anaconda installations.
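One way to check whether python and conda can be found through the PATH environment variable is a short sketch like the one below. The function name find_on_path is hypothetical; shutil.which is part of the Python standard library and returns None when a command cannot be found.

```python
import shutil


def find_on_path(commands):
    """Map each command name to its full path, or None if it is not on PATH."""
    return {cmd: shutil.which(cmd) for cmd in commands}


locations = find_on_path(["python", "conda"])
for cmd, path in locations.items():
    print(f"{cmd}: {path or 'not found on PATH'}")
```

If conda is reported as not found, it was probably not added to PATH during installation, and it may only be available from the Anaconda prompt.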
Once Anaconda is installed, you should be able to open a terminal or Command Prompt and run the command python to get to a basic Python shell, which we will cover in the next section. Now on to the next step – actually running Python code!
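Once the Python shell is open, a quick way to confirm which interpreter is actually running is sketched below. It assumes that conda sets the CONDA_DEFAULT_ENV environment variable when an environment is active; if Python was installed another way, that variable will simply be absent.

```python
import os
import sys

# Version string and full path of the interpreter that is currently running.
print(sys.version)
print(sys.executable)

# Name of the active conda environment, if conda has set one (assumption: conda is in use).
env_name = os.environ.get("CONDA_DEFAULT_ENV")
print(env_name or "no active conda environment detected")
```

For an Anaconda installation, the path printed by sys.executable would typically point inside the Anaconda folder.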

Running Python code

We will cover several options for running code here: the base Python shell, IPython, and Jupyter Notebooks. Some text editors and IDEs also allow us to run Python code from within the editor or IDE, although we will not cover that here.

The Python shell

There are several ways to run Python code, but let's start with the simplest: running code through a basic Python shell. Python is what's called an "interpreted" language, meaning code can be run on-the-fly (it is not converted into machine code ahead of time). Compiling code means translating the human-readable code to machine code, which is a string of 1s and 0s that are given as instructions to a CPU. Interpreting code means running it by translating Python code on-the-fly to instructions the computer can run more directly. Compiled code usually runs faster than interpreted code, but it requires the extra steps of compiling the whole program and then running it, so we cannot run compiled code interactively one bit at a time. In short, interpreted code has the advantage that it can be run interactively, one line at a time, while compiled code typically runs faster.
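The idea of running code one line at a time can be mimicked in a few lines of Python. This is only an illustrative sketch, not how the Python shell is actually implemented: each string stands in for a line typed at the prompt, and the variable names are made up for the example.

```python
# Each string stands in for one line typed at the Python prompt.
lines = ["x = 2 + 2", "y = x * 10", "print(y)"]

# A dictionary that holds the variables created so far, like a shell session does.
namespace = {}

for line in lines:
    exec(line, namespace)  # run a single line; earlier results stay available

result = namespace["y"]
```

Each line is executed as soon as it is reached, and later lines can use values defined by earlier ones, which is what makes interactive work possible.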
To try out Python's interpreted code execution, we should first open a terminal on Mac or Linux, or an Anaconda PowerShell Prompt from the Start menu on Windows (PowerShell has more commands available than a plain Command Prompt on Windows). With our command line ready, we then simply type python, et voilà! We have access to the Python shell. You can try some basic commands, such as 2 + 2 and print('hello').
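The two commands suggested above can be written out as follows. In the interactive shell, typing 2 + 2 on its own displays the result directly; the sketch below stores it in a variable (the name result is just for illustration) so that it also works when saved as a script.

```python
result = 2 + 2   # in the interactive shell, typing 2 + 2 alone displays 4
print(result)    # prints 4
print('hello')   # prints hello
```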
This...
