A Tour of Data Science
eBook - ePub

Learn R and Python in Parallel

  1. 206 pages
  2. English
  3. ePUB (mobile friendly)

About this book

A Tour of Data Science: Learn R and Python in Parallel covers the fundamentals of data science, including programming, statistics, optimization, and machine learning, in a single short book. It does not cover everything; rather, it teaches the key concepts and topics in data science. It also covers two of the most popular programming languages used in data science, R and Python, in one source.

Key features:

  • Allows you to learn R and Python in parallel
  • Covers statistics, programming, optimization and predictive modelling, and the popular data manipulation tools data.table and pandas
  • Provides a concise and accessible presentation
  • Includes machine learning algorithms implemented from scratch: linear regression, lasso, ridge, logistic regression, gradient boosting trees, etc.

The book will appeal to data scientists, statisticians, quantitative analysts, and others who want to learn programming with R and Python from a data science perspective.

A Tour of Data Science is written by Nailong Zhang and is available in PDF and/or ePUB format.

CHAPTER 1

Introduction to R/Python Programming

In this chapter, I will give an introduction to general R and Python programming in a parallel fashion.

1.1 CALCULATOR

R and Python are general-purpose programming languages that can be used for writing software in a variety of domains. But for now, let us start by using them as basic calculators. The first thing is to have them installed. R¹ and Python² can be downloaded from their official websites. In this book, I will be using R 3.5 and Python 3.7.
To use R/Python as basic calculators, let’s get familiar with the interactive mode. After the installation, we can type R or python in a terminal to invoke the interactive mode (on a case-insensitive file system, such as the macOS default, r or Python works as well). Since Python 2 is installed by default on many machines, we type python3.7 instead to avoid invoking Python 2.
R
    ~ $ R

    R version 3.5.1 (2018-07-02) -- "Feather Spray"
    Copyright (C) 2018 The R Foundation for Statistical Computing
    Platform: x86_64-apple-darwin15.6.0 (64-bit)

    R is free software and comes with ABSOLUTELY NO WARRANTY.
    You are welcome to redistribute it under certain conditions.
    Type 'license()' or 'licence()' for distribution details.

      Natural language support but running in an English locale

    R is a collaborative project with many contributors.
    Type 'contributors()' for more information and
    'citation()' on how to cite R or R packages in publications.

    Type 'demo()' for some demos, 'help()' for on-line help, or
    'help.start()' for an HTML browser interface to help.
    Type 'q()' to quit R.

    >
_______________
1 https://www.r-project.org
2 https://www.python.org
Python
    ~ $ python3.7
    Python 3.7.1 (default, Nov 6 2018, 18:45:35)
    [Clang 10.0.0 (clang-1000.11.45.5)] on darwin
    Type "help", "copyright", "credits" or "license" for more information.
    >>>
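The interpreter version can also be checked from inside the session, which helps confirm that Python 3 rather than Python 2 was started. The following is a minimal sketch, not taken from the book, using the standard-library sys and platform modules; the variable names are illustrative only.

```python
import platform
import sys

# sys.version_info holds the interpreter version as a named tuple.
major = sys.version_info.major
minor = sys.version_info.minor
print("Running Python {}.{} on {}".format(major, minor, platform.system()))

# Flag whether this is a Python 3 interpreter (False would mean Python 2 or older).
is_python3 = major >= 3
print("Python 3 interpreter: {}".format(is_python3))
```

The printed text differs from machine to machine, just like the startup banner.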
The messages displayed on entering the interactive mode depend on both the version of R/Python installed and the machine, so you may see different messages on your local machine. As the startup message says, we can type q() to quit R. R then asks whether the workspace should be saved and offers three options (y/n/c). Since we just want to use R as a basic calculator, we quit without saving the workspace.
To quit Python, we can simply type exit().
R
    > q()
    Save workspace image? [y/n/c]: n
    ~ $
Once we are inside the interactive mode, we can use R/Python as a calculator.
R
    > 1+1
    [1] 2
    > 2*3+5
    [1] 11
    > log(2)
    [1] 0.6931472
    > exp(0)
    [1] 1
Python
    >>> 1+1
    2
    >>> 2*3+5
    11
    >>> log(2)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    NameError: name 'log' is not defined
    >>> exp(0)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    NameError: name 'exp' is not defined
From the code snippet above, R works perfectly as a calculator. However, errors are raised when we call log(2) and exp(0) in Python. The error messages are self-explanatory: the log function and the exp function don’t exist in the current Pyth...
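One way to make these two calls work is to import them from Python’s standard-library math module. The sketch below shows that approach; it is offered as an illustration, and the excerpt is cut off before it shows how the book itself resolves the errors.

```python
import math

# math.log with a single argument is the natural logarithm, like R's log().
log_two = math.log(2)
print(log_two)   # 0.6931471805599453

# math.exp(x) computes e raised to the power x.
exp_zero = math.exp(0)
print(exp_zero)  # 1.0
```

Python prints more digits than R here because R rounds its printed output to 7 significant digits by default.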

Table of contents

  1. Cover
  2. Half Title
  3. Title Page
  4. Copyright Page
  5. Dedication
  6. Table of Contents
  7. Preface
  8. Chapter 1 ■ Introduction to R/Python Programming
  9. Chapter 2 ■ More on R/Python Programming
  10. Chapter 3 ■ data.table and pandas
  11. Chapter 4 ■ Random Variables, Distributions & Linear Regression
  12. Chapter 5 ■ Optimization in Practice
  13. Chapter 6 ■ Machine Learning - A gentle introduction
  14. Bibliography
  15. Index