Python for R Users
eBook - ePub

Python for R Users

A Data Science Approach

About this book

The definitive guide for statisticians and data scientists who understand the advantages of becoming proficient in both R and Python

The first book of its kind, Python for R Users: A Data Science Approach makes it easy for R programmers to code in Python and Python users to program in R. Short on theory and long on actionable analytics, it provides readers with a detailed comparative introduction and overview of both languages and features concise tutorials with command-by-command translations—complete with sample code—of R to Python and Python to R.

Following an introduction to both languages, the author cuts to the chase with step-by-step coverage of the full range of pertinent programming features and functions, including data input, data inspection/data quality, data analysis, and data visualization. Statistical modeling, machine learning, and data mining—including supervised and unsupervised data mining methods—are treated in detail, as are time series forecasting, text mining, and natural language processing.

• Features a quick-learning format with concise tutorials and actionable analytics

• Provides command-by-command translations of R to Python and vice versa

• Incorporates Python and R code throughout to make it easier for readers to compare and contrast features in both languages

• Offers numerous comparative examples and applications in both programming languages

• Designed for practitioners and students who know one language and want to learn the other

• Supplies slides on a companion website that are useful for teaching and learning either language

Python for R Users: A Data Science Approach is a valuable working resource for computer scientists and data scientists who know R and would like to learn Python, or who are familiar with Python and want to learn R. It also functions as a textbook for students of computer science and statistics.

A. Ohri is the founder of Decisionstats.com and currently works as a senior data scientist. He has advised multiple startups on analytics off-shoring, analytics services, and analytics education, as well as on using social media to enhance buzz for analytics products. Mr. Ohri's research interests include spreading open source analytics, analyzing social media manipulation with mechanism design, building simpler interfaces for cloud computing, and investigating climate change and knowledge flows. His other books include R for Business Analytics and R for Cloud Computing.

Information

Publisher
Wiley
Year
2017
Print ISBN
9781119126768
eBook ISBN
9781119126782

1 Introduction to Python R and Data Science

1.1 What Is Python?

Python is a programming language that lets you work more quickly and integrate your systems more effectively. It was created by Guido van Rossum. You can read Guido’s history of Python on the History of Python blog at http://python-history.blogspot.in/2009/01/introduction-and-overview.html.
It is worth reading for beginners and experienced Python users alike. The following is an extract:
many of Python’s keywords (if, else, while, for, etc.) are the same as in C, Python identifiers have the same naming rules as C, and most of the standard operators have the same meaning as C. Of course, Python is obviously not C and one major area where it differs is that instead of using braces for statement grouping, it uses indentation. For example, instead of writing statements in C like this
    if (a < b) {
        max = b;
    } else {
        max = a;
    }
Python just dispenses with the braces altogether (along with the trailing semicolons for good measure) and uses the following structure:
    if a < b:
        max = b
    else:
        max = a
The other major area where Python differs from C‐like languages is in its use of dynamic typing. In C, variables must always be explicitly declared and given a specific type such as int or double. This information is then used to perform static compile‐time checks of the program as well as for allocating memory locations used for storing the variable’s value. In Python, variables are simply names that refer to objects.
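The idea that variables are names referring to objects can be shown with a minimal sketch. It is not part of the quoted extract, and the variable names are illustrative assumptions. It rebinds one name to objects of different types without any declaration, then shows two names referring to the same list.

```python
# A name can be rebound to objects of different types; no declaration is needed.
value = 42
type_before = type(value).__name__   # the object bound to `value` is an int

value = "forty-two"
type_after = type(value).__name__    # the same name now refers to a str

# Two names can refer to the same object.
numbers = [1, 2, 3]
alias = numbers        # no copy is made; both names refer to one list
alias.append(4)        # the change is visible through either name

print(type_before, type_after)   # int str
print(numbers)                   # [1, 2, 3, 4]
print(alias is numbers)          # True
```

R users may notice that the last part differs from R's copy-on-modify behavior, where modifying a copy of a vector leaves the original unchanged.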
The Python Package Index (PyPI) at https://pypi.python.org/pypi hosts third-party modules for Python. At the time of writing, it listed 91,625 packages. You can browse Python packages by topic at https://pypi.python.org/pypi?%3Aaction=browse

1.2 What Is R?

The official definition of R is given on the project's main website at http://www.r-project.org/about.html:
R is an integrated suite of software facilities for data manipulation, calculation and graphical display. It includes an effective data handling and storage facility, a suite of operators for calculations on arrays, in particular matrices, a large, coherent, integrated collection of intermediate tools for data analysis, graphical facilities for data analysis and display either on‐screen or on hardcopy, and a well‐developed, simple and effective programming language which includes conditionals, loops, user‐defined recursive functions and input and output facilities.
The term ‘environment’ is intended to characterize it as a fully planned and coherent system, rather than an incremental accretion of very specific and inflexible tools, as is frequently the case with other data analysis software.
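For readers coming from R, the language features named in the definition above (conditionals, loops, and user-defined recursive functions) have direct Python counterparts. The following is a minimal sketch rather than an example from the book; the function name `factorial` and the variable `results` are illustrative assumptions.

```python
def factorial(n):
    """User-defined recursive function; returns n! for a non-negative integer n."""
    if n <= 1:                        # conditional: base case of the recursion
        return 1
    return n * factorial(n - 1)       # recursive call


results = []
for k in range(1, 6):                 # loop over 1..5, comparable to `for (k in 1:5)` in R
    results.append(factorial(k))

print(results)   # [1, 2, 6, 24, 120]
```

Note that `range(1, 6)` excludes its upper bound, whereas R's `1:5` includes both ends.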
The Comprehensive R Archive Network (CRAN) hosts thousands of packages for R at https://cran.r-project.org/web/packages/. GitHub (see https://github.com/search?utf8=%E2%9C%93&q=stars%3A%3E1+language%3AR) and Bioconductor also serve as package repositories. You can browse the R packages from all of these repositories at http://www.rdocumentation.org/ (11,885 packages as of 2016).
In the author's view, R is both a language for statistics and computer science and an analytics tool that is very useful for analyzing business data and applying data science to it. In particular, the appeal of R remains that it is free and open source and has a huge number of packages, particularly for data analysis.
The disadvantages of R remain its memory handling in production environments, the lack of incentives for R developers, and sometimes turgid documentation that is oriented more toward academics than enterprise users.

1.3 What Is Data Science?

Data science lies at the intersection of programming, statistics, and business analysis. It is the use of programming tools with...

Table of contents

  1. Cover
  2. Title Page
  3. Table of Contents
  4. Preface
  5. Acknowledgments
  6. Scope
  7. Purpose
  8. Plan
  9. The Zen of Python
  10. 1 Introduction to Python R and Data Science
  11. 2 Data Input
  12. 3 Data Inspection and Data Quality
  13. 4 Exploratory Data Analysis
  14. 5 Statistical Modeling
  15. 6 Data Visualization
  16. 7 Machine Learning Made Easier
  17. 8 Conclusion and Summary
  18. Index
  19. End User License Agreement
