R Data Science Essentials
eBook - ePub

  1. 154 pages
  2. English
  3. ePUB (mobile friendly)

About this book

Learn the essence of data science and visualization using R in no time at all

About This Book

  • Become a pro at making stunning visualizations and dashboards quickly and without hassle
  • Apply the R programming language and useful statistical techniques to make better business decisions
  • From seasoned authors comes a book that offers you a plethora of fast-paced techniques to detect and analyze data patterns

Who This Book Is For

If you are an aspiring data scientist or analyst who has a basic understanding of data science and has basic hands-on experience in R or any other analytics tool, then R Data Science Essentials is the book for you.

What You Will Learn

  • Perform data preprocessing and basic operations on data
  • Implement visual and non-visual data exploration techniques
  • Mine patterns from data using affinity and sequential analysis
  • Use different clustering algorithms and visualize them
  • Implement logistic and linear regression and find out how to evaluate and improve the performance of an algorithm
  • Extract patterns through visualization and build a forecasting algorithm
  • Build a recommendation engine using different collaborative filtering algorithms
  • Create stunning visualizations and dashboards using ggplot and R Shiny

In Detail

With organizations increasingly embedding data science across their enterprise and management becoming more data-driven, analysts and managers urgently need to understand the key concepts of data science. The data science concepts discussed in this book will help you make key decisions and solve the complex problems you will inevitably face in this new world.

R Data Science Essentials will introduce you to various important concepts in the field of data science using R. We start by reading data from multiple sources, then move on to processing the data, extracting hidden patterns, building predictive and forecasting models, building a recommendation engine, and communicating to the user through stunning visualizations and dashboards.

By the end of this book, you will have an understanding of some very important techniques in data science, be able to implement them using R, understand and interpret the outcomes, and know how they help businesses make decisions.

Style and approach

This easy-to-follow guide contains hands-on examples of the concepts of data science using R.

Table of Contents

R Data Science Essentials
Credits
About the Authors
About the Reviewers
www.PacktPub.com
Support files, eBooks, discount offers, and more
Why subscribe?
Free access for Packt account holders
Preface
What this book covers
What you need for this book
Who this book is for
Conventions
Reader feedback
Customer support
Downloading the example code
Errata
Piracy
Questions
1. Getting Started with R
Reading data from different sources
Reading data from a database
Data types in R
Variable data types
Data preprocessing techniques
Performing data operations
Arithmetic operations on the data
String operations on the data
Aggregation operations on the data
Mean
Median
Sum
Maximum and minimum
Standard deviation
Control structures in R
Control structures – if and else
Control structures – for
Control structures – while
Control structures – repeat and break
Control structures – next and return
Bringing data to a usable format
Summary
2. Exploratory Data Analysis
The Titanic dataset
Descriptive statistics
Box plot
Exercise
Inferential statistics
Univariate analysis
Bivariate analysis
Multivariate analysis
Cross-tabulation analysis
Graphical analysis
Summary
3. Pattern Discovery
Transactional datasets
Using the built-in dataset
Building the dataset
Apriori analysis
Support, confidence, and lift
Support
Confidence
Lift
Generating filtering rules
Plotting
Dataset
Rules
Sequential dataset
Apriori sequence analysis
Understanding the results
Reference
Business cases
Summary
4. Segmentation Using Clustering
Datasets
Reading and formatting the dataset in R
Centroid-based clustering and an ideal number of clusters
Implementation using K-means
Visualizing the clusters
Connectivity-based clustering
Visualizing the connectivity
Business use cases
Summary
5. Developing Regression Models
Datasets
Sampling the dataset
Logistic regression
Evaluating logistic regression
Linear regression
Evaluating linear regression
Methods to improve the accuracy
Ensemble models
Replacing NA with mean or median
Removing the highly correlated values
Removing outliers
Summary
6. Time Series Forecasting
Datasets
Extracting patterns
Forecasting using ARIMA
Forecasting using Holt-Winters
Methods to improve accuracy
Summary
7. Recommendation Engine
Dataset and transformation
Recommendations using user-based CF
Recommendations using item-based CF
Challenges and enhancements
Summary
8. Communicating Data Analysis
Dataset
Plotting using the googleVis package
Creating an interactive dashboard using Shiny
Summary
Index

R Data Science Essentials

Copyright © 2016 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the authors, nor Packt Publishing, and its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
First published: January 2016
Production reference: 1040116
Published by Packt Publishing Ltd.
Livery Place
35 Livery Street
Birmingham B3 2PB, UK.
ISBN 978-1-78528-654-4
www.packtpub.com

Credits

Authors
Raja B. Koushik
Sharan Kumar Ravindran
Reviewers
Jeremy Gray
Navin K Manaswi
Commissioning Editor
Dipika Gaonkar
Acquisition Editor
Manish Nainan
Content Development Editor
Mehvash Fatima
Technical Editor
Suwarna Patil
Copy Editor
Tasneem Fatehi
Project Coordinator
Shipra Chawhan
Proofreader
Safis Editing
Indexer
Mariammal Chettiyar
Graphics
Disha Haria
Production Coordinator
Arvindkumar Gupta
Cover Work
Arvindkumar Gupta

About the Authors

Raja B. Koushik is a business intelligence professional with over 7 years of experience and is currently working in one of the leading international IT services companies. His primary interest lies in business intelligence technologies, such as ETL, reporting, and dashboarding, along with analytics based on statistics. He has worked with one of the world's largest companies for both their U.S. and UK businesses in the healthcare and leasing domains. He holds an engineering degree with specialization in information technology from Anna University.
