Big Data Analytics with R
eBook - ePub

Simon Walkowiak

  1. 506 pages
  2. English
  3. ePUB (mobile-friendly)
  4. Available on iOS and Android

About this book

Utilize R to uncover hidden patterns in your Big Data

About This Book

  • Perform computational analyses on Big Data to generate meaningful results
  • Gain practical knowledge of the R programming language while working on Big Data platforms such as Hadoop, Spark, H2O, and SQL/NoSQL databases
  • Explore fast, streaming, and scalable data analysis with the most cutting-edge technologies in the market

Who This Book Is For

This book is intended for Data Analysts, Scientists, Data Engineers, Statisticians, and Researchers who want to integrate R with their current or future Big Data workflows.

It is assumed that readers have some experience in data analysis and an understanding of data management and algorithmic processing of large quantities of data; however, they may lack specific skills related to R.

What You Will Learn

  • Learn about the current state of Big Data processing using the R programming language and its powerful statistical capabilities
  • Deploy Big Data analytics platforms with selected Big Data tools supported by R in a cost-effective and time-saving manner
  • Apply the R language to real-world Big Data problems on a multi-node Hadoop cluster, e.g. electricity consumption across various socio-demographic indicators and bike share scheme usage
  • Explore the compatibility of R with Hadoop, Spark, SQL and NoSQL databases, and the H2O platform

In Detail

Big Data analytics is the process of examining large and complex data sets that often exceed available computational capabilities. R is a leading programming language for data science, with powerful functions for tackling problems related to Big Data processing.

The book begins with a brief introduction to the Big Data world and its current industry standards, followed by an introduction to the R language that covers its development, structure, real-world applications, and shortcomings. It then progresses to a revision of the major R functions for data management and transformation. Readers are introduced to cloud-based Big Data solutions (e.g. Amazon EC2 instances and Amazon RDS, Microsoft Azure and its HDInsight clusters) and given guidance on R connectivity with relational and non-relational databases such as MongoDB and HBase. The coverage then expands to Big Data tools such as the Apache Hadoop ecosystem, with its HDFS and MapReduce frameworks, and to other R-compatible tools such as Apache Spark, its machine learning library Spark MLlib, and H2O.

Style and approach

This book will serve as a practical guide to tackling Big Data problems using the R programming language and its statistical environment. Each section of the book will present you with concise and easy-to-follow steps on how to process, transform and analyse large data sets.

Information

Year
2016
ISBN
9781786466457

Big Data Analytics with R

Copyright © 2016 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing, and its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
First published: July 2016
Production reference: 1260716
Published by Packt Publishing Ltd.
Livery Place
35 Livery Street
Birmingham
B3 2PB, UK.
ISBN 978-1-78646-645-7
www.packtpub.com

Credits

Authors
Simon Walkowiak
Copy Editor
Safis Editing
Reviewer
Zacharias Voulgaris
Dipanjan Sarkar
Project Coordinator
Ulhas Kambali
Commissioning Editor
Akram Hussain
Proofreader
Safis Editing
Acquisition Editor
Sonali Vernekar
Indexer
Tejal Daruwale Soni
Content Development Editor
Onkar Wani
Graphics
Kirk D'Penha
Technical Editor
Sushant S Nadkar
Production Coordinator
Arvindkumar Gupta

About the Author

Simon Walkowiak is a cognitive neuroscientist and a managing director of Mind Project Ltd, a Big Data and Predictive Analytics consultancy based in London, United Kingdom. As a former data curator at the UK Data Service (UKDS, University of Essex), Europe's largest socio-economic data repository, Simon has extensive experience in processing and managing large-scale datasets such as censuses, sensor and smart meter data, telecommunication data, and well-known governmental and social surveys such as the British Social Attitudes survey, Labour Force surveys, Understanding Society, National Travel survey, and many other socio-economic datasets collected and deposited by Eurostat, World Bank, Office for National Statistics, Department for Transport, NatCen and International Energy Agency, to mention just a few. Simon has delivered numerous data science and R training courses at public institutions and international companies. He has also taught a course in Big Data Methods in R at major UK universities and at the prestigious Big Data and Analytics Summer School organized by the Institute of Analytics and Data Science (IADS).

Acknowledgement

The inspiration for writing this book came directly from the brilliant work and dedication of many R developers and users, whom I would like to thank first for creating a vibrant and highly supportive community that nourishes the progress of publicly accessible data analytics and the development of the R language. However, this book would never have been completed had I not been surrounded with love and unconditional support from my partner Ignacio, who always knew how to encourage and motivate me, particularly in moments of weakness and when I lacked creativity.
I would also like to thank other members of my family, especially my father Peter, who, despite not sharing my excitement about data science, always listens patiently to my stories about emerging Big Data technologies and their use cases.
Also, I dedicate this book to my friends and former colleagues from the UK Data Service at the University of Essex, where I had an opportunity to work with amazing individuals and experience the best practices in robust data management and processing.
Finally, I highly appreciate the hard work, expertise and feedback offered by many people involved in the creation of this book at Packt Publishing – especially my content development editor Onkar Wani, publishers, and the reviewers, who kindly shared their knowledge with me in order to create a quality and well-received publication.

About the Reviewers

Dr. Zacharias Voulgaris was born in Athens, Greece. He studied Production Engineering and Management at the Technical University of Crete, shifted to Computer Science through a Master's in Information Systems & Technology (City University, London), and then to Data Science through a PhD in Machine Learning (University of London). He has worked at Georgia Tech as a Research Fellow, at an e-marketing startup in Cyprus as an SEO manager, and as a Data Scientist at both Elavon (GA) and G2 (WA). He was also a Program Manager at Microsoft, on a data analytics pipeline for Bing.
Zacharias has authored two books and several scientific articles on Machine Learning, as well as a couple of articles on AI topics. His first book, Data Scientist - The Definitive Guide to Becoming a Data Scientist (Technics Publications), has been translated into Korean and Chinese, while his latest one, Julia for Data Science (Technics Publications), is coming out this September. He has also reviewed a number of data science books (mainly on Python and R) and has a passion for new technologies, literature, and music.
I'd like to thank the people at Packt for inviting me to review this book and for promoting Data Science and particularly Julia through their books. Also, a big thanks to all the great authors out there who choose to publish their work through the lesser-known publishers, keeping the whole process of sharing knowledge a democratic endeavor.
Dipanjan Sarkar is a Data Scientist at Intel, the world's largest silicon company, which is on a mission to make the world more connected and productive. He primarily works on analytics, business intelligence, application development and building large-scale intelligent systems. He received his Master's degree in Information Technology from the International Institute of Information Technology, Bangalore. His areas of specialization include software engineering, data science, machine learning and text analytics.
Dipanjan's interests include learning about new technology, disruptive start-ups, data science and, more recently, deep learning. In his spare time he loves reading, writing, gaming and watching popular sitcoms. He has authored a book on Machine Learning titled R Machine Learning by Example (Packt Publishing) and has also acted as a technical reviewer for severa...

Table of Contents