Practical Big Data Analytics
Nataraj Dasgupta, Giancarlo Zaccone, Patrick Hannah
- 412 pages
- English
- ePUB (mobile-friendly)
- Available on iOS and Android
About this book
Get command of your organizational Big Data using the power of data science and analytics
Key Features
- A perfect companion for boosting your skills in storing, processing, and analyzing Big Data so that you can make informed business decisions
- Work with the best tools such as Apache Hadoop, R, Python, and Spark for NoSQL platforms to perform massive online analyses
- Get expert tips on statistical inference, machine learning, mathematical modeling, and data visualization for Big Data
Book Description
Big Data analytics refers to the strategies organizations use to collect, organize, and analyze large amounts of data in order to uncover valuable business insights that traditional systems cannot surface. Crafting an enterprise-scale, cost-efficient Big Data and machine learning solution that extracts insights and value from your organization's data is a challenge. Today, with hundreds of new Big Data systems, machine learning packages, and BI tools, selecting the right combination of technologies is an even greater challenge. This book will help you do that.
With the help of this guide, you will be able to bridge the gap between the theoretical world of technology and the practical reality of building corporate Big Data and data science platforms. You will get hands-on exposure to Hadoop and Spark, build machine learning dashboards using R and R Shiny, create web-based apps using NoSQL databases such as MongoDB, and even learn how to write R code for neural networks.
By the end of the book, you will have a very clear and concrete understanding of what Big Data analytics means, how it drives revenues for organizations, and how you can develop your own Big Data analytics solution using different tools and methods articulated in this book.
What you will learn
- Get a 360-degree view of the world of Big Data, data science, and machine learning
- Explore a broad range of technical and business Big Data analytics topics that cater to the interests of technical experts as well as corporate IT executives
- Get hands-on experience with industry-standard Big Data and machine learning tools such as Hadoop, Spark, MongoDB, KDB+, and R
- Create production-grade machine learning BI dashboards using R and R Shiny with step-by-step instructions
- Learn how to combine open-source Big Data, machine learning, and BI tools to create low-cost business analytics applications
- Understand corporate strategies for successful Big Data and data science projects
- Go beyond general-purpose analytics to develop cutting-edge Big Data applications using emerging technologies
Who this book is for
The book is intended for existing and aspiring Big Data professionals who wish to become the go-to person in their organization when it comes to Big Data architecture, analytics, and governance. While no prior knowledge of Big Data or related technologies is assumed, it will be helpful to have some programming experience.
Big Data Mining with NoSQL
- Why NoSQL?
- NoSQL databases
- In-memory databases
- Columnar databases
- Document-oriented databases
- Key-value databases
- Graph databases
- Other NoSQL types and summary
- Hands-on exercise on NoSQL systems
Why NoSQL?
The ACID, BASE, and CAP properties
ACID and SQL
- Atomicity: This indicates that a database transaction either executes in full or does not execute at all. In other words, either all operations in a transaction are committed, that is, persisted in their entirety, or none of them are. There is no scope for a partial execution of a transaction.
- Consistency: The constraints on the data, that is, the rules that determine what counts as valid data within a database, hold throughout the database. Every transaction must leave the data in a state that satisfies these rules, and no instance of the database abides by rules that differ from those in other instances.
- Isolation: This property defines the rules of how concurrent operations (transactions) read and write data. For example, if a certain record is being updated while another process reads the same record, the isolation level of the database system determines which version of the data is returned to the user.
- Durability: The durability of a database system generally indicates that committed transactions will remain persistent even in the event of a system failure. This is generally managed by the use of transaction logs that databases can refer to during recovery.
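As an example, consider the steps involved in a cash withdrawal from an ATM, which the database must treat as a single transaction:
- User withdraws cash from an ATM
- The bank checks the current balance of the user
- The database system deducts the corresponding amount from the user's account
- The database system updates the amount in the user's account to reflect the change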
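The withdrawal steps listed above can be sketched as a single atomic transaction. The following is a minimal illustration, not a production design, using Python's built-in `sqlite3` module. The table names (`accounts`, `withdrawals`), the helper functions, and the `fail_midway` flag are all hypothetical and exist only to simulate a failure after the deduction has been made.

```python
import sqlite3


def open_demo_bank():
    """Create an in-memory database with one account holding 100 units.

    The schema is an assumption made for this sketch.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE withdrawals (account_id INTEGER NOT NULL, amount INTEGER NOT NULL)"
    )
    conn.execute("INSERT INTO accounts (id, balance) VALUES (1, 100)")
    conn.commit()
    return conn


def balance_of(conn, account_id):
    """Return the current balance of the given account."""
    row = conn.execute(
        "SELECT balance FROM accounts WHERE id = ?", (account_id,)
    ).fetchone()
    return row[0]


def withdraw(conn, account_id, amount, fail_midway=False):
    """Check the balance, deduct the amount, and record the withdrawal.

    Returns True if everything was committed, False if nothing was.
    """
    try:
        # Used as a context manager, the connection commits when the block
        # succeeds and rolls back when an exception escapes it.
        with conn:
            if balance_of(conn, account_id) < amount:
                raise ValueError("insufficient funds")
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, account_id),
            )
            if fail_midway:
                # Hypothetical failure between the two writes.
                raise RuntimeError("simulated failure after the deduction")
            conn.execute(
                "INSERT INTO withdrawals (account_id, amount) VALUES (?, ?)",
                (account_id, amount),
            )
    except (ValueError, RuntimeError):
        return False
    return True
```

Calling `withdraw(conn, 1, 30)` on a connection returned by `open_demo_bank()` leaves a balance of 70. A second call with `fail_midway=True` returns `False` and the balance stays at 70, because the deduction is rolled back together with the rest of the transaction. This is the all-or-nothing behavior that atomicity describes.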