Big Data Architect's Handbook
eBook - ePub

A guide to building proficiency in tools and systems used by leading big data experts

Syed Muhammad Fahad Akhtar

  1. 486 pages
  2. English
  3. ePUB (mobile friendly)
  4. Available on iOS and Android

About this book

A comprehensive end-to-end guide that gives hands-on practice in big data and Artificial Intelligence

Key Features

  • Learn to build and run a big data application with sample code
  • Explore examples to implement activities that a big data architect performs
  • Use Machine Learning and AI for structured and unstructured data

Book Description

Big data architects are the "masters" of data and hold high value in today's market. Handling big data, be it of good or bad quality, is not an easy task. The prime job of any big data architect is to build an end-to-end big data solution that integrates data from different sources and analyzes it to find useful, hidden insights.

Big Data Architect's Handbook takes you through developing a complete, end-to-end big data pipeline, laying the foundation and providing the knowledge required to be an architect in big data. From understanding the design considerations to implementing a solid, efficient, and scalable data pipeline, this book walks you through all the essential aspects of big data. It also gives you an overview of how you can bring together various big data tools, such as Apache Hadoop and Elasticsearch, to build an efficient big data solution.

By the end of this book, you will be able to build your own design system that integrates, maintains, visualizes, and monitors your data. In addition, you will have a smooth design flow in each process, putting insights into action.

What you will learn

  • Learn Hadoop Ecosystem and Apache projects
  • Understand and compare NoSQL databases and essential software architecture
  • Learn cloud infrastructure design considerations for big data
  • Explore application scenarios of big data tools for daily activities
  • Learn to analyze and visualize results to uncover valuable insights
  • Build and run a big data application with sample code from end to end
  • Apply Machine Learning and AI to perform big data intelligence
  • Practice the daily activities performed by big data architects

Who this book is for

Big Data Architect's Handbook is for you if you are an aspiring data professional, developer, or IT enthusiast who aims to be an all-round architect in big data. This book is your one-stop solution to enhance your knowledge and carry out easy to complex activities required to become a big data architect.


Information

Year: 2018
ISBN: 9781788836388
Edition: 1
Subtopic: Data Processing

NoSQL Database

There is currently a lot of hype about NoSQL databases, especially in the big data world. People discuss different aspects of NoSQL and how to get the most out of it, and questions come to mind, such as: What is it? How is it different from an RDBMS? How do I select an appropriate framework and tool while architecting my project?
In this chapter, we will go through NoSQL and answer these questions to build a strong foundation. We will then cover several NoSQL databases from a practical perspective, including their installation, basic configuration, and most of the operations normally performed in a database. We will mainly discuss the following topics:
  • What is NoSQL?
  • Benefits of NoSQL
  • Comparison of NoSQL and RDBMS
  • CAP theorem and ACID properties
  • Different data models in NoSQL
  • Apache Cassandra
  • MongoDB
  • Neo4j
Let's start exploring the NoSQL world with a question: what is NoSQL?

What is NoSQL?

When we say the word database, the most common definition that comes to mind is well-structured, formatted data stored in tabular form. But what happens with a large amount of data that is unstructured and has no proper formatting or schema? This is where NoSQL databases come into the picture. NoSQL is a mechanism for storing data that doesn't have a fixed schema. Many people assume that the name means No SQL, whereas it actually stands for Not Only SQL. It means that such a database doesn't rely only on SQL for storing and manipulating data; it can also be accessed through other query languages and programming interfaces. Now, we will discuss some of the benefits of NoSQL databases.
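To make the idea of storage without a fixed schema concrete, here is a minimal sketch in Python. It is not the API of any particular NoSQL product: the `collection` list and the `insert` and `find` helpers are hypothetical names invented for illustration, and a plain in-memory list stands in for the database.

```python
import json

# Hypothetical in-memory "collection": a list of documents with no fixed schema.
collection = []

def insert(document: dict) -> None:
    """Store a document as-is; no column definitions are checked."""
    collection.append(document)

def find(**criteria) -> list:
    """Return the documents whose fields match all of the given criteria."""
    return [
        doc for doc in collection
        if all(doc.get(field) == value for field, value in criteria.items())
    ]

# Three records with completely different sets of fields live side by side.
insert({"name": "Alice", "email": "alice@example.com"})
insert({"name": "Bob", "phones": ["555-0100", "555-0101"], "city": "Lahore"})
insert({"sensor_id": 17, "reading": 21.5, "unit": "C"})

matches = find(name="Bob")
print(json.dumps(matches, indent=2))
```

In a relational table, the second and third records would require either new columns or separate tables; here each record simply carries its own structure.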

Benefits of NoSQL databases

NoSQL helps us to deal with data that we were not able to store or maintain using traditional system approaches. The following are the key benefits of NoSQL databases:
  • Schema-less data storage is one of the main advantages of NoSQL. It allows data of all types to be stored in different formats and schemas, which supports more robust and agile development.
  • NoSQL servers scale horizontally, which means it is easy to scale capacity up or down. Simply add or remove servers to increase or decrease storage and computation power.
  • NoSQL works in a clustered environment that is mainly built on commodity hardware, which is much less expensive than a single high-end, highly reliable server. Because data is distributed across the cluster, this need not compromise performance or reliability.
  • NoSQL databases are spread across multiple nodes, with data replicated to multiple servers. Some NoSQL database frameworks even work without the master-slave concept, which makes them highly available with no single point of failure.
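The horizontal scaling and replication described in the points above can be sketched roughly as follows. This is an illustrative simplification, not how any specific framework places data: the node names, the `replicas_for` function, and the hash-ring scheme are all assumptions made for the example.

```python
import hashlib

def _hash(value: str) -> int:
    """Stable hash so that placement is the same on every run."""
    return int(hashlib.sha256(value.encode("utf-8")).hexdigest(), 16)

def replicas_for(key: str, nodes: list, replication_factor: int = 2) -> list:
    """Pick the nodes that hold copies of `key`.

    Nodes are ordered on a hash ring; the key goes to the first node at or
    after its own hash position, and to the following nodes on the ring.
    """
    ring = sorted(nodes, key=_hash)
    count = min(replication_factor, len(ring))
    key_pos = _hash(key)
    # Index of the first node positioned at or after the key (wrap around to 0).
    start = next((i for i, n in enumerate(ring) if _hash(n) >= key_pos), 0)
    return [ring[(start + i) % len(ring)] for i in range(count)]

cluster = ["node-a", "node-b", "node-c"]
before = replicas_for("user:42", cluster)

# Scale out by adding a server; no existing server needs bigger hardware.
cluster.append("node-d")
after = replicas_for("user:42", cluster)

# Simulate the failure of one replica: another copy is still reachable.
survivors = [n for n in after if n != after[0]]
print(before, after, survivors)
```

Every node runs the same placement logic, so no dedicated master is needed to locate a key, and losing one replica still leaves a copy on another server.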
These are some of the key advantages of using NoSQL databases over the traditional approach. Next, we will compare NoSQL with RDBMS databases to give you a clear understanding of the differences between the two database types.

NoSQL versus RDBMS

We will now discuss the different characteristics of NoSQL and relational databases with a point by point comparison to give you a clear understanding:
| RDBMS | NoSQL |
| --- | --- |
| RDBMS are relational databases. | NoSQL databases are normally non-relational or distributed databases. |
| RDBMS databases store data in tabular form, which means the data is organized in rows and columns. | NoSQL databases are key-value, document, column-based, or graph-based datastores. |
| An RDBMS has a predefined schema. | NoSQL databases have a dynamic schema. |
| An RDBMS is vertically scalable: to add capacity, you typically upgrade the hardware of the same server, increasing its computation or other hardware resources. | A NoSQL database is horizontally scalable: you can add more servers to the cluster to increase computation power and other hardware resources, such as storage and memory. |
| An RDBMS relies on Structured Query Language (SQL) for all related operations. Different RDBMS tools extend SQL with their own advanced functions for handling databases. | Some NoSQL frameworks use a basic, SQL-like query format. The query language differs from framework to framework, depending on which one you select to handle your NoSQL database. |
| SQL supports complex and nested queries to extract the desired output. | NoSQL frameworks normally handle the basic CRUD operations as far as SQL support is concerned. They generally do not offer an interface for complex and nested queries. |
| SQL databases are more reliable for heavily transactional workloads. | NoSQL databases are generally less suited to heavily transactional workloads. |
| SQL databases rely on the ACID (atomicity, consistency, isolation, durability) properties. We will discuss ACID properties in detail as we proceed in this chapter. | NoSQL mainly follows the CAP (consistency, availability, partition tolerance) theorem. We will discuss the CAP theorem in detail as we proceed in this chapter. |
| SQL databases are classified as open source or commercial applications. | NoSQL database frameworks are classified by the data model they support, such as key-value data stores, column stores, and so on. |
| Examples of RDBMS are MySQL, Microsoft SQL Server, Oracle, and PostgreSQL. | Examples of NoSQL frameworks are Apache Cassandra, MongoDB, HBase, Redis, and Neo4j. |
SQL stands for Structured Query Language. It is a language used in an RDBMS (Relational Database Management System) to store and retrieve data from structured databases. It is standardized by ANSI.
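The contrast between nested SQL queries and basic CRUD access can be illustrated with a short Python sketch. The relational side uses the standard library's `sqlite3` module with an in-memory database. The key-value side is only a plain dictionary standing in for a hypothetical store, and the `orders` table, key names, and sample values are invented for the example. Real NoSQL frameworks vary widely in what they support.

```python
import sqlite3

# Relational side: a predefined schema queried with SQL (in-memory SQLite).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders (customer, amount) VALUES (?, ?)",
    [("alice", 150.0), ("bob", 40.0), ("carol", 200.0)],
)

# Nested query: customers whose order is above the average order amount.
rows = conn.execute(
    "SELECT customer FROM orders "
    "WHERE amount > (SELECT AVG(amount) FROM orders) "
    "ORDER BY customer"
).fetchall()
sql_result = [customer for (customer,) in rows]

# Key-value side: a dict standing in for a hypothetical store that offers
# only create, read, update, and delete by key.
kv_store = {
    "order:1": {"customer": "alice", "amount": 150.0},
    "order:2": {"customer": "bob", "amount": 40.0},
    "order:3": {"customer": "carol", "amount": 200.0},
}
kv_store["order:4"] = {"customer": "dave", "amount": 10.0}  # create
kv_store["order:4"]["amount"] = 15.0                        # update
order = kv_store["order:4"]                                 # read
del kv_store["order:4"]                                     # delete

# The same "above average" question must be answered in application code.
amounts = [o["amount"] for o in kv_store.values()]
average = sum(amounts) / len(amounts)
kv_result = sorted(o["customer"] for o in kv_store.values() if o["amount"] > average)

print(sql_result, kv_result)
conn.close()
```

Both approaches arrive at the same answer, but the relational database evaluates the nested query itself, whereas with key-based access the filtering logic moves into the application.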
Now we will discuss the CAP theorem, which relates to NoSQL databases, and the ACID properties, which relate to SQL databases, to understand some of the differences between NoSQL and relational databases mentioned earlier.

The C...
