Aster Data SQL and MapReduce
eBook - ePub

Aster Data SQL and MapReduce

Tom Coffing, John Nolan

Buch teilen
  1. 437 Seiten
  2. English
  3. ePUB (handyfreundlich)
  4. Über iOS und Android verfügbar
eBook - ePub

Aster Data SQL and MapReduce

Tom Coffing, John Nolan

Angaben zum Buch
Buchvorschau
Inhaltsverzeichnis
Quellenangaben

Über dieses Buch

The Aster Data SQL and MapReduce book shows you the fundamental architecture that will allow you to fully understand how Aster Data works. You will be able to create tables, perform partitioning, follow the best Aster modeling strategies, and have a great reference for Aster Data SQL. You will also elevate your knowledge immensely from the brilliant MapReduce portion of the books. You will have MapReduce examples, explanations, and workshops that are designed to make you a MapReduce wizard.

Häufig gestellte Fragen

Wie kann ich mein Abo kündigen?
Gehe einfach zum Kontobereich in den Einstellungen und klicke auf „Abo kündigen“ – ganz einfach. Nachdem du gekündigt hast, bleibt deine Mitgliedschaft für den verbleibenden Abozeitraum, den du bereits bezahlt hast, aktiv. Mehr Informationen hier.
(Wie) Kann ich Bücher herunterladen?
Derzeit stehen all unsere auf Mobilgeräte reagierenden ePub-Bücher zum Download über die App zur Verfügung. Die meisten unserer PDFs stehen ebenfalls zum Download bereit; wir arbeiten daran, auch die übrigen PDFs zum Download anzubieten, bei denen dies aktuell noch nicht möglich ist. Weitere Informationen hier.
Welcher Unterschied besteht bei den Preisen zwischen den Aboplänen?
Mit beiden Aboplänen erhältst du vollen Zugang zur Bibliothek und allen Funktionen von Perlego. Die einzigen Unterschiede bestehen im Preis und dem Abozeitraum: Mit dem Jahresabo sparst du auf 12 Monate gerechnet im Vergleich zum Monatsabo rund 30 %.
Was ist Perlego?
Wir sind ein Online-Abodienst für Lehrbücher, bei dem du für weniger als den Preis eines einzelnen Buches pro Monat Zugang zu einer ganzen Online-Bibliothek erhältst. Mit über 1 Million Büchern zu über 1.000 verschiedenen Themen haben wir bestimmt alles, was du brauchst! Weitere Informationen hier.
Unterstützt Perlego Text-zu-Sprache?
Achte auf das Symbol zum Vorlesen in deinem nächsten Buch, um zu sehen, ob du es dir auch anhören kannst. Bei diesem Tool wird dir Text laut vorgelesen, wobei der Text beim Vorlesen auch grafisch hervorgehoben wird. Du kannst das Vorlesen jederzeit anhalten, beschleunigen und verlangsamen. Weitere Informationen hier.
Ist Aster Data SQL and MapReduce als Online-PDF/ePub verfügbar?
Ja, du hast Zugang zu Aster Data SQL and MapReduce von Tom Coffing, John Nolan im PDF- und/oder ePub-Format sowie zu anderen beliebten Büchern aus Informatik & Data-Warehousing. Aus unserem Katalog stehen dir über 1 Million Bücher zur Verfügung.

Information

Jahr
2014
ISBN
9781940540238

Chapter 1 – The Aster Data Architecture

“Design is not just what it looks like and feels like. Design is how it works.”
- Steve Jobs

What is Parallel Processing?

“After enlightenment, the laundry”
- Zen Proverb
image
“After parallel processing the laundry, enlightenment!”
-Aster Zen Proverb
Two guys were having fun on a Saturday night when one said, “I’ve got to go and do my laundry.” The other said, “What?!” The man explained that if he went to the laundry mat the next morning, he would be lucky to get one machine and be there all day. But, if he went on Saturday night, he could get all the machines. Then, he could do all his wash and dry in two hours. Now that’s parallel processing mixed in with a little dry humor!

Aster Data is a Parallel Processing System

image
The queen takes the request from the user and builds the plan for the vworkers. The vworkers retrieve their portion of the data and pass the results to the queen. The queen delivers the answer set to the user.
Each vworker holds a portion of every table and is responsible for reading and writing the data that it is assigned to and from its disk. Queries are submitted to the queen who plans, optimizes, and manages the execution of the query by sending the necessary subqueries to each vworker. Each vworker performs its subquery or subqueries independent of the others, completely following only the queen’s plan. The final results of queries performed on each vworker is returned to the queen where they can be combined and delivered back to the user.

Each vworker holds a Portion of Every Table

image
Every vworker has the exact same tables, but each vworker holds different rows of those tables.
When a table is created on Aster, each vworker receives that table. When data is loaded, the rows are hashed by a distribution key so each vworker holds a certain portion of the rows. If the queen orders a full table scan of a particular table, then all vworkers simultaneously read their portion of the data. This is the concept of parallel processing.

The Rows of a Table are Spread Across All vworkers

image
A Distribution Key will be hashed to distribute the rows among the vworkers. Each vworker will hold a portion of the rows. This is the concept behind parallel processing.

Aster Tables are defined as Fact or Dimension when Created

image
An Aster Table will be either a Fact or Dimension Table. Fact tables are usually large, and dimension tables are relatively smaller. Fact tables will generally be distributed by hash on a distribution key which is a key column in the table. Dimension tables are usually distributed by replicating the table across all vworkers.

Fact Table

image
A Distribution Key will be hashed to distribute the rows among the vworkers.

A More Detailed Look at the Fact Table Distribution

image
A Distribution Key will be hashed to distribute the rows among the vworkers. The entire row will be held by the vworker, but the row finds its vworker based on hash.

Dimension Table are Replicated

image
Dimension tables are relatively smaller than the large fact table they join to. Dimension tables are usually, but not always, distributed by replicating the table across all vworkers. That means that each vworker has the exact same copy of the entire table.

A Dimension Table is often Replicated across vworkers

image
A replicated table is copied in its entirety to all vworkers.
Fact and Dimension tables are created in this manner for join purposes. Dimension tables are smaller so they are replicated, but Fact tables are distributed by a hash key.

Aster Data has Fact and Dimens...

Inhaltsverzeichnis