The Brilliance of Netezza
eBook - ePub

  1. 309 pages
  2. English

About this book

The Brilliance of Netezza will make all readers instant experts on the architecture of Netezza. With its clever design and "secret sauce" of FPGA Cards and Zone Maps, Netezza is quickly becoming a staple of all major enterprise data warehouses. This is a perfect opportunity to make yourself an invaluable asset to your company by obtaining the kind of knowledge that this book contains.

The Brilliance of Netezza by Tom Coffing and John Nolan is available in PDF and/or ePUB format, alongside other books in Computer Science & Data Warehousing.
Chapter 1 – How Netezza Works
“Let me once again explain the rules. Tera-Tom books Rule!”
Tera-Tom Coffing
What is Parallel Processing?
“After enlightenment, the laundry”
- Zen Proverb
“After parallel processing the laundry, enlightenment!”
- Netezza Zen Proverb
Two guys were having fun on a Saturday night when one said, “I’ve got to go and do my laundry.” The other said, “What!?” The first man explained that if he went to the laundromat the next morning, he would be lucky to get one machine and would be there all day. But if he went on Saturday night, he could get all the machines and do all his washing and drying in two hours. Now that’s parallel processing mixed in with a little dry humor!
The Basics of a Single Computer
Data on disk does absolutely nothing. When data is requested, the computer moves the data one block at a time from disk into memory. Once the data is in memory, it is processed by the CPU at lightning speed. All computers work this way. The “Achilles Heel” of every computer is the slow process of moving data from disk to memory. That is all you need to know to be a computer expert!
Netezza Parallel Processes Data
“If the facts don’t fit the theory, change the facts.”
- Albert Einstein
image
Netezza has always been the pioneer in parallel processing and is even credited with the invention of the Appliance. In the picture above, you see that we have 16 orders; four orders placed on each disk. It appears to be four separate computers, but this is one system. Netezza systems work just like a basic computer as they still need to move data from disk into memory, but Netezza divides and conquers. Each Snippet Processing Unit (SPU) holds a portion of the data for every table.
Netezza is Born to be Parallel
Each SPU holds a portion of every table and is responsible for reading and writing its assigned data to and from its own disk. Queries are submitted to the host, which plans, optimizes, and manages the execution of the query by sending the necessary snippets to each SPU. Each SPU performs its snippet or snippets independently of the others, following only the host’s plan. The final results from each SPU are returned to the host, where they are combined and delivered back to the user.
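The host-and-SPU flow described above can be sketched in a few lines of Python. This is only an illustration, not Netezza code: the names `SPU_SLICES`, `run_snippet`, and `host_query` are hypothetical, the order totals are made up, and a thread pool stands in for the SPUs. Each "SPU" sums only its own rows, and the "host" merges the partial results.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical data: 16 order totals, four held by each of 4 SPUs.
SPU_SLICES = [
    [10, 20, 30, 40],
    [5, 15, 25, 35],
    [1, 2, 3, 4],
    [100, 200, 300, 400],
]

def run_snippet(rows):
    """Work done on one SPU: a partial sum and count over its own rows only."""
    return sum(rows), len(rows)

def host_query(slices):
    """Host role: send the same snippet to every SPU, then merge the partials."""
    with ThreadPoolExecutor(max_workers=len(slices)) as pool:
        partials = list(pool.map(run_snippet, slices))
    total = sum(p[0] for p in partials)
    count = sum(p[1] for p in partials)
    return total, count

total, count = host_query(SPU_SLICES)
print(total, count)  # prints: 1190 16
```

No SPU ever looks at another SPU's rows; only the small partial results travel back to the host.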
Starts with a Linux User, a Database User and a Database
The host is a Linux server that runs the Netezza software and utilities. The host controls and coordinates the activity of the Netezza Appliance and performs query optimization. The host also controls table and database operations, gathers and returns query results, and monitors the Netezza system. Netezza systems have two hosts in a highly available (HA) configuration. The host is connected to the Netezza Database, which consists of a series of parallel processors called Snippet Processing Units (SPUs), often referred to as S-Blades. SPU and S-Blade are synonymous, and the terms are used interchangeably. This book will most often refer to them as SPUs.
Each SPU holds a Portion of Every Table
Every SPU has the exact same tables, but each SPU holds different rows of those tables.
When a table is created on Netezza, each SPU receives that table. When data is loaded, the rows are hashed by a distribution key, so each SPU holds a certain portion of the rows. If the host orders a full table scan of a particular table, then all SPUs simultaneously read their portion of the data. This is the concept of parallel processing.
The Rows of a Table are Spread Across All SPUs
A Distribution Key will be hashed to distribute the rows among the SPUs. Each SPU will hold a portion of the rows. This is the concept behind parallel processing.
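To make the idea of hashing a distribution key concrete, here is a minimal Python sketch. It makes several assumptions: the real Netezza hash function is not described here, so CRC32 is used purely as a stand-in, and `NUM_SPUS`, `spu_for`, `distribute`, and the `order_id` column are hypothetical names chosen for illustration.

```python
import zlib
from collections import defaultdict

NUM_SPUS = 4  # assumed number of SPUs for this illustration

def spu_for(key, num_spus=NUM_SPUS):
    """Map a distribution-key value to an SPU number (stand-in hash, not Netezza's)."""
    return zlib.crc32(str(key).encode("utf-8")) % num_spus

def distribute(rows, key_column, num_spus=NUM_SPUS):
    """Place every row on the SPU chosen by hashing its distribution key."""
    slices = defaultdict(list)
    for row in rows:
        slices[spu_for(row[key_column], num_spus)].append(row)
    return slices

# Sixteen hypothetical orders, distributed on order_id.
orders = [{"order_id": i, "total": 10 * i} for i in range(1, 17)]
slices = distribute(orders, "order_id")
for spu in range(NUM_SPUS):
    print(spu, [r["order_id"] for r in slices[spu]])
```

The same key value always hashes to the same SPU, so every row has exactly one home. How evenly the rows spread depends on the hash and on the key chosen.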
The Brilliance of Netezza
The brilliance behind Netezza is in the Field Programmable Gate Array (FPGA), which has many components. First, understand that all data on disk is compressed (4X on average). The FPGA card sits just outside the disk. When a query is run on a table, each SPU holds a portion of that table's data, which is termed a data slice. Each SPU has its own FPGA card and CPU. The first thing analyzed is the zone map, to see whether the data block could hold qualifying data. If it could, the data is transferred to the FPGA card, where it is uncompressed. This strategy stores data compressed and transfers data compressed, saving both space and transfer time. The FPGA card then eliminates columns that are not needed, reducing the block even further. Next, the FPGA card eliminates unneeded rows. Finally, the FPGA card sends the smaller block directly to the CPU for processing. This brilliant strategy delivers only the data that needs to be processed and allows the CPU to focus on what it does best: complex analysis, joins, and aggregations.
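The sequence above (zone map check, uncompress, drop columns, drop rows) can be imitated in software. The following Python sketch is an analogy only: the real work happens in FPGA hardware, and `make_block`, `scan`, the min/max form of the zone map, and the use of zlib and JSON are all assumptions made for illustration.

```python
import json
import zlib

def make_block(rows, zone_column):
    """Store rows compressed, plus an assumed min/max zone map for one column."""
    values = [r[zone_column] for r in rows]
    return {
        "zone_min": min(values),
        "zone_max": max(values),
        "payload": zlib.compress(json.dumps(rows).encode("utf-8")),
    }

def scan(blocks, columns, zone_column, low, high):
    """Return (rows, blocks_uncompressed) for rows with low <= zone_column <= high."""
    out, uncompressed = [], 0
    needed = set(columns) | {zone_column}
    for block in blocks:
        # 1. Zone map: skip blocks that cannot hold qualifying rows.
        if block["zone_max"] < low or block["zone_min"] > high:
            continue
        # 2. Uncompress only the blocks that survived the zone map.
        rows = json.loads(zlib.decompress(block["payload"]).decode("utf-8"))
        uncompressed += 1
        # 3. Eliminate columns that are not needed.
        projected = [{c: row[c] for c in needed} for row in rows]
        # 4. Eliminate rows that do not qualify.
        kept = [r for r in projected if low <= r[zone_column] <= high]
        out.extend({c: r[c] for c in columns} for r in kept)
    return out, uncompressed

blocks = [
    make_block([{"order_id": i, "order_day": 100 + i, "total": 10 * i}
                for i in (1, 2, 3)], "order_day"),
    make_block([{"order_id": i, "order_day": 200 + i, "total": 10 * i}
                for i in (4, 5, 6)], "order_day"),
]
rows, uncompressed = scan(blocks, ["order_id", "total"], "order_day", 204, 205)
print(rows, uncompressed)
```

In this run the first block is never uncompressed, because its zone map shows it cannot contain days 204 to 205. Only two narrow rows reach the stage that stands in for the CPU.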
Compress Engine II – Adaptive Stream Compression
• Automatic, system-wide data compression when data is stored on disk.
• Zero tuning and zero administration required.
• A table is compressed using a patent-pending algorithm that actually compresses at the column level, but the data is stored as an entire row.
• There are different compression strategies that are based on the data in that column.
• All data types are compressed.
• 4X compression is the average, but up to 32X is possible.
• When data is being queried, it is first brought into the FPGA card, where it is uncompressed.
• Automatic compression saves enormous space on disk, and when blocks are transferred from disk into the FPGA card, there is less traffic.
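As a rough illustration of why compressing at the column level can pay off, the sketch below run-length encodes a single column of repeated values. This is not Netezza's patent-pending algorithm, which is not described here. Run-length encoding is simply one well-known strategy that suits a column with long runs, and `rle_encode`, `rle_decode`, and the `status` column are hypothetical.

```python
from itertools import groupby

def rle_encode(values):
    """Run-length encode one column as a list of (value, run_length) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]

def rle_decode(pairs):
    """Rebuild the original column from its (value, run_length) pairs."""
    out = []
    for value, run_length in pairs:
        out.extend([value] * run_length)
    return out

# A hypothetical status column for 16 orders.
status = ["OPEN"] * 8 + ["SHIPPED"] * 6 + ["CLOSED"] * 2
encoded = rle_encode(status)
print(encoded)  # prints: [('OPEN', 8), ('SHIPPED', 6), ('CLOSED', 2)]
print(len(status), "values stored as", len(encoded), "pairs")
```

A column of unique values would gain nothing from this particular strategy, which is one reason a system might choose a different strategy for each column.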
The Achilles heel of any computer system is moving da...

Table of contents

  1. Cover
  2. Title Page
  3. The Tera-Tom Genius Series
  4. Tera-Tom – Author of over 50 Books
  5. The Best Query Tool Works on all Systems
  6. Trademarks and Copyrights
  7. About Tom Coffing
  8. About John Nolan
  9. Table of Contents
  10. Chapter 1 – How Netezza Works
  11. Chapter 2 – A Chip Off The Old Block
  12. Chapter 3 – How Netezza Distributes the Data
  13. Chapter 4 – Deep Dive Inside a Netezza Extent and Row
  14. Chapter 5 – How Joins Work Internally
  15. Chapter 6 – CTAS and CBT
  16. Chapter 7 – Temporary Tables
  17. Chapter 8 – Materialized Views
  18. Chapter 9 – Collecting Statistics
  19. Chapter 10 – Using nzsql
  20. Chapter 11 – Creating Tables
  21. Chapter 12 – Creating Databases and Users and Managing Them
  22. Chapter 13 – Systems Views
  23. Chapter 14 – Explains