The Data Warehouse Toolkit

The Definitive Guide to Dimensional Modeling

Ralph Kimball, Margy Ross

About This Book

Updated new edition of Ralph Kimball's groundbreaking book on dimensional modeling for data warehousing and business intelligence!

The first edition of Ralph Kimball's The Data Warehouse Toolkit introduced the industry to dimensional modeling, and now his books are considered the most authoritative guides in this space. This new third edition is a complete library of updated dimensional modeling techniques, the most comprehensive collection ever. It covers new and enhanced star schema dimensional modeling patterns, adds two new chapters on ETL techniques, includes new and expanded business matrices for 12 case studies, and more.

  • Authored by Ralph Kimball and Margy Ross, known worldwide as educators, consultants, and influential thought leaders in data warehousing and business intelligence
  • Begins with fundamental design recommendations and progresses through increasingly complex scenarios
  • Presents unique modeling techniques for business applications such as inventory management, procurement, invoicing, accounting, customer relationship management, big data analytics, and more
  • Draws real-world case studies from a variety of industries, including retail sales, financial services, telecommunications, education, health care, insurance, e-commerce, and more

Design dimensional databases that are easy to understand and provide fast query response with The Data Warehouse Toolkit: The Definitive Guide to Dimensional Modeling, 3rd Edition.

Information

Publisher
Wiley
Year
2013
ISBN
9781118732281

1
Data Warehousing, Business Intelligence, and Dimensional Modeling Primer

This first chapter lays the groundwork for the following chapters. We begin by considering data warehousing and business intelligence (DW/BI) systems from a high-level perspective. You may be disappointed to learn that we don’t start with technology and tools—first and foremost, the DW/BI system must consider the needs of the business. With the business needs firmly in hand, we work backwards through the logical and then physical designs, along with decisions about technology and tools.
We drive stakes in the ground regarding the goals of data warehousing and business intelligence in this chapter, while observing the uncanny similarities between the responsibilities of a DW/BI manager and those of a publisher.
With this big picture perspective, we explore dimensional modeling core concepts and establish fundamental vocabulary. From there, this chapter discusses the major components of the Kimball DW/BI architecture, along with a comparison of alternative architectural approaches; fortunately, there’s a role for dimensional modeling regardless of your architectural persuasion. Finally, we review common dimensional modeling myths. By the end of this chapter, you’ll have an appreciation for the need to be one-half DBA (database administrator) and one-half MBA (business analyst) as you tackle your DW/BI project.
Chapter 1 discusses the following concepts:
  • Business-driven goals of data warehousing and business intelligence
  • Publishing metaphor for DW/BI systems
  • Dimensional modeling core concepts and vocabulary, including fact and dimension tables
  • Kimball DW/BI architecture’s components and tenets
  • Comparison of alternative DW/BI architectures, and the role of dimensional modeling within each
  • Misunderstandings about dimensional modeling

Different Worlds of Data Capture and Data Analysis

One of the most important assets of any organization is its information. This asset is almost always used for two purposes: operational record keeping and analytical decision making. Simply speaking, the operational systems are where you put the data in, and the DW/BI system is where you get the data out.
Users of an operational system turn the wheels of the organization. They take orders, sign up new customers, monitor the status of operational activities, and log complaints. The operational systems are optimized to process transactions quickly. These systems almost always deal with one transaction record at a time. They predictably perform the same operational tasks over and over, executing the organization’s business processes. Given this execution focus, operational systems typically do not maintain history, but rather update data to reflect the most current state.
Users of a DW/BI system, on the other hand, watch the wheels of the organization turn to evaluate performance. They count the new orders and compare them with last week’s orders, and ask why the new customers signed up, and what the customers complained about. They worry about whether operational processes are working correctly. Although they need detailed data to support their constantly changing questions, DW/BI users almost never deal with one transaction at a time. These systems are optimized for high-performance queries as users’ questions often require that hundreds or hundreds of thousands of transactions be searched and compressed into an answer set. To further complicate matters, users of a DW/BI system typically demand that historical context be preserved to accurately evaluate the organization’s performance over time.
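The contrast between the two worlds can be made concrete with a small sketch. It is only an illustration under stated assumptions: the names `operational_orders`, `order_history`, `record_status`, and `orders_placed_by_week` are hypothetical and do not come from the book, and plain Python structures stand in for real database tables. The operational store overwrites each order's current state, while the analytic store keeps every event so that many rows can later be compressed into a small answer set.

```python
from collections import Counter
from datetime import date

# Hypothetical operational store: one record per order, updated in place.
operational_orders = {}

# Hypothetical analytic store: append-only history of every status change.
order_history = []


def record_status(order_id, status, on):
    """Apply one transaction: overwrite the current state and append to history."""
    operational_orders[order_id] = {"status": status, "as_of": on}
    order_history.append({"order_id": order_id, "status": status, "on": on})


def orders_placed_by_week(history):
    """Analytic query: scan many rows and summarize them per ISO week."""
    weeks = Counter()
    for row in history:
        if row["status"] == "placed":
            iso_year, iso_week, _ = row["on"].isocalendar()
            weeks[(iso_year, iso_week)] += 1
    return dict(weeks)


# Illustrative data only (assumed, not taken from the book).
record_status(1, "placed", date(2013, 7, 1))
record_status(2, "placed", date(2013, 7, 2))
record_status(1, "shipped", date(2013, 7, 3))
record_status(3, "placed", date(2013, 7, 9))

weekly = orders_placed_by_week(order_history)
```

After these four transactions, `operational_orders[1]` only knows that the order is shipped, whereas `weekly` can still compare the orders placed in one week with those placed in the next, because the history was preserved.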
In the first edition of The Data Warehouse Toolkit (Wiley, 1996), Ralph Kimball devoted an entire chapter to describing the dichotomy between the worlds of operational processing and data warehousing. At this time, it is widely recognized that the DW/BI system has profoundly different needs, clients, structures, and rhythms than the operational systems of record. Unfortunately, we still encounter supposed DW/BI systems that are mere copies of the operational systems of record stored on a separate hardware platform. Although these environments may address the need to isolate the operational and analytical environments for performance reasons, they do nothing to address the other inherent differences between the two types of systems. Business users are underwhelmed by the usability and performance provided by these pseudo data warehouses; these imposters do a disservice to DW/BI because they don’t acknowledge that their users have drastically different needs than operational system users.

Goals of Data Warehousing and Business Intelligence

Before we delve into the details of dimensional modeling, it is helpful to focus on the fundamental goals of data warehousing and business intelligence. The goals can be readily developed by walking through the halls of any organization and listening to business management. These recurring themes have existed for more than three decades:
  • “We collect tons of data, but we can’t access it.”
  • “We need to slice and dice the data every which way.”
  • “Business people need to get at the data easily.”
  • “Just show me what is important.”
  • “We spend entire meetings arguing about who has the right numbers rather than making decisions.”
  • “We want people to use information to support more fact-based decision making.”
Based on our experience, these concerns are still so universal that they drive the bedrock requirements for the DW/BI system. Now turn these business management quotations into requirements.
  • The DW/BI system must make information easily accessible. The contents of the DW/BI system must be understandable. The data must be intuitive and obvious to the business user, not merely the developer. The data’s structures and labels should mimic the business users’ thought processes and vocabulary. Business users want to separate and combine analytic data in endless combinations. The business intelligence tools and applications that access the data must be simple and easy to use. They also must return query results to the user with minimal wait times. We can summarize this requirement by simply saying simple and fast.
  • The DW/BI system must present information consistently. The data in the DW/BI system must be credible. Data must be carefully assembled from a variety of sources, cleansed, quality assured, and released only when it is fit for user consumption. Consistency also implies common labels and definitions for the DW/BI system’s contents are used across data sources. If two performance measures have the same name, they must mean the same thing. Conversely, if two measures don’t mean the same thing, they should be labeled differently.
  • The DW/BI system must adapt to change. User needs, business conditions, data, and technology are all subject to change. The DW/BI system must be designed to handle this inevitable change gracefully so that it doesn’t invalidate existing data or applications. Existing data and applications should not be changed or disrupted when the business community asks new questions or new data is added to the warehouse. Finally, if descriptive data in the DW/BI system must be modified, you must appropriately account for the changes and make these changes transparent to the users.
  • The DW/BI system must present information in a timely way. As the DW/BI system is used more intensively for operational decisions, raw data may need to be converted into actionable information within hours, minutes, or even seconds. The DW/BI team and business users need to have realistic expectations for what it means to deliver data when there is little time to clean or validate it.
  • The DW/BI system must be a secure bastion that protects the information assets. An organization’s informational crown jewels are stored in the data warehouse. At a minimum, the warehouse likely contains information about what you’re selling to whom at what price—potentially harmful details in the hands of the wrong people. The DW/BI system must effectively control access to the organization’s confidential information.
  • The DW/BI system must serve as the authoritative and trustworthy foundation for improved decision making. The data warehouse must have the right data to support decision making. The most important outputs from a DW/BI system are the decisions that are made based on the analytic evidence presented; these decisions deliver the business impact and value attributable to the DW/BI system. The original label that predates DW/BI is still the best description of what you are designing: a decision support system.
  • The business community must accept the DW/BI system to deem it successful. It doesn’t matter that you built an elegant solution using best-of-breed products and platforms. If the business community does not embrace the DW/BI environment and actively use it, you have failed the acceptance test. Unlike an operational system implementation where business users have no choice but to use the new system, DW/BI usage is sometimes optional. Business users will embrace the DW/BI system if it is the “simple and fast” source for actionable information.
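The consistency requirement in the list above (measures that share a name must share a meaning) is one of the few that can be checked mechanically. The following is a minimal sketch, assuming a hypothetical catalog of (source, measure name, definition) entries; the catalog contents and the function `find_conflicting_measures` are illustrative assumptions, not something the book prescribes.

```python
from collections import defaultdict

# Hypothetical catalog entries: (source system, measure name, definition).
measure_catalog = [
    ("orders_mart", "revenue", "Gross sales less discounts"),
    ("finance_mart", "revenue", "Gross sales less discounts and returns"),
    ("orders_mart", "order_count", "Number of distinct orders"),
    ("finance_mart", "order_count", "number of distinct orders"),
]


def find_conflicting_measures(catalog):
    """Return measure names that carry more than one distinct definition."""
    definitions = defaultdict(set)
    for _source, name, definition in catalog:
        # Normalize case and whitespace so trivial differences are ignored.
        definitions[name].add(definition.strip().lower())
    return {
        name: sorted(defs)
        for name, defs in definitions.items()
        if len(defs) > 1
    }


conflicts = find_conflicting_measures(measure_catalog)
```

Here `conflicts` flags only `revenue`, because its two sources define it differently. Under the requirement above, either the definitions are reconciled or one of the measures is given a different label.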
Although each requirement on this list is important, the final two are the most critical, and unfortunately, often the most overlooked. Successful data warehousing and business intelligence demands more than being a stellar architect, technician, modeler, or database administrator. With a DW/BI initiative, you have one foot in your informati...
