SQL Server MVP Deep Dives, Volume 2
eBook - ePub

SQL Server MVP Deep Dives, Volume 2

Greg Low, Louis Davidson, Kimberly Tripp, Kalen Delaney, Paul Nielsen, Brad McGehee, Paul S. Randal

Share book
  1. English
  2. ePUB (mobile friendly)
  3. Available on iOS & Android
eBook - ePub

SQL Server MVP Deep Dives, Volume 2

Greg Low, Louis Davidson, Kimberly Tripp, Kalen Delaney, Paul Nielsen, Brad McGehee, Paul S. Randal

Book details
Book preview
Table of contents
Citations

About This Book

SQL Server MVP Deep Dives, Volume 2 lets you learn from the best in the business—64 SQL Server MVPs offer completely new content in this second volume on topics ranging from testing and policy management to integration services, reporting, and performance optimization techniques...and more.

Frequently asked questions

How do I cancel my subscription?
Simply head over to the account section in settings and click on “Cancel Subscription” - it’s as simple as that. After you cancel, your membership will stay active for the remainder of the time you’ve paid for. Learn more here.
Can/how do I download books?
At the moment all of our mobile-responsive ePub books are available to download via the app. Most of our PDFs are also available to download and we're working on making the final remaining ones downloadable now. Learn more here.
What is the difference between the pricing plans?
Both plans give you full access to the library and all of Perlego’s features. The only differences are the price and subscription period: With the annual plan you’ll save around 30% compared to 12 months on the monthly plan.
What is Perlego?
We are an online textbook subscription service, where you can get access to an entire online library for less than the price of a single book per month. With over 1 million books across 1000+ topics, we’ve got you covered! Learn more here.
Do you support text-to-speech?
Look out for the read-aloud symbol on your next book to see if you can listen to it. The read-aloud tool reads text aloud for you, highlighting the text as it is being read. You can pause it, speed it up and slow it down. Learn more here.
Is SQL Server MVP Deep Dives, Volume 2 an online PDF/ePUB?
Yes, you can access SQL Server MVP Deep Dives, Volume 2 by Greg Low, Louis Davidson, Kimberly Tripp, Kalen Delaney, Paul Nielsen, Brad McGehee, Paul S. Randal in PDF and/or ePUB format, as well as other popular books in Informatica & Programmazione in SQL. We have over one million books available in our catalogue for you to explore.

Information

Year
2011
ISBN
9781617290473

Part 1. Architecture

Edited by Louis Davidson
Database design and architecture are subjects that can be quite polarizing. You’re either super excited by them, or they tend to make you bored to tears and ready to hurl cartoon birds at cartoon pigs. At the SQL PASS conference last year, a keynote was given by Dr. David DeWitte, a Technical Fellow in Microsoft’s Data and Storage Platform Division. For a great number of people, it was amazing stuff that made us sad when he finished talking. For quite a few more, it was painfully uninteresting, because, quoting a person I overheard, “I’m not going to use this.” As a speaker, I’ve given a number of sessions on normalization that get an average speaker score for usefulness that was just above 3 out of 5, because half of the people loved it and the other half wanted me to speak about something else entirely but were kind enough to give me a 2 instead of the dreaded 1.
There in a nutshell loomed the $64,000 question. Is knowing about the architecture of SQL Server internals useful? Is understanding the proper way to do database design of any real value, even if you’re never going to directly design a query optimizer or even a database? Architectural understanding is like understanding why certain foods are good for you. A little understanding will help guide your eating habits, just like a bit of architectural knowledge will give you some insight into how to design and program to meet the needs of SQL Server’s architecture.
One of the underlying goals of SQL is to be a completely declarative language where you tell the computer what you want to do and it does so almost magically. In reality, though, it’s not quite that simple unless you have very big hardware and very small needs. SQL Server doesn’t run like an Xbox game on a console, where all programs are built to work on one set of hardware parameters. It’s architected to run on lowly computers like my little laptop and on computers with more CPUs than you can count on your fingers and toes. Understanding the architecture of the SQL Server engine and how it works with the hardware will help you optimize your code and the servers where it operates. And understanding how the optimizer works will help you see why normalization and constraints are usually good for performance rather than bad.
Architecture is a wide topic, so with six chapters we have an interesting range of subjects. We have two chapters that largely center on different aspects of keys and uniqueness, two that are general database architecture overviews, one on generalizing your designs, and even one that’s about physical storage architecture. A bit of a hodgepodge to be sure, but as you read the rest of this book, you’ll notice that the coding, reporting, and operational types of chapters are far more plentiful because, as a whole, MVPs aren’t theorists or deep internal architecture implementers but expert users of the technology with a solid knowledge of how the architecture affects them and a desire to share their knowledge with you.
The focus on data architecture in the lion’s share of the sections shouldn’t surprise you. Database design is something that everyone has to do, and it’s quite difficult to do correctly, not so much due to the fact that the task is difficult from a technical or theoretical side (in fact it turns out to be easy to apply), but because it’s rare to start a completely new database with completely new code to access it. In this section you’ll get some guidance on how to implement a solid, working, and well-formed database prepared for today’s business as well as tomorrow’s.

About the editor

Louis Davidson has been in the IT industry for 17 years as a corporate database developer and architect. He’s been a Microsoft MVP for 7 years and has written 4 books on database design. Currently, Louis serves as the Data Architect for the Christian Broadcasting Network, supporting offices in Virginia Beach, Virginia, and Nashville, Tennessee. Louis has a bachelor’s degree in computer science from the University of Tennessee at Chattanooga, with a minor in mathematics. For more information, please visit his website at http://drsql.org.

Chapter 1. Where are my keys?

Ami Levin
If you walk into a room full of DBAs or DB developers and you feel like having the same kind of fun as setting fire to a dry hayfield, just drop this question: “What’s a better design, using natural keys or artificial keys?” Satisfaction guaranteed. When I started to study database design, this was one of the first hot controversies I encountered. If you Google the phrase “natural vs. artificial keys,” you’ll come up with more than 150 million results, including endless debates, numerous articles, blog posts with passionate replies, long theoretical and practical arguments, and even the occasional profanity. In this chapter, I would like to take you on a tour, very much the same way I have traveled with this dispute, and I hope that together we can reach some useful insights into the essence of both positions. Who can tell? Perhaps things won’t look as stark afterward.
The seeds of the dispute, curiously, were planted by the inventors of the relational model. The basis for all major databases today was first introduced by Edgar F. Codd of IBM in 1969 and later extended in collaboration with Chris Date and others. Although they were in accord on the main tenets of the model, the relational database forefathers held slightly different views on certain aspects of the model, and this early difference of opinions led to current-day repercussions, as you’ll see later. Date seemed to be a keen supporter of using artificial keys, but Codd had quite a few reservations on the matter.
What caused this difference of opinion to grow into a controversy, and what has kept the battle alive for so long? Its growth to the rank of controversy stems from the fact that database design has a crucial impact on performance, modularity, consistency, and scalability. This makes the issue of correct database design ever so important. Moreover, once in production the basic design of a database is probably the hardest aspect to revise.
I hope that when you approach your next design project, this chapter will prompt you to take the few extra hours, days, or weeks to consider your key selection with the seriousness that it truly deserves. If this chapter will help you save even one hour of work or one dollar of unnecessary expense in the future, it was worth all the work I’ve invested in it.
The unforeseen consequences of database design flaws
I’ve seen numerous cases where the DBAs and developers admit that there are serious flaws in the database design (“I inherited it this way”), but they’ll probably have to live with those flaws for many years to come. Their very valid reason is that the original design was created many years ago when the design characteristics, now recognized as flaws, had little or no negative impact. Since then, the natural evolution and expansion of the database have magnified the impact of those flaws. If the original designers had known the true extent of the resources that would be spent as a direct result of their design flaws, I’m sure they would have taken the time to think things over once more.

Keys in the relational model

The importance of keys is apparent in seminal works, and in particular the works regarding data normalization. The concept of normalization was first introduced by Codd in his ground-breaking paper “A Relational Model of Data for Large Shared Data Banks.” This paper, which set the foundations for all relational databases as we know them today, is freely available on the internet, and I highly recommend it to anyone who deals with database design. A few years later, Codd and Date elaborated this concept into the normal forms as we know them today.
It’s obvious from the normalization rules that keys are a fundamental entity that plays a critical role in the design of a relational database. The schema design is tested and validated for correctness based on the keys and how they relate to all the nonkey columns that make up your database tables. Choosing the correct model to represent the reality the database will serve is fully key dependent. How much thought have you given your keys until now?
Databases are a mirror of reality
I’d like to emphasize the fact that a relational model database is (surprise, surprise)... a model. That may sound obvious, but it does carry a lot of meaning and corresponding responsibility. Dictionary.com defines model as “a simplified represen-tation of a system or phenomenon, as in the sciences or economics, with any hypotheses required to describe the system or explain the phenomenon, often mathematically.” A database models, in relational form or some other form, some physical or logical universe, entity, or phenomenon (or a part of one of those). It’s designed to be a mirror, in relational language and terms, of its real characteristics and functions.
A key was originally defined by Codd as follows: “Normally, one domain [column] (or combination of domains) of a given relation [table] has values which uniquely identify each element [row] (n-tuple) of that relation. Such a domain is called a primary key.” Codd realized that there may be multiple columns within a table that may be candidates to identify the row. Therefore he offers, “Whenever a relation has two or more non-redundant primary keys, one of them is arbitrarily selected and called the primary key of that relation.”
So the primary key is an arbitrary selection! Or is it? Although logically any of the candidate keys can be used to identify the row, a primary key must shoulder several other responsibilities besides unique identification. These include most often serving as the parent node in foreign key relationships, being used by default as the clustering key for the table’s physical index structure, and not allowing NULL values. In this chapter, I’ll use the following highly simplified definitions of the various types of keys:
  • A simple key is a key that consists of a single column.
  • A composite key is a key that consists of more than one column.
  • A candidate key is a key (simple or composite) that could be used to uniquely identify the row.
  • A primary key is the key selected from among the candidate keys to...

Table of contents