Joe Celko's SQL for Smarties
eBook - ePub

Joe Celko's SQL for Smarties

Advanced SQL Programming

  1. 840 pages
  2. English
  3. ePUB (mobile friendly)
  4. Available on iOS & Android
eBook - ePub

Joe Celko's SQL for Smarties

Advanced SQL Programming

About this book

SQL for Smarties was hailed as the first book devoted explicitly to the advanced techniques needed to transform an experienced SQL programmer into an expert. Now, 10 years later and in the third edition, this classic still reigns supreme as the book written by an SQL master that teaches future SQL masters. These are not just tips and techniques; Joe also offers the best solutions to old and new challenges and conveys the way you need to think in order to get the most out of SQL programming efforts for both correctness and performance. In the third edition, Joe features new examples and updates to SQL-99, expanded sections of Query techniques, and a new section on schema design, with the same war-story teaching style that made the first and second editions of this book classics.- Expert advice from a noted SQL authority and award-winning columnist, who has given ten years of service to the ANSI SQL standards committee and many more years of dependable help to readers of online forums.- Teaches scores of advanced techniques that can be used with any product, in any SQL environment, whether it is an SQL-92 or SQL-99 environment.- Offers tips for working around system deficiencies.- Continues to use war stories--updated!--that give insights into real-world SQL programming challenges.

Frequently asked questions

Yes, you can cancel anytime from the Subscription tab in your account settings on the Perlego website. Your subscription will stay active until the end of your current billing period. Learn how to cancel your subscription.
No, books cannot be downloaded as external files, such as PDFs, for use outside of Perlego. However, you can download books within the Perlego app for offline reading on mobile or tablet. Learn more here.
Perlego offers two plans: Essential and Complete
  • Essential is ideal for learners and professionals who enjoy exploring a wide range of subjects. Access the Essential Library with 800,000+ trusted titles and best-sellers across business, personal growth, and the humanities. Includes unlimited reading time and Standard Read Aloud voice.
  • Complete: Perfect for advanced learners and researchers needing full, unrestricted access. Unlock 1.4M+ books across hundreds of subjects, including academic and specialized titles. The Complete Plan also includes advanced features like Premium Read Aloud and Research Assistant.
Both plans are available with monthly, semester, or annual billing cycles.
We are an online textbook subscription service, where you can get access to an entire online library for less than the price of a single book per month. With over 1 million books across 1000+ topics, we’ve got you covered! Learn more here.
Look out for the read-aloud symbol on your next book to see if you can listen to it. The read-aloud tool reads text aloud for you, highlighting the text as it is being read. You can pause it, speed it up and slow it down. Learn more here.
Yes! You can use the Perlego app on both iOS or Android devices to read anytime, anywhere — even offline. Perfect for commutes or when you’re on the go.
Please note we cannot support devices running on iOS 13 and Android 7 or earlier. Learn more about using the app.
Yes, you can access Joe Celko's SQL for Smarties by Joe Celko in PDF and/or ePUB format, as well as other popular books in Computer Science & Databases. We have over one million books available in our catalogue for you to explore.

Information

CHAPTER 1

Database Design
THIS CHAPTER DISCUSSES THE DDL (Data Definition Language), which is used to create a database schema. It is related to the next chapter on the theory of database normalization. Most bad queries start with a bad schema. To get data out of the bad schema, you have to write convoluted code, and you are never sure if it did what it was meant to do.
One of the major advantages of databases, relational and otherwise, was that the data could be shared among programs so that an enterprise could use one trusted source for information. Once the data was separated from the programs, we could build tools to maintain, back up, and validate the data in one place, without worrying about hundreds or even thousands of application programs possibly working against each other.
SQL has spawned a whole branch of data modeling tools devoted to designing its schemas and tables. Most of these tools use a graphic or text description of the rules and the constraints on the data to produce a schema declaration statement that can be used directly in a particular SQL product. It is often assumed that a CASE tool will automatically prevent you from creating a bad design. This is simply not true.
Bad schema design leads to weird queries that are trying to work around the flaws. These flaws can include picking the wrong data types, denormalization, and missing or incorrect constraints. As Elbert Hubbard (American author, 1856–1915) put it: “Genius may have its limitations, but stupidity is not thus handicapped.”

1.1 Schema and Table Creation

The major problem in learning SQL is that programmers are used to thinking in terms of files rather than tables.
Programming languages are usually based on some underlying model; if you understand the model, the language makes much more sense. For example, FORTRAN is based on algebra. This does not mean that FORTRAN is exactly like algebra. But if you know algebra, FORTRAN does not look all that strange to you. You can write an expression in an assignment statement or make a good guess as to the names of library functions you have never seen before.
Programmers are used to working with files in almost every other programming language. The design of files was derived from paper forms; they are very physical and very dependent on the host programming language. A COBOL file could not easily be read by a FORTRAN program, and vice versa. In fact, it was hard to share files even among programs written in the same programming language!
The most primitive form of a file is a sequence of records, ordered within the file and referenced by physical position. You open a file, then read a first record, followed by a series of next records until you come to the last record to raise the end-of-file condition. You navigate among these records and perform actions one record at a time. The actions you take on one file have no effect on other files that are not in the same program. Only programs can change files.
The model for SQL is data kept in sets, not in physical files. The “unit of work” in SQL is the whole schema, not individual tables.
Sets are those mathematical abstractions you studied in school. Sets are not ordered, and the members of a set are all of the same type. When you perform an operation on a set, the action happens “all at once” to the entire membership of the set. That is, if I ask for the subset of odd numbers from the set of positive integers, I get all of them back as a single set. I do not build the set of odd numbers by sequentially inspecting one element at a time. I define odd numbers with a rule “If the remainder is 1 when you divide the number by 2, it is odd” that could test any integer and classify it. Parallel processing is one of many, many advantages of having a set-oriented model.
SQL is not a perfect set language any more than FORTRAN is a perfect algebraic language, as we will see. But if you are in doubt about something in SQL, ask yourself how you would specify it in terms of sets, and you will probably get the right answer.

1.1.1 CREATE SCHEMA Statement

A CREATE SCHEMA statement, defined in the SQL Standard, brings an entire schema into existence all at once. In practice, each product has very different utility programs to allocate physical storage and define a schema. Much of the proprietary syntax is concerned with physical storage allocations.
A schema must have a name and a default character set, usually ASCII or a simple Latin alphabet as defined in the ISO Standards. There is an optional AUTHORIZATION clause that holds a <schema authorization identifier> for security. After that the schema is a list of schema elements:
A schema is the skeleton of an SQL database; it defines the structures of the schema objects and the rules under which they operate. The data is the meat on that skeleton.
The only data structure in SQL is the table. Tables can be persistent (base tables), used for working storage (temporary tables), or virtual (VIEWs, common table expressions, and derived tables). The differences among these types are in implementation, not performance. One advantage of having only one data structure is that the results of all operations are also tables, you never have to convert structures, write special operators, or deal with any irregularity in the language.
The <grant statement> has to do with limiting user access to certain schema elements. The <assertion definition> is not widely implemented yet, but it works as a constraint that applies to the schema as awhole. Finally, the <character set definition>, <collation definition>, and <translation definition> deal with the display of data. We are not really concerned with any of these schema objects; they are usually set in place by the DBA (database administrator) for the users, and we mere programmers do not get to change them.
Conceptually, a table is a set of zero or more rows, and a row is a set of one or more columns. Each column has a specific data type and constraints that make up an implementation of an abstract domain. The way a table is physically implemented does not matter, because you only access it with SQL. The database engine handles all the details for you and you never worry about the internals, as you would with a physical file.
In fact, almost no two SQL products use the same internal structures. SQL Server uses physically contiguous storage accessed by two kinds of indexes; Teradata uses hashing; Nucleus (SAND Technology) uses compressed bit vector; Informix and CA-Ingres use more than a dozen different kinds of indexes.
There are two common conceptual errors made by programmers who are accustomed to file systems or PCs. The first is thinking that a table is a file; the second is thinking that a table is a spreadsheet. Tables do not behave like either, and you will get surprises if you do not understand the basic concepts.
It is easy to imagine that a table is a file, a row is a record, and a column is a field. This concept is familiar, and when data moves from SQL to the host language, it must be converted into host language data types and data structures to be displayed and used.
The big differences between working with a file system and working with SQL are in the way SQL fits into a host program. If you are using a file system, your programs must open and close files individually. In SQL, the whole schema is connected to or disconnected from the program as a single unit. The host program might not be authorized to see or manipulate all of the tables and other schema objects, but that is established as part of the connection.
Th...

Table of contents

  1. Cover
  2. Title Page
  3. Copyright
  4. Dedication
  5. Introduction to the Third Edition
  6. Table of Contents
  7. Chapter 1: Database Design
  8. Chapter 2: Normalization
  9. Chapter 3: Numeric Data in SQL
  10. Chapter 4: Temporal Data Types in SQL
  11. Chapter 5: Character Data Types in SQL
  12. Chapter 6: NULLs
  13. Chapter 7: Multiple Column Data Elements
  14. Chapter 8: Table Operations
  15. Chapter 9: Comparison or Theta Operators
  16. Chapter 10: Valued Predicates
  17. Chapter 11: CASE Expressions
  18. Chapter 12: LIKE Predicate
  19. Chapter 13: Between and Overlaps Predicates
  20. Chapter 14: The [NOT] IN() predicate
  21. Chapter 15: EXISTS() predicate
  22. Chapter 16: Quantified Subquery Predicates
  23. Chapter 17: The SELECT Statement
  24. Chapter 18: VIEWs, Derived Tables, Materialized Tables, and Temporary Tables
  25. Chapter 19: Partitioning Data in Queries
  26. Chapter 20: Grouping Operations
  27. Chapter 21: Aggregate Functions
  28. Chapter 22: Auxiliary Tables
  29. Chapter 23: Statistics in SQL
  30. Chapter 24: Regions, Runs, Gaps, Sequences, and Series
  31. Chapter 25: Arrays in SQL
  32. Chapter 26: Set Operations
  33. Chapter 27: Subsets
  34. Chapter 28: Trees and Hierarchies in SQL
  35. Chapter 29: Temporal Queries
  36. Chapter 30: Graphs in SQL
  37. Chapter 31: OLAP in SQL
  38. Chapter 32: Transactions and Concurrency Control
  39. Chapter 33: Optimizing SQL
  40. References
  41. Index
  42. About the Author