Contents
SQL and Relational Theory
How to Write Accurate SQL Code
THIRD EDITION
C. J. Date
SQL and Relational Theory: How to Write Accurate SQL Code (3rd edition)
by C. J. Date
Copyright 2015 C. J. Date. All rights reserved.
Printed in the United States of America.
Published by OReilly Media, Inc.,
1005 Gravenstein Highway North, Sebastopol, CA 95472
OReilly books may be purchased for educational, business, or sales promotional use. Online editions are also available for most titles (.
Printing History:
January 2009: First Edition.
December 2011: Second Edition.
October 2015: Third Edition.
Revision History:
2015-09-30 First release.
See http://www.oreilly.com/catalog/errata.csp?isbn=0636920046158 for release details.
The OReilly logo is a registered trademark of OReilly Media, Inc. SQL and Relational Theory: How to Write Accurate SQL Code and related trade dress are trademarks of OReilly Media, Inc.
While the publisher and the author have used good faith efforts to ensure that the information and instructions contained in this work are accurate, the publisher and the author disclaim all responsibility for errors or omissions, including without limitation responsibility for damages resulting from the use of or reliance on this work. Use of the information and instructions contained in this work is at your own risk. If any code samples or other technology this work contains or describes is subject to open source licenses or the intellectual property rights of others, it is your responsibility to ensure that your use thereof complies with such licenses and/or rights.
ISBN: 978-1-491-94117-1
[LSI]
Those who are enamored of practice without theory are like a pilot who goes into a ship without rudder or compass and never has any certainty where he is going. Practice should always be based upon a sound knowledge of theory.
Leonardo da Vinci (1452-1519)
The trouble with people is not that they dont know but that they know so much that aint so.
Josh Billings (1818-1885)
Languages die ... mathematical ideas do not.
G. H. Hardy (1877-1947)
Unfortunately, the gap between theory and practice is not as wide in theory as it is in practice.
Anon.
These are my principles.
If you dont like them, I have others.
Groucho Marx (1890-1977)
There is no royal road to geometry.
Euclid (c. 365-275 BCE), attrib.
To all those who think an exercise like this one is worthwhile, and in particular to the memory of Lex de Haan, who is very much missed
About the Author
C. J. Date is an independent author, lecturer, researcher, and consultant, specializing in relational database technology. He is best known for his book An Introduction to Database Systems (8th edition, Addison-Wesley, 2004), which has sold some 900,000 copies at the time of writing and is used by several hundred colleges and universities worldwide. He is also the author of numerous other books on database management, including most recently:
From Ventus: Go Faster! The TransRelational Approach to DBMS Implementation (2002, 2011)
From Addison-Wesley: Databases, Types, and the Relational Model: The Third Manifesto (3rd edition, with Hugh Darwen, 2007)
From Trafford: Logic and Databases: The Roots of Relational Theory (2007) and Database Explorations: Essays on The Third Manifesto and Related Topics (with Hugh Darwen, 2010)
From Apress: Date on Database: Writings 2000-2006 (2007)
From OReilly: Database Design and Relational Theory: Normal Forms and All That Jazz (2012); View Updating and Relational Theory: Solving the View Update Problem (2013); Relational Theory for Computer Professionals: What Relational Databases Are Really All About (2013); and The New Relational Database Dictionary (2015, to appear)
Mr. Date was inducted into the Computing Industry Hall of Fame in 2004. He enjoys a reputation that is second to none for his ability to explain complex technical subjects in a clear and understandable fashion.
Preface to the First Edition
SQL is ubiquitous. But SQL is hard to use: Its complicated, confusing, and error prone (much more so, I venture to suggest, than its apologists would have you believe). In order to have any hope of writing SQL code that you can be sure is accurate, thereforemeaning code that does exactly what its supposed to do, no more and no lessyou must follow some appropriate discipline. And its the thesis of this book that using SQL relationally is the discipline you need. But what does this mean? Isnt SQL relational anyway?
Well, its true that SQL is the standard language for use with relational databasesbut that fact in itself doesnt make it relational. The sad truth is, SQL departs from relational theory in all too many ways; duplicate rows and nulls are two obvious examples, but theyre not the only ones. As a consequence, the language gives you rope to hang yourself with, as it were. So if you dont want to hang yourself, you need to understand relational theory (what it is and why); you need to know about SQLs departures from that theory; and you need to know how to avoid the problems they can cause. In a word, you need to use SQL relationally. Then you can behave as if SQL truly were relational, more or less, and you can enjoy the benefits of working with what is in effect a truly relational system.
Now, a book like this wouldnt be needed if everyone was using SQL relationally alreadybut they arent. On the contrary, I observe much bad practice in current SQL usage. I even observe such practice being recommended, in textbooks and similar publications, by writers who really ought to know better (no names, no pack drill); in fact, a review of the literature in this regard is a pretty dispiriting exercise. The relational model first saw the light of day in 1969, and yet here we are, over 45 years later, and it still doesnt seem to be very well understood by the database community at large. Partly for such reasons, this book uses the relational model itself as an organizing principle; it explains various features of the model in depth, and shows in every case how best to use SQL in order to comply with the feature in question.
Prerequisites
I assume youre a database practitioner and therefore reasonably familiar with SQL already. To be specific, I assume you have a working knowledge of either the SQL standard or (perhaps more likely in practice) at least one SQL product. However, I dont assume you have a deep knowledge of relational theory as suchthough I do hope you understand that the relational model is a good thing in general, and adherence to it wherever possible is a desirable goal. In order to avoid misunderstandings, therefore, Ill be describing various features of the relational model in detail, as well as showing how to use SQL to conform to those features. But what I wont do is attempt to justify all of those features; rather, Ill assume youre sufficiently experienced in database matters to understand why, e.g., the notion of a key makes sense, or why you sometimes need to do a join, or why many to many relationships need to be supported. (If I were to include such justifications, this would be a very different bookquite apart from anything else, it would be much bigger than it already isand in any case, that book has already been written.)