INTRODUCTION

Developers across the world face database issues daily.
While they are immersed in procedural languages with loops, an RDBMS forces them to think in terms of sets, without loops. It takes a transition. It takes training. It takes experience. Developers are also exposed to Excel worksheets, or spreadsheets as they were called in the not-so-distant past. So if you know worksheets, how hard can databases be? After all, worksheets look pretty much like database tables. The big difference is the connections among well-designed tables.
A database is a set of connected tables that represent entities in the real world. A database can have 100 connected tables or 3000. The connection is very simple: row A in table Alpha has data affiliated with row B in table Beta. But even with 00 tables and 300 connections (FOREIGN KEY references), it takes a good amount of time to become familiar with a database to the point of acceptable working knowledge.

"The Cemetery of Computer Languages" is expanding. You can see tombstones like PL/1, Forth, Ada, Pascal, LISP, RPG, APL, SNOBOL, JOVIAL, Algol, and the list goes on.
For some, the future is in question: PowerBuilder, ColdFusion, FORTRAN and COBOL. SQL, on the other hand, is running strong after 3 decades of glorious existence. What is the difference? The basic difference is that SQL can handle large datasets in a consistent manner based on mathematical foundations. You can throw together a computer language easily: assignment statements, looping, if-then conditionals, 300 library functions, and voila! Here is the new language: Mars/1, named after the red planet to be fashionable with NASA's new Mars robot. But can Mars/1 JOIN a table of 1 million rows with a table of 10 million rows in a second? The success of the SQL language is so compelling that other technologies are tacked on to it, like XML/XQuery, which deals with semi-structured information objects. In SQL you are thinking at a high level.
In C# or Java, you are dealing with details, lots of them. That is the big difference. Why is so much of the book dedicated to database design? Why not plunge into SQL coding, so that sooner or later the developer gets the hang of the design? Because high-level thinking requires thinking at the database design level. A farmer has 6 mules; how do we model that in the database? We design the Farmer and FarmAnimal tables, then connect them with a FarmerID FOREIGN KEY in FarmAnimal referencing the FarmerID PRIMARY KEY in the Farmer table. What is the big deal? It looks so simple. In fact, how about just calling the tables Table1 and Table2 to be more generic? Ouch... meaningful naming is the very basis of good database design.
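The farmer example above can be sketched in T-SQL roughly as follows. This is only an illustration, not a definitive design: the Farmer and FarmAnimal table names and the FarmerID key relationship come from the text, while the other columns (FarmerName, FarmAnimalID, Species), the data types and the sample rows are assumptions.

```sql
-- Parent table: one row per farmer.
CREATE TABLE Farmer (
    FarmerID   INT          NOT NULL PRIMARY KEY,
    FarmerName NVARCHAR(50) NOT NULL                  -- assumed column
);

-- Child table: one row per animal, tied to its owner by FarmerID.
CREATE TABLE FarmAnimal (
    FarmAnimalID INT          NOT NULL PRIMARY KEY,   -- assumed column
    FarmerID     INT          NOT NULL
        FOREIGN KEY REFERENCES Farmer (FarmerID),
    Species      NVARCHAR(30) NOT NULL                -- assumed column
);

-- Sample data (made up for illustration).
INSERT INTO Farmer (FarmerID, FarmerName) VALUES (1, N'Smith');

-- Six mules, all owned by farmer 1.
INSERT INTO FarmAnimal (FarmAnimalID, FarmerID, Species)
VALUES (1, 1, N'Mule'), (2, 1, N'Mule'), (3, 1, N'Mule'),
       (4, 1, N'Mule'), (5, 1, N'Mule'), (6, 1, N'Mule');

-- How many mules does each farmer have?
SELECT f.FarmerName, COUNT(*) AS MuleCount
FROM Farmer AS f
    INNER JOIN FarmAnimal AS a ON a.FarmerID = f.FarmerID
WHERE a.Species = N'Mule'
GROUP BY f.FarmerName;
```

With this sample data the final query returns a single row for the one farmer, with a MuleCount of 6. The FOREIGN KEY also means that an attempt to insert an animal for a FarmerID that does not exist in Farmer is rejected, which is exactly the kind of connection that a worksheet does not enforce.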
Relational database design is truly simple for simple, well-understood models. The challenge starts in modeling complex objects such as financial derivative instruments, airplane passenger scheduling or a social network website. When you need to add 5 new tables to a 1000-table database and hook them in (define FOREIGN KEY references) correctly, it is a huge challenge. To begin with, some of the 5 new tables may already be redundant, but you don't know that until you understand what the 1000 tables are really storing. Frequently, learning the application area is the biggest challenge for a developer when starting a new job. The SQL language is simple to program and read, even when touching 10 tables.
Complexities abound, though. The very first one: does the SQL statement touch the right data set? Is it 999 records, 1000 or 998? T-SQL statements are assembled into Transact-SQL scripts and into server-side database objects: stored procedures, user-defined functions and triggers. These can be programs 5 statements or 1000 statements long. The style of Transact-SQL programming is different from the style in procedural programming languages. There are no arrays, only tables or table variables. Typically there is no looping, only set-based operations.
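The set-based style and the "right data set" question above can be illustrated with a table variable. This is a minimal sketch under stated assumptions: the @Product table variable, its columns and its rows are hypothetical and do not come from the text.

```sql
-- A table variable takes the place of an array (hypothetical columns and data).
DECLARE @Product TABLE (
    ProductID INT           NOT NULL PRIMARY KEY,
    Category  VARCHAR(20)   NOT NULL,
    Price     DECIMAL(10,2) NOT NULL
);

INSERT INTO @Product (ProductID, Category, Price)
VALUES (1, 'Bike', 100.00), (2, 'Bike', 250.00), (3, 'Helmet', 40.00);

-- First check how many rows the predicate touches, before changing anything.
SELECT COUNT(*) AS RowsToUpdate
FROM @Product
WHERE Category = 'Bike';

-- One set-based statement updates every qualifying row; no loop is written.
UPDATE @Product
SET Price = Price * 1.10
WHERE Category = 'Bike';

-- @@ROWCOUNT reports how many rows the previous statement affected.
SELECT @@ROWCOUNT AS RowsUpdated;

-- Inspect the result.
SELECT ProductID, Category, Price
FROM @Product
ORDER BY ProductID;
```

The whole script should be run as a single batch, since a table variable only lives for the duration of the batch that declares it. Comparing the count from the preliminary SELECT with @@ROWCOUNT after the UPDATE is one simple way to confirm that a statement touched the number of rows you expected.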