1. SQL Features
This chapter discusses a selection of fairly independent SQL features that are of importance for the tuning process, many of which are somewhat poorly advertised. Ill begin with a quick review of just what SQL statements are and the identifiers used to refer to them. My second topic is the array interface that is used to move data from client processes to the database server in large batches. I will then discuss factored subqueries that make reading SQL statements much easier. My fourth and final topic in this first chapter is a review of the different types of inner and outer joins; I will explain how to write them, what they are used for, and why it isnt quite as easy to reorder outer joins as it is to reorder inner joins.
SQL and Declarative Programming Languages
Programs written in a declarative programming language describe what computation should be performed but not how to compute it. SQL is considered a declarative programming language. Compare SQL with imperative programming languages like C, Visual Basic, or even PL/SQL that specify each step of the computation.
This sounds like great news. You write the SQL any way you want and, providing it is semantically correct, somebody or something else will find the optimal way to run it. That something else in our case is the cost-based optimizer (CBO) and in most cases it does a pretty good job. However, despite the theory, there is a strong implication of an algorithm in many SQL statements. Listing 1-1 using the HR example schema is one such example.
Listing 1-1.Subqueries in the SELECT list
SELECT first_name
,last_name
, (SELECT first_name
FROM hr.employees m
WHERE m.employee_id = e.manager_id)
AS manager_first_name
, (SELECT last_name
FROM hr.employees m
WHERE m.employee_id = e.manager_id)
AS manager_last_name
FROM hr.employees e
WHERE manager_id IS NOT NULL
ORDER BY last_name, first_name;
What this statement says is: Obtain the first and last names of each employee with a manager and in each case look up the managers first and last names. Order the resulting rows by employees last and first names . Listing 1-2 appears to be a completely different statement.
Listing 1-2.Use of a join instead of a SELECT list
SELECT e.first_name
,e.last_name
,m.first_name AS manager_first_name
,m.last_name AS manager_last_name
FROM hr.employees e, hr.employees m
WHERE m.employee_id = e.manager_id
ORDER BY last_name, first_name;
This statement says: Perform a self-join on HR.EMPLOYEES keeping only rows where the EMPLOYEE_ID from the first copy matches the MANAGER_ID from the second copy. Pick the names of the employee and the manager and order the results . Despite the apparent difference between Listing 1-1 and Listing 1-2, they both produce identical results. In fact, because EMPLOYEE_ID is the primary key of EMPLOYEES and there is a referential integrity constraint from MANAGER_ID to EMPLOYEE_ID , they are semantically equivalent.
In an ideal world, the CBO would work all this out and execute both statements the same way. In fact, as of Oracle Database 12c, these statements are executed in entirely different ways. Although the CBO is improving from release to release, there will always be some onus on the author of SQL statements to write them in a way that helps the CBO find a well-performing execution plan, or at the very least avoid a completely awful one.
Statements and SQL_IDs
Oracle Database identifies each SQL statement by something referred to as an SQL_ID. Many of the views you use when analyzing SQL performance, such as V$ACTIVE_SESSION_HISTORY , pertain to a specific SQL statement identified by an SQL_ID. It is important that you understand what these SQL_IDs are and how to cross-reference an SQL_ID with the actual text of the SQL statement.
An SQL_ID is a base 32 number represented as a string of 13 characters, each of which may be a digit or one of 22 lowercase letters. An example might be ddzxfryd0uq9t. The letters e, i, l, and o are not used presumably to limit the risk of transcription errors. The SQL_ID is actually a hash generated from the characters in the SQL statement. So assuming that case and whitespace are preserved, the same SQL statement will have the same SQL_ID on any database on which it is used.
Normally the two statements in Listing 1-3 will be considered different.
Listing 1-3.Statements involving literals
SELECT 'LITERAL 1' FROM DUAL;
SELECT 'LITERAL 2' FROM DUAL;
The first statement has an SQL_ID of 3uzuap6svwz7u and the second an SQL_ID of 7ya3fww7bfn89.
Any SQL statement issued inside a PL/SQL block also has an SQL_ID. Such statements may use PL/SQL variables or parameters, but changing the values of variables does not change the SQL_ID. Listing 1-4 shows a similar query to those in Listing 1-3 except it is issued from within a PL/SQL block.
Listing 1-4.A SELECT statement issued from PL/SQL
SET SERVEROUT ON
DECLARE
PROCEDURE check_sql_id (p_literal VARCHAR2)
IS
dummy_variable VARCHAR2 (100);
sql_id v$session.sql_id%TYPE;
BEGIN
SELECT p_literal INTO dummy_variable FROM DUAL ;
SELECT prev_sql_id
INTO sql_id
FROM v$session
WHERE sid = SYS_CONTEXT ('USERENV', 'SID');
DBMS_OUTPUT.put_line (sql_id);
END check_sql_id;
BEGIN
check_sql_id ('LITERAL 1');
check_sql_id ('LITERAL 2');
END;
/
d8jhv8fcm27kd
d8jhv8fcm27kd
PL/SQL procedure successfully completed.
This anonymous block includes two calls to a nested procedure that takes a VARCHAR2 string as a parameter. The procedure calls a SELECT statement and then obtains the SQL_ID of that statement from the PREV_SQL_ID column of V$SESSION and outputs it. The procedure is called with the same two literals as were used in Listing 1-3. However, the output shows that the same SQL_ID, d8jhv8fcm27kd, was used in both cases. In fact, PL/SQL modifies the SELECT statement slightly before submitting it to the SQL engine. Listing 1-5 shows the underlying SQL statement after the PL/SQL specific INTO clause has been removed.
Listing 1-5.An SQL statement with a bind variable
SELECT :B1 FROM DUAL
The :B 1 bit is what is known as a bind variable , and it is used in PL/SQL whenever a variable or parameter is used. Bind variables are also used when SQL is invoked from other programming languages. This bind variable is just a placeholder for an actual value, and it indicates that the same statement can be reused with different values supplied for the placeholder. I will explain the importance of this as I go on.
Cross-Referencing Statement and SQL_ID
If you have access to the SYS account of a database running 11.2 or later, you can use the approach in Listing 1-6 to identify the SQL_ID of a statement.
Listing 1-6.Using DBMS_SQLTUNE_UTIL0 to determine the SQL_ID of a statement
SELECT sys.dbms_sqltune_util0.sqltext_to_sqlid (
q'[SELECT 'LITERAL 1' FROM DUAL]' || CHR (0))
FROM DUAL;
The result of the query in Listing 1-6 is 3uzuap6svwz7u, the SQL_ID of the first statement in Listing 1-3. There are a few observations that can be made about Listing 1-6: