MSDN Documentation

Understanding SQL Syntax

This document provides a comprehensive overview of the fundamental syntax concepts used in SQL (Structured Query Language). SQL is the standard language for managing and manipulating relational databases. Understanding its syntax is crucial for anyone working with databases, from developers and administrators to analysts.

We will cover the core building blocks of SQL statements, including data types, operators, functions, and common clauses, along with best practices for writing clear and efficient queries.

Basic Statement Structure

SQL is declarative: you specify what result you want, and the database system determines how to produce it. The statement you will write most often is the query (SELECT), which is usually grouped with the Data Manipulation Language (DML) statements INSERT, UPDATE, and DELETE.

A typical SQL query uses the following clauses, written in this order:

  • SELECT: Specifies the columns to retrieve.
  • FROM: Specifies the table(s) to retrieve data from.
  • WHERE: Filters records based on a condition.
  • GROUP BY: Groups rows that have the same values in specified columns.
  • HAVING: Filters groups based on a condition.
  • ORDER BY: Sorts the result set.

Example of a simple query:

SELECT CustomerName, City
FROM Customers
WHERE Country = 'USA'
ORDER BY CustomerName;
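
To see the query above run end to end, the following sketch executes it from Python against an in-memory SQLite database. The rows in Customers are invented sample data, and SQLite is only a stand-in for whichever database system you use, so treat this as an illustration rather than a reference setup.

```python
import sqlite3

# Hypothetical sample data; the table contents are assumptions for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customers (CustomerName TEXT, City TEXT, Country TEXT)")
conn.executemany(
    "INSERT INTO Customers VALUES (?, ?, ?)",
    [
        ("Zenith Traders", "Seattle", "USA"),
        ("Alpine Goods", "Denver", "USA"),
        ("Nordwind", "Berlin", "Germany"),
    ],
)

# The query from the example above, unchanged.
simple_query = """
    SELECT CustomerName, City
    FROM Customers
    WHERE Country = 'USA'
    ORDER BY CustomerName;
"""
usa_customers = conn.execute(simple_query).fetchall()
print(usa_customers)
conn.close()
```

WHERE removes the German customer before ORDER BY sorts the remaining rows by name, so Alpine Goods is listed before Zenith Traders.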

SQL Data Types

Data types define the kind of data that can be stored in a column and how it is interpreted. Common SQL data types include:

Type Category    Common Types                       Description
Numeric          INT, DECIMAL(p,s), FLOAT           Whole numbers, fixed-point numbers, floating-point numbers.
String/Text      VARCHAR(n), CHAR(n), TEXT          Variable-length strings, fixed-length strings, long text.
Date and Time    DATE, TIME, DATETIME, TIMESTAMP    Date values, time values, date and time values.
Boolean          BIT, BOOLEAN                       True/False values.
Binary           BLOB, BINARY                       Binary large objects, fixed-length binary data.

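The names and availability of data types vary between SQL database systems. For example, SQL Server provides BIT rather than BOOLEAN and stores large binary values as VARBINARY(MAX) rather than BLOB, while PostgreSQL uses TIMESTAMP rather than DATETIME and BYTEA for binary data. Check the documentation for your system before relying on a specific type name.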
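
As one illustration of how systems differ, the sketch below declares a table with generic type names and asks SQLite how it actually stores each value. The Products table and its columns are hypothetical, and typeof() is specific to SQLite; other systems enforce declared types more strictly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table declared with generic type names.
conn.execute(
    """
    CREATE TABLE Products (
        ProductID   INT,
        Price       DECIMAL(10, 2),
        ProductName VARCHAR(50),
        AddedOn     DATE,
        InStock     BOOLEAN
    )
    """
)
conn.execute(
    "INSERT INTO Products VALUES (?, ?, ?, ?, ?)",
    (1, 19.99, "Widget", "2023-10-27", 1),
)

# typeof() is a SQLite function that reports how each value is actually stored.
storage_classes = conn.execute(
    """
    SELECT typeof(ProductID), typeof(Price), typeof(ProductName),
           typeof(AddedOn), typeof(InStock)
    FROM Products
    """
).fetchone()
print(storage_classes)
conn.close()
```

SQLite accepts the DATE and BOOLEAN declarations but stores the date as text and the Boolean as an integer, which is why portable code should not assume that a type name behaves the same everywhere.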

SQL Operators

Operators are symbols or keywords that perform operations on one or more expressions. They are used in SQL statements, especially in the WHERE clause.

Arithmetic Operators

Operator      Description
+, -, *, /    Addition, subtraction, multiplication, division
%             Modulo (remainder of division)

Comparison Operators

Operator    Description
=           Equal to
<>, !=      Not equal to (<> is the standard form; != is widely supported)
>, <        Greater than, less than
>=, <=      Greater than or equal to, less than or equal to
BETWEEN     Checks if a value is within a range, inclusive of both endpoints.
LIKE        Matches a string against a pattern; % matches any sequence of characters and _ matches a single character.
IN          Checks if a value matches any value in a list.

Logical Operators

Operator    Description
AND         Combines two conditions; both must be true.
OR          Combines two conditions; at least one must be true.
NOT         Reverses the result of a condition.

Example using operators:

SELECT ProductName, Price
FROM Products
WHERE Price BETWEEN 50 AND 100
  AND Category LIKE 'Elect%';
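
The sketch below runs the operator example against invented rows chosen to sit on and around the boundaries, using Python's sqlite3 module. The sample products are assumptions, and an ORDER BY is added only so the output order is predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample data for the Products table used in the example above.
conn.execute("CREATE TABLE Products (ProductName TEXT, Price REAL, Category TEXT)")
conn.executemany(
    "INSERT INTO Products VALUES (?, ?, ?)",
    [
        ("Headphones", 50.0, "Electronics"),  # on the lower bound
        ("Fuse Box", 60.0, "Electrical"),     # also matches 'Elect%'
        ("Desk Lamp", 75.0, "Furniture"),     # price in range, category does not match
        ("Monitor", 100.0, "Electronics"),    # on the upper bound
        ("Laptop", 900.0, "Electronics"),     # category matches, price out of range
    ],
)

matching_products = conn.execute(
    """
    SELECT ProductName, Price
    FROM Products
    WHERE Price BETWEEN 50 AND 100
      AND Category LIKE 'Elect%'
    ORDER BY Price  -- added here only to make the row order predictable
    """
).fetchall()
print(matching_products)

# Arithmetic: in SQLite, dividing two integers yields an integer.
arithmetic = conn.execute("SELECT 7 % 3, 7 / 2, 7 / 2.0").fetchone()
print(arithmetic)
conn.close()
```

Both boundary prices (50 and 100) are returned, which shows that BETWEEN is inclusive. The arithmetic row shows that 7 / 2 evaluates to 3 in SQLite while 7 / 2.0 evaluates to 3.5; integer division rules are another detail that differs between systems.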

Built-in Functions

SQL provides a rich set of built-in functions to perform calculations, manipulate strings, handle dates, and more. These functions can be used in SELECT lists, WHERE clauses, and other parts of SQL statements.

Common Function Categories:

  • Aggregate Functions: Perform a calculation on a set of values and return a single value. Examples: COUNT(), SUM(), AVG(), MIN(), MAX().
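  • Scalar Functions: Operate on individual values and return a single value for each row. Examples: UPPER(), LOWER(), LENGTH() (LEN() in SQL Server), SUBSTRING().
  • String Functions: Scalar functions that manipulate text data. Examples: CONCAT(), REPLACE(), TRIM().
  • Date Functions: Scalar functions that work with date and time values; their names vary by system. Examples: GETDATE() (SQL Server) or NOW() (MySQL, PostgreSQL) for the current date and time, DATE_ADD() (MySQL) or DATEADD() (SQL Server), DATEDIFF(), FORMAT().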

Example using functions (MySQL date syntax; SQL Server would use DATEADD(year, -1, GETDATE()) for a comparable cutoff):

SELECT COUNT(*) AS TotalOrders, SUM(Amount) AS TotalRevenue
FROM Orders
WHERE OrderDate >= DATE_SUB(CURDATE(), INTERVAL 1 YEAR);
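
Because date functions are among the least portable parts of SQL, here is a hedged sketch of the same aggregate in SQLite, run from Python. SQLite has neither DATE_SUB nor CURDATE, so date(x, '-1 year') is substituted, and a fixed reference date (passed as the hypothetical :today parameter) replaces the current date to keep the result reproducible. The Orders rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Orders (OrderID INTEGER, Amount REAL, OrderDate TEXT)")
conn.executemany(
    "INSERT INTO Orders VALUES (?, ?, ?)",
    [
        (1, 100.0, "2023-01-15"),
        (2, 250.0, "2023-06-01"),
        (3, 75.5, "2021-12-31"),  # more than one year before the reference date
    ],
)

# Aggregate functions collapse the filtered rows into a single result row.
totals = conn.execute(
    """
    SELECT COUNT(*) AS TotalOrders, SUM(Amount) AS TotalRevenue
    FROM Orders
    WHERE OrderDate >= date(:today, '-1 year')
    """,
    {"today": "2023-10-27"},
).fetchone()
print(totals)

# Scalar functions return one value per input value.
scalars = conn.execute(
    "SELECT UPPER('sql'), LENGTH('syntax'), SUBSTR('database', 1, 4)"
).fetchone()
print(scalars)
conn.close()
```

Only the two orders from 2023 fall inside the one-year window, so the aggregate row reports 2 orders and 350.0 in revenue.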

Common SQL Clauses

Clauses and set operators shape the structure of SQL statements. The fundamental clauses were introduced above; the following constructs are also frequently used:

  • JOIN Clauses: Used to combine rows from two or more tables based on a related column between them. Types include INNER JOIN, LEFT JOIN, RIGHT JOIN, FULL OUTER JOIN.
  • UNION: Combines the result sets of two or more SELECT statements into a single result set and removes duplicate rows. Each SELECT must return the same number of columns with compatible data types.
  • UNION ALL: Like UNION, but keeps all rows, including duplicates.
  • DISTINCT: Used in a SELECT statement to return only unique values.

Example of a JOIN clause:

SELECT o.OrderID, c.CustomerName
FROM Orders o
INNER JOIN Customers c ON o.CustomerID = c.CustomerID;
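
The difference between join types, and between UNION and UNION ALL, is easiest to see on a small data set. The sketch below uses Python's sqlite3 module with invented rows in which one customer has no orders; the table contents are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customers (CustomerID INTEGER, CustomerName TEXT)")
conn.execute("CREATE TABLE Orders (OrderID INTEGER, CustomerID INTEGER)")
conn.executemany(
    "INSERT INTO Customers VALUES (?, ?)",
    [(1, "Alpine Goods"), (2, "Nordwind"), (3, "Zenith Traders")],
)
conn.executemany("INSERT INTO Orders VALUES (?, ?)", [(10, 1), (11, 1), (12, 2)])

# INNER JOIN keeps only rows that have a match in both tables.
inner_rows = conn.execute(
    """
    SELECT o.OrderID, c.CustomerName
    FROM Orders o
    INNER JOIN Customers c ON o.CustomerID = c.CustomerID
    ORDER BY o.OrderID
    """
).fetchall()

# LEFT JOIN keeps every customer; those without orders get NULL order columns.
customers_without_orders = conn.execute(
    """
    SELECT c.CustomerName
    FROM Customers c
    LEFT JOIN Orders o ON o.CustomerID = c.CustomerID
    WHERE o.OrderID IS NULL
    """
).fetchall()

# UNION removes duplicate rows; UNION ALL keeps them.
union_sql = """
    SELECT CustomerID FROM Customers
    {op}
    SELECT CustomerID FROM Orders
    ORDER BY CustomerID
"""
union_ids = [row[0] for row in conn.execute(union_sql.format(op="UNION"))]
union_all_ids = [row[0] for row in conn.execute(union_sql.format(op="UNION ALL"))]
conn.close()

print(inner_rows)
print(customers_without_orders)
print(union_ids, union_all_ids)
```

Zenith Traders never appears in the inner join result because it has no orders, but the left join surfaces it through the IS NULL filter.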

Identifiers and Literals

Understanding how to refer to database objects and literal values is key.

Identifiers:

Identifiers are names of database objects such as tables, columns, views, indexes, stored procedures, and functions. They typically follow specific naming rules:

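  • Must start with a letter (some systems also allow a leading underscore).
  • Can contain letters, numbers, and underscores (_).
  • Should avoid using SQL reserved keywords.
  • Most database systems support quoted identifiers to allow spaces, special characters, or reserved words: the SQL standard uses double quotes ("My Table"), SQL Server also accepts square brackets ([My Column]), and MySQL uses backticks (`My Column`) by default. Quoted identifiers are generally discouraged because every reference to the object must then be quoted.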

Literals:

Literals are fixed values of a specific data type:

  • String Literals: Enclosed in single quotes ('), e.g., 'Hello World'. A single quote inside the string is written as two single quotes, e.g., 'O''Brien'.
  • Numeric Literals: Numbers without quotes, e.g., 123, 45.67.
  • Date/Time Literals: Format varies by database, but often enclosed in single quotes, e.g., '2023-10-27', '2023-10-27 10:30:00'.
  • Boolean Literals: TRUE, FALSE in systems with a Boolean type; SQL Server's BIT type uses 1 and 0 instead.
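
The quoting rules above can be sketched with a quoted identifier and a few literals. The example runs in SQLite through Python's sqlite3 module; the "Order Details" table and its columns are hypothetical names chosen to show identifiers that contain spaces.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Double-quoted identifiers allow spaces; these names are hypothetical.
conn.execute(
    """CREATE TABLE "Order Details" ("Unit Price" REAL, Customer TEXT, Placed TEXT)"""
)

# String and date literals use single quotes; a quote inside a string is doubled.
# The numeric literal 12.5 is written without quotes.
conn.execute(
    """
    INSERT INTO "Order Details" ("Unit Price", Customer, Placed)
    VALUES (12.5, 'O''Brien', '2023-10-27')
    """
)

detail_row = conn.execute(
    """SELECT Customer, "Unit Price", Placed FROM "Order Details";"""
).fetchone()
print(detail_row)
conn.close()
```

Note how every reference to the table and to the "Unit Price" column has to repeat the quotes, which is the main reason plain identifiers such as OrderDetails and UnitPrice are usually preferred.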

SQL Comments

Comments are used to explain SQL code. They are ignored by the database engine.

  • Single-line comments: Start with two hyphens (--).
  • Multi-line comments: Enclosed between /* and */.

Example using comments:

-- This is a single-line comment.
SELECT CustomerName, City
FROM Customers; /* This is a
                multi-line comment. */

Best Practices for SQL Syntax

Tip: Readability Counts

Write SQL that is easy to read and understand. Use consistent formatting, indentation, and capitalization. Use meaningful names for aliases.

  • Use consistent indentation: Align clauses and sub-clauses.
  • Capitalize keywords: Make keywords like SELECT, FROM, WHERE stand out.
  • Use meaningful aliases: For tables and columns, especially in joins.
  • Avoid ambiguous column references: Prefix column names with table names or aliases when querying multiple tables.
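  • Be mindful of SELECT *: In production code, explicitly list the columns you need so that results do not change when the table schema changes and unneeded data is not transferred.
  • Use WHERE clauses effectively: Filter rows in WHERE before they are grouped, and reserve HAVING for conditions on aggregated values.
  • Consider performance implications: Complex queries, or filters and joins on unindexed columns, can be slow; review the execution plan when a query underperforms.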
  • Use comments liberally: Explain complex logic or non-obvious code.

A well-formatted query:

SELECT
    c.CustomerID,
    c.CustomerName,
    COUNT(o.OrderID) AS NumberOfOrders
FROM
    Customers AS c
JOIN
    Orders AS o ON c.CustomerID = o.CustomerID
WHERE
    c.Country = 'Germany'
GROUP BY
    c.CustomerID, c.CustomerName
HAVING
    COUNT(o.OrderID) > 5
ORDER BY
    NumberOfOrders DESC;
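
To check how the clauses in the query above interact, the sketch below runs it against invented data using Python's sqlite3 module. The customer names and order counts are assumptions, chosen so that one row fails the WHERE filter and another fails the HAVING filter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY, CustomerName TEXT, Country TEXT)"
)
conn.execute("CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY, CustomerID INTEGER)")

# Hypothetical customers paired with the number of orders to generate for each.
sample = [
    (1, "Nordwind", "Germany", 6),
    (2, "Rheinwerk", "Germany", 7),
    (3, "Berliner Kontor", "Germany", 5),  # exactly 5 orders: fails COUNT > 5
    (4, "Alpine Goods", "USA", 8),         # enough orders, but filtered out by WHERE
]
for customer_id, name, country, order_count in sample:
    conn.execute("INSERT INTO Customers VALUES (?, ?, ?)", (customer_id, name, country))
    conn.executemany(
        "INSERT INTO Orders (CustomerID) VALUES (?)",
        [(customer_id,)] * order_count,
    )

frequent_german_customers = conn.execute(
    """
    SELECT
        c.CustomerID,
        c.CustomerName,
        COUNT(o.OrderID) AS NumberOfOrders
    FROM
        Customers AS c
    JOIN
        Orders AS o ON c.CustomerID = o.CustomerID
    WHERE
        c.Country = 'Germany'
    GROUP BY
        c.CustomerID, c.CustomerName
    HAVING
        COUNT(o.OrderID) > 5
    ORDER BY
        NumberOfOrders DESC;
    """
).fetchall()
print(frequent_german_customers)
conn.close()
```

WHERE discards the USA customer before grouping, HAVING then discards the German customer with exactly five orders, and ORDER BY places Rheinwerk (7 orders) ahead of Nordwind (6 orders).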