SQL Subqueries

Microsoft Developer Network Documentation

Subqueries, also known as inner queries or nested queries, are SQL queries embedded within another SQL query. They can be used in various parts of a SQL statement, including the SELECT list, FROM clause, WHERE clause, and HAVING clause. Subqueries are powerful tools for performing complex data retrieval and manipulation.

Types of Subqueries

Subqueries can be classified based on the number of rows and columns they return:

Subqueries in the WHERE Clause

This is one of the most common uses of subqueries. They are used to filter rows based on the result of another query.

Using with Comparison Operators

Subqueries can be used with operators like =, !=, >, <, >=, and <= when the subquery returns a single value (scalar subquery).

Example: Find employees earning more than the average salary

SELECT FirstName, LastName, Salary
FROM Employees
WHERE Salary > (SELECT AVG(Salary) FROM Employees);

Using with IN and NOT IN

These operators are used when the subquery can return multiple values. They check if a value exists or does not exist within the set of values returned by the subquery.

Example: Find customers who have placed orders

SELECT CustomerID, CustomerName
FROM Customers
WHERE CustomerID IN (SELECT DISTINCT CustomerID FROM Orders);

Using with EXISTS and NOT EXISTS

These operators test for the existence of rows returned by the subquery. They are often more efficient than IN for large result sets.

Example: Find customers who have NOT placed any orders

SELECT CustomerID, CustomerName
FROM Customers c
WHERE NOT EXISTS (SELECT 1 FROM Orders o WHERE o.CustomerID = c.CustomerID);

Using with ANY, SOME, and ALL

These keywords are used with comparison operators to compare a value with any, some, or all values in a list returned by a subquery.

Example: Find products whose price is greater than all prices in the 'Electronics' category

SELECT ProductName, Price
FROM Products
WHERE Price > ALL (
    SELECT Price
    FROM Products
    WHERE Category = 'Electronics'
);
Note: Using ANY is equivalent to SOME.

Subqueries in the SELECT List

When used in the SELECT list, a subquery must be a scalar subquery, returning at most one row and one column. It's typically used to retrieve related information for each row of the outer query.

Example: Display each product and the number of orders it has

SELECT
    ProductName,
    (SELECT COUNT(*)
     FROM OrderDetails od
     WHERE od.ProductID = p.ProductID) AS OrderCount
FROM Products p;

Subqueries in the FROM Clause (Derived Tables)

A subquery in the FROM clause is called a derived table. It's treated as a temporary table that the outer query can query from. Derived tables must be aliased.

Example: Find the average order total per customer

SELECT CustomerID, AVG(OrderTotal) AS AverageOrderValue
FROM (
    SELECT CustomerID, OrderID, SUM(Quantity * UnitPrice) AS OrderTotal
    FROM OrderDetails
    GROUP BY CustomerID, OrderID
) AS CustomerOrderTotals
GROUP BY CustomerID;

Correlated Subqueries

A correlated subquery is a subquery that references columns from the outer query. The subquery is executed once for each row processed by the outer query. This can sometimes lead to performance issues if not used carefully.

Example: Find employees whose salary is greater than the average salary in their department

SELECT e1.FirstName, e1.LastName, e1.Salary, e1.DepartmentID
FROM Employees e1
WHERE e1.Salary > (
    SELECT AVG(e2.Salary)
    FROM Employees e2
    WHERE e2.DepartmentID = e1.DepartmentID
);

Performance Considerations

While subqueries offer flexibility, it's important to consider their performance implications. For complex queries, alternatives like JOIN operations or Common Table Expressions (CTEs) might offer better performance. Always test your queries with realistic data volumes.

Alternatives to Subqueries

Understanding and effectively using subqueries is a fundamental skill for any SQL developer. Experiment with different types and structures to master their application in various scenarios.