Aggregate functions perform a calculation on a set of values and return a single scalar value. They are often used to summarize data. These functions can be used in the SELECT
list, ORDER BY
clause, and in HAVING
clauses of queries.
Standard Aggregate Functions
These are the most commonly used aggregate functions in SQL:
Function | Description | Syntax Example |
---|---|---|
COUNT() |
Returns the number of rows in a specified column or the total number of rows in a result set. | SELECT COUNT(column_name) FROM table_name; SELECT COUNT(*) FROM table_name; |
SUM() |
Returns the sum of all the values in a numeric column. | SELECT SUM(column_name) FROM table_name; |
AVG() |
Returns the average value of a numeric column. | SELECT AVG(column_name) FROM table_name; |
MIN() |
Returns the smallest value in a column. | SELECT MIN(column_name) FROM table_name; |
MAX() |
Returns the largest value in a column. | SELECT MAX(column_name) FROM table_name; |
Example Usage:
-- Get the total number of products
SELECT COUNT(*) AS TotalProducts
FROM Production.Product;
-- Get the average list price of products
SELECT AVG(ListPrice) AS AverageListPrice
FROM Production.Product;
-- Get the highest salary from the Employee table
SELECT MAX(Salary) AS HighestSalary
FROM HumanResources.Employee;
Advanced Aggregate Functions
SQL Server also provides more advanced aggregate functions for specific use cases:
Function | Description | Syntax Example |
---|---|---|
STDEV() , STDEVP() |
Calculates the statistical standard deviation. | SELECT STDEV(numeric_column) FROM table_name; |
VAR() , VARP() |
Calculates the statistical variance. | SELECT VAR(numeric_column) FROM table_name; |
CHECKSUM_AGG() |
Computes the checksum of a set of values. | SELECT CHECKSUM_AGG(binary_column) FROM table_name; |
Using with GROUP BY
The GROUP BY
clause is used in conjunction with aggregate functions to group rows that have the same values in one or more columns into a summary row, like "find the average salary for each department".
SELECT DepartmentID, AVG(Salary) AS AverageSalary
FROM HumanResources.Employee
GROUP BY DepartmentID;
Using with HAVING
The HAVING
clause is used to filter groups based on a specified condition after the aggregation has been performed. It's like a WHERE
clause for groups.
SELECT DepartmentID, AVG(Salary) AS AverageSalary
FROM HumanResources.Employee
GROUP BY DepartmentID
HAVING AVG(Salary) > 50000; -- Only show departments with an average salary over 50000
DISTINCT Keyword
The DISTINCT
keyword can be used with aggregate functions like COUNT()
to count only the unique values in a column.
-- Count the number of unique job titles
SELECT COUNT(DISTINCT JobTitle) AS NumberOfUniqueJobTitles
FROM HumanResources.Employee;
NULL Handling
Most aggregate functions, except for COUNT(*)
, ignore NULL
values in the column being aggregated. COUNT(*)
counts all rows, including those with NULL
values.
SUM()
, AVG()
, MIN()
, MAX()
, STDEV()
, VAR()
ignore NULL
values.
Performance Considerations
- Aggregate functions can be resource-intensive, especially on large tables.
- Ensure appropriate indexes are in place for columns used in
WHERE
,GROUP BY
, andORDER BY
clauses. - Consider pre-aggregating data if queries are run frequently and the underlying data doesn't change rapidly.