GROUP BY Clause (Transact-SQL)
The GROUP BY clause is used in a SELECT statement to collapse rows that share the same values in the specified columns into one summary row per group, for example to compute the sum, average, or count of values in each group. It is most often used together with aggregate functions.
Syntax
SELECT column_name(s), aggregate_function(column_name)
FROM table_name
[WHERE condition]
GROUP BY column_name(s)
[HAVING condition]
[ORDER BY column_name(s)];
Description
The GROUP BY clause groups rows that have the same value in one or more columns. For each group, it returns a single row that summarizes the data in that group. This is typically used with aggregate functions like COUNT, MAX, MIN, SUM, and AVG.
Key Concepts:
- Aggregation: GROUP BY is closely tied to aggregate functions. You usually select the grouping columns and apply aggregate functions to the other columns.
- Grouping Columns: Every column in the SELECT list that is not wrapped in an aggregate function must appear in the GROUP BY clause.
- ORDER BY: While not required, the ORDER BY clause is often used with GROUP BY to sort the grouped results. Without it, the order of the returned groups is not guaranteed.
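The grouping-column rule can be illustrated with a small sketch. It assumes a hypothetical table variable @Employees with department_id and last_name columns (last_name and the sample rows are invented for illustration and are not part of the examples on this page). The first query is left commented out because SQL Server rejects it: last_name is neither aggregated nor listed in GROUP BY.

```sql
-- Hypothetical sample data, for illustration only.
DECLARE @Employees TABLE (department_id int, last_name varchar(50));
INSERT INTO @Employees (department_id, last_name)
VALUES (10, 'Ames'), (10, 'Baker'), (20, 'Cole');

-- Invalid: last_name is not aggregated and not in GROUP BY.
-- SELECT department_id, last_name, COUNT(*) AS EmployeeCount
-- FROM @Employees
-- GROUP BY department_id;

-- Valid: add the column to GROUP BY
-- (one row per department and last name combination).
SELECT department_id, last_name, COUNT(*) AS EmployeeCount
FROM @Employees
GROUP BY department_id, last_name;

-- Also valid: wrap the column in an aggregate
-- (one row per department).
SELECT department_id, MAX(last_name) AS LastNameMax, COUNT(*) AS EmployeeCount
FROM @Employees
GROUP BY department_id;
```

Which fix is appropriate depends on the summary you want: adding the column to GROUP BY produces finer-grained groups, while aggregating it keeps one row per department.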
Example 1: Counting Employees by Department
This example shows how to count the number of employees in each department.
SELECT department_id, COUNT(*) AS EmployeeCount
FROM Employees
GROUP BY department_id
ORDER BY department_id;
This query will return a result set with two columns: department_id and EmployeeCount. Each row will represent a unique department and the number of employees within it.
Example 2: Calculating Average Salary by Job Title
This example demonstrates calculating the average salary for each distinct job title.
SELECT job_title, AVG(salary) AS AverageSalary
FROM Employees
GROUP BY job_title
ORDER BY AverageSalary DESC;
This query aggregates employees by their job_title and computes the average of their salary, sorting the results by the calculated average salary in descending order.
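One detail worth checking when averaging: in Transact-SQL, AVG over an integer column returns an integer, so the fractional part is discarded. The sketch below assumes, purely for illustration, that salary is stored as int (the page does not state the column type) and uses invented sample rows. Casting to a decimal type before averaging keeps the fraction.

```sql
-- Hypothetical sample data; salary is assumed to be an int column.
DECLARE @Employees TABLE (job_title varchar(50), salary int);
INSERT INTO @Employees (job_title, salary)
VALUES ('Analyst', 50000), ('Analyst', 50001), ('Manager', 80000);

SELECT job_title,
       AVG(salary) AS IntegerAverage,                      -- integer result, fraction dropped
       AVG(CAST(salary AS decimal(12, 2))) AS DecimalAverage -- fraction preserved
FROM @Employees
GROUP BY job_title
ORDER BY DecimalAverage DESC;
```

If the salary column is already a decimal, money, or float type, the cast is unnecessary.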
Example 3: Grouping by Multiple Columns
You can group by multiple columns to create more granular summaries.
SELECT product_category, product_color, COUNT(*) AS ProductCount
FROM Products
GROUP BY product_category, product_color
ORDER BY product_category, product_color;
This query returns one row for each unique combination of product_category and product_color, along with the number of products in that combination. The ORDER BY clause then sorts the rows by category and, within each category, by color.
Important Considerations
- If a SELECT statement includes a GROUP BY clause, then any column in the SELECT list that is not wrapped in an aggregate function must be included in the GROUP BY clause.
- The HAVING clause filters groups based on a specified condition, much as the WHERE clause filters individual rows. WHERE is applied before the rows are grouped; HAVING is applied after grouping and can therefore reference aggregate values.
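A minimal sketch of WHERE and HAVING working together follows. The table variable @Employees, the is_active column, and the sample rows are assumptions made for illustration. WHERE removes individual rows before grouping, and HAVING then removes whole groups based on their aggregate.

```sql
-- Hypothetical sample data, for illustration only.
DECLARE @Employees TABLE (employee_id int, department_id int, is_active bit);
INSERT INTO @Employees (employee_id, department_id, is_active)
VALUES (1, 10, 1), (2, 10, 1), (3, 10, 0),
       (4, 20, 1),
       (5, 30, 1), (6, 30, 1);

SELECT department_id, COUNT(*) AS EmployeeCount
FROM @Employees
WHERE is_active = 1      -- row filter: drop inactive employees before grouping
GROUP BY department_id
HAVING COUNT(*) >= 2     -- group filter: keep departments with at least 2 remaining rows
ORDER BY department_id;
```

With this sample data, departments 10 and 30 are returned, each with a count of 2. Department 20 is dropped by HAVING because only one active employee remains in it.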
For more detailed information on aggregate functions and advanced grouping techniques, please refer to the related documentation.