SQL GROUP BY Queries

The GROUP BY clause in SQL is used to arrange identical data into groups. It is often used with aggregate functions (like COUNT, MAX, MIN, SUM, and AVG) to perform calculations on each group.

Purpose of GROUP BY

When you want to retrieve summarized information from your database based on specific criteria, GROUP BY is your tool. For example, if you have a table of sales and you want to know the total sales amount for each product category, you would use GROUP BY to group the sales by category and then use the SUM aggregate function.

Syntax

SELECT column1, aggregate_function(column2)
FROM table_name
WHERE condition
GROUP BY column1
ORDER BY column1;
        

Explanation:

Example Scenario

Let's consider a table named Orders:

OrderID CustomerID OrderDate TotalAmount
1 101 2023-01-15 150.75
2 102 2023-01-16 85.50
3 101 2023-01-17 210.00
4 103 2023-01-18 55.20
5 102 2023-01-19 120.00
6 101 2023-01-20 75.00

Example 1: Count of Orders per Customer

To find out how many orders each customer has placed:

SELECT CustomerID, COUNT(OrderID) AS NumberOfOrders
FROM Orders
GROUP BY CustomerID;
        

Result:

CustomerID NumberOfOrders
101 3
102 2
103 1

Example 2: Total Amount Spent per Customer

To calculate the total amount spent by each customer:

SELECT CustomerID, SUM(TotalAmount) AS TotalSpent
FROM Orders
GROUP BY CustomerID;
        

Result:

CustomerID TotalSpent
101 435.75
102 205.50
103 55.20

GROUP BY with WHERE

You can combine WHERE with GROUP BY to filter rows before they are grouped. For instance, to find the total amount spent by each customer only for orders placed after '2023-01-16':

SELECT CustomerID, SUM(TotalAmount) AS TotalSpent
FROM Orders
WHERE OrderDate > '2023-01-16'
GROUP BY CustomerID;
        

Result:

CustomerID TotalSpent
101 285.00
102 120.00

HAVING Clause

The HAVING clause is used to filter groups based on a specified condition, similar to how WHERE filters rows. You cannot use WHERE with aggregate functions.

Example 3: Customers with More Than 1 Order

To find customers who have placed more than 1 order:

SELECT CustomerID, COUNT(OrderID) AS NumberOfOrders
FROM Orders
GROUP BY CustomerID
HAVING COUNT(OrderID) > 1;
        

Result:

CustomerID NumberOfOrders
101 3
102 2

Key Distinction: WHERE vs. HAVING

Grouping by Multiple Columns

You can group rows based on the unique combination of values in multiple columns.

Example 4: Orders per Customer per Day

Suppose we have a Sales table with CustomerID, SaleDate, and Amount. To find the total sales for each customer on each specific date:

SELECT CustomerID, SaleDate, SUM(Amount) AS DailyTotal
FROM Sales
GROUP BY CustomerID, SaleDate
ORDER BY CustomerID, SaleDate;
        

Best Practices