Working with Multiple Rows
Aggregation: Summarizing data
So far, you've learned to retrieve individual rows. But often, you want to summarize data across multiple rows: "How many users do we have?" or "What's the total amount spent?"
This is where aggregation functions come in.
Aggregation functions return a single value from multiple rows.
Count: How many rows?
COUNT returns the number of rows:
-- Count all rows
SELECT COUNT(*) FROM users;
-- Count non-NULL values in a column
SELECT COUNT(email) FROM users;
Important: COUNT(*) counts all rows, including those with NULL values. COUNT(column) counts only non-NULL values.
Sum: Adding up numbers
SUM adds up all values in a numeric column:
SELECT SUM(amount) FROM expenses;
This returns the total of all amount values.
Avg: Average values
AVG calculates the average (mean) of a numeric column:
SELECT AVG(amount) FROM expenses;
Min and max: Finding extremes
SELECT MIN(amount) FROM expenses; -- Smallest amount
SELECT MAX(amount) FROM expenses; -- Largest amount
Group by: Aggregating by groups
GROUP BY lets you aggregate data within groups. Instead of one summary for the entire table, you get one summary per group.
SELECT category, COUNT(*)
FROM expenses
GROUP BY category;
This returns:
category | count
--------------|------
food | 15
transport | 8
subscription | 3
Each row represents one category and its count.
How GROUP BY works
When you use GROUP BY, SQL:
- Groups rows that have the same value in the grouped column(s)
- Applies the aggregation function to each group
- Returns one row per group
SELECT category, SUM(amount)
FROM expenses
GROUP BY category;
This gives you the total amount spent in each category.
Multiple columns in GROUP BY
You can group by multiple columns:
SELECT user_id, category, COUNT(*)
FROM expenses
GROUP BY user_id, category;
This groups by both user_id and category, giving you counts for each user-category combination.
Having: Filtering groups
WHERE filters rows before grouping. HAVING filters groups after aggregation.
SELECT category, COUNT(*)
FROM expenses
GROUP BY category
HAVING COUNT(*) > 5;
This returns only categories that have more than 5 expenses.
Key difference:
WHEREfilters rows (before aggregation)HAVINGfilters groups (after aggregation)
-- WHERE: filter rows first
SELECT category, COUNT(*)
FROM expenses
WHERE amount > 10 -- Filter rows
GROUP BY category;
-- HAVING: filter groups after
SELECT category, COUNT(*)
FROM expenses
GROUP BY category
HAVING COUNT(*) > 5; -- Filter groups
Complete example
SELECT category, COUNT(*), SUM(amount), AVG(amount)
FROM expenses
WHERE created_at > '2024-01-01' -- Filter rows
GROUP BY category
HAVING COUNT(*) > 5 -- Filter groups
ORDER BY SUM(amount) DESC; -- Sort results
This query:
- Filters to expenses after 2024-01-01
- Groups by category
- Calculates count, sum, and average for each category
- Keeps only categories with more than 5 expenses
- Sorts by total amount (highest first)
When aggregation happens
Understanding the order of operations helps you write correct queries:
FROM → Which table?
WHERE → Filter rows
GROUP BY → Group rows
HAVING → Filter groups
SELECT → Choose columns/aggregations
ORDER BY → Sort results
LIMIT → Limit results
Aggregation happens after WHERE but before ORDER BY.
Common mistakes
Forgetting GROUP BY
-- ❌ Wrong: mixing aggregated and non-aggregated columns
SELECT category, COUNT(*)
FROM expenses;
-- Error: category must appear in GROUP BY
-- ✅ Correct: group by category
SELECT category, COUNT(*)
FROM expenses
GROUP BY category;
Rule: If you use an aggregation function (COUNT, SUM, etc.), all non-aggregated columns must be in GROUP BY.
Using WHERE instead of HAVING
-- ❌ Wrong: can't use WHERE with aggregations
SELECT category, COUNT(*)
FROM expenses
WHERE COUNT(*) > 5
GROUP BY category;
-- ✅ Correct: use HAVING for group filters
SELECT category, COUNT(*)
FROM expenses
GROUP BY category
HAVING COUNT(*) > 5;
NULL in GROUP BY
NULL values form their own group:
SELECT category, COUNT(*)
FROM expenses
GROUP BY category;
If some expenses have category = NULL, they'll all be grouped together.
Practical patterns
Top N per group
-- Top 3 expenses per category
SELECT category, amount, description
FROM expenses
ORDER BY category, amount DESC
LIMIT 3;
-- Note: This is simplified; true "top N per group" requires window functions
Percentage calculations
SELECT
category,
COUNT(*) as count,
COUNT(*) * 100.0 / (SELECT COUNT(*) FROM expenses) as percentage
FROM expenses
GROUP BY category;