Why Joins & Aggregations Matter
In real companies, data is never in one table. You'll have:
- Customers table
- Orders table
- Products table
- Payments table
Joins let you combine them. Aggregations let you calculate totals, averages, and counts.
Real-world: "What's the total revenue per customer in Q1?" requires both joins AND aggregations.
Part 1: Aggregate Functions
Calculate summaries across rows.
COUNT โ Count Rows
SELECT COUNT(*) AS total_orders
FROM orders;Count non-null values in a column:
SELECT COUNT(phone) AS customers_with_phone
FROM customers;SUM โ Add Up Values
SELECT SUM(amount) AS total_revenue
FROM orders;AVG โ Calculate Average
SELECT AVG(amount) AS average_order_value
FROM orders;MIN / MAX โ Find Extremes
SELECT
MIN(amount) AS smallest_order,
MAX(amount) AS largest_order
FROM orders;GROUP BY โ Group Data Before Aggregating
GROUP BY splits data into groups and calculates aggregates for each group.
Example: Revenue per City
SELECT
city,
SUM(amount) AS total_revenue
FROM orders
GROUP BY city;Result: | city | total_revenue | |------|---------------| | Mumbai | 500000 | | Delhi | 350000 |
Example: Count Orders per Customer
SELECT
customer_id,
COUNT(*) AS order_count
FROM orders
GROUP BY customer_id;Multiple Columns in GROUP BY
SELECT
city,
product_category,
SUM(amount) AS revenue
FROM orders
GROUP BY city, product_category
ORDER BY city, revenue DESC;HAVING โ Filter After Grouping
WHERE filters rows BEFORE grouping. HAVING filters groups AFTER aggregation.
Example: Cities with Revenue > โน100K
SELECT
city,
SUM(amount) AS total_revenue
FROM orders
GROUP BY city
HAVING SUM(amount) > 100000;WHERE vs HAVING
SELECT
city,
COUNT(*) AS order_count
FROM orders
WHERE status = 'Delivered' -- Filter rows first
GROUP BY city
HAVING COUNT(*) > 50; -- Then filter groupsPart 2: SQL Joins
The Problem
customers table: | customer_id | name | |-------------|------| | 1 | Rahul | | 2 | Priya |
orders table: | order_id | customer_id | amount | |----------|-------------|--------| | 101 | 1 | 5000 | | 102 | 1 | 3000 | | 103 | 2 | 2000 |
Question: Show customer names with their order amounts.
Solution: JOIN the tables!
INNER JOIN โ Most Common
Returns only rows that have matches in BOTH tables.
SELECT
customers.name,
orders.amount
FROM customers
INNER JOIN orders ON customers.customer_id = orders.customer_id;Result: | name | amount | |------|--------| | Rahul | 5000 | | Rahul | 3000 | | Priya | 2000 |
Using Aliases for Shorter Syntax
SELECT
c.name,
o.amount
FROM customers c
INNER JOIN orders o ON c.customer_id = o.customer_id;LEFT JOIN โ Keep All Left Table Rows
Returns ALL rows from the left table, even if no match exists.
SELECT
c.name,
o.amount
FROM customers c
LEFT JOIN orders o ON c.customer_id = o.customer_id;Result: | name | amount | |------|--------| | Rahul | 5000 | | Rahul | 3000 | | Priya | 2000 | | Amit | NULL | โ Customer with no orders
When to use: Find customers who haven't ordered, employees without departments, etc.
RIGHT JOIN โ Keep All Right Table Rows
Same as LEFT JOIN, but keeps all rows from the right table.
SELECT
c.name,
o.amount
FROM customers c
RIGHT JOIN orders o ON c.customer_id = o.customer_id;Multiple Joins
SELECT
c.name,
o.order_id,
p.product_name,
o.amount
FROM customers c
INNER JOIN orders o ON c.customer_id = o.customer_id
INNER JOIN products p ON o.product_id = p.product_id;Part 3: Subqueries
A query inside another query.
Example: Orders Above Average
SELECT *
FROM orders
WHERE amount > (SELECT AVG(amount) FROM orders);Subquery in FROM Clause
SELECT
city,
avg_revenue
FROM (
SELECT
city,
AVG(amount) AS avg_revenue
FROM orders
GROUP BY city
) AS city_stats
WHERE avg_revenue > 50000;Real-World Example
Goal: Find top 5 customers by total spend in 2026.
SELECT
c.name,
COUNT(o.order_id) AS total_orders,
SUM(o.amount) AS total_spent
FROM customers c
INNER JOIN orders o ON c.customer_id = o.customer_id
WHERE YEAR(o.order_date) = 2026
GROUP BY c.customer_id, c.name
ORDER BY total_spent DESC
LIMIT 5;Summary
โ Aggregate functions: COUNT, SUM, AVG, MIN, MAX โ GROUP BY groups data before aggregation โ HAVING filters groups after aggregation โ INNER JOIN returns matching rows from both tables โ LEFT JOIN keeps all left table rows โ Multiple joins combine 3+ tables โ Subqueries nest queries for complex logic
Next Topic: Python Basics for Analysts ๐