Topic 5 of 12

SQL: Joins, Aggregations & Subqueries

This is where SQL gets powerful. Combine tables, calculate totals, and answer complex business questions.

๐Ÿ“šIntermediate
โฑ๏ธ20 min
โœ…7 quizzes

Why Joins & Aggregations Matter

In real companies, data is never in one table. You'll have:

  • Customers table
  • Orders table
  • Products table
  • Payments table

Joins let you combine them. Aggregations let you calculate totals, averages, and counts.

Real-world: "What's the total revenue per customer in Q1?" requires both joins AND aggregations.


Part 1: Aggregate Functions

Calculate summaries across rows.

COUNT โ€” Count Rows

query.sqlSQL
SELECT COUNT(*) AS total_orders
FROM orders;

Count non-null values in a column:

query.sqlSQL
SELECT COUNT(phone) AS customers_with_phone
FROM customers;

SUM โ€” Add Up Values

query.sqlSQL
SELECT SUM(amount) AS total_revenue
FROM orders;

AVG โ€” Calculate Average

query.sqlSQL
SELECT AVG(amount) AS average_order_value
FROM orders;

MIN / MAX โ€” Find Extremes

query.sqlSQL
SELECT 
  MIN(amount) AS smallest_order,
  MAX(amount) AS largest_order
FROM orders;

GROUP BY โ€” Group Data Before Aggregating

GROUP BY splits data into groups and calculates aggregates for each group.

Example: Revenue per City

query.sqlSQL
SELECT 
  city,
  SUM(amount) AS total_revenue
FROM orders
GROUP BY city;

Result: | city | total_revenue | |------|---------------| | Mumbai | 500000 | | Delhi | 350000 |

Example: Count Orders per Customer

query.sqlSQL
SELECT 
  customer_id,
  COUNT(*) AS order_count
FROM orders
GROUP BY customer_id;

Multiple Columns in GROUP BY

query.sqlSQL
SELECT 
  city,
  product_category,
  SUM(amount) AS revenue
FROM orders
GROUP BY city, product_category
ORDER BY city, revenue DESC;

HAVING โ€” Filter After Grouping

WHERE filters rows BEFORE grouping. HAVING filters groups AFTER aggregation.

Example: Cities with Revenue > โ‚น100K

query.sqlSQL
SELECT 
  city,
  SUM(amount) AS total_revenue
FROM orders
GROUP BY city
HAVING SUM(amount) > 100000;

WHERE vs HAVING

query.sqlSQL
SELECT 
  city,
  COUNT(*) AS order_count
FROM orders
WHERE status = 'Delivered'  -- Filter rows first
GROUP BY city
HAVING COUNT(*) > 50;        -- Then filter groups

Part 2: SQL Joins

The Problem

customers table: | customer_id | name | |-------------|------| | 1 | Rahul | | 2 | Priya |

orders table: | order_id | customer_id | amount | |----------|-------------|--------| | 101 | 1 | 5000 | | 102 | 1 | 3000 | | 103 | 2 | 2000 |

Question: Show customer names with their order amounts.

Solution: JOIN the tables!


INNER JOIN โ€” Most Common

Returns only rows that have matches in BOTH tables.

query.sqlSQL
SELECT 
  customers.name,
  orders.amount
FROM customers
INNER JOIN orders ON customers.customer_id = orders.customer_id;

Result: | name | amount | |------|--------| | Rahul | 5000 | | Rahul | 3000 | | Priya | 2000 |

Using Aliases for Shorter Syntax

query.sqlSQL
SELECT 
  c.name,
  o.amount
FROM customers c
INNER JOIN orders o ON c.customer_id = o.customer_id;

LEFT JOIN โ€” Keep All Left Table Rows

Returns ALL rows from the left table, even if no match exists.

query.sqlSQL
SELECT 
  c.name,
  o.amount
FROM customers c
LEFT JOIN orders o ON c.customer_id = o.customer_id;

Result: | name | amount | |------|--------| | Rahul | 5000 | | Rahul | 3000 | | Priya | 2000 | | Amit | NULL | โ† Customer with no orders

When to use: Find customers who haven't ordered, employees without departments, etc.


RIGHT JOIN โ€” Keep All Right Table Rows

Same as LEFT JOIN, but keeps all rows from the right table.

query.sqlSQL
SELECT 
  c.name,
  o.amount
FROM customers c
RIGHT JOIN orders o ON c.customer_id = o.customer_id;

Multiple Joins

query.sqlSQL
SELECT 
  c.name,
  o.order_id,
  p.product_name,
  o.amount
FROM customers c
INNER JOIN orders o ON c.customer_id = o.customer_id
INNER JOIN products p ON o.product_id = p.product_id;

Part 3: Subqueries

A query inside another query.

Example: Orders Above Average

query.sqlSQL
SELECT *
FROM orders
WHERE amount > (SELECT AVG(amount) FROM orders);

Subquery in FROM Clause

query.sqlSQL
SELECT 
  city,
  avg_revenue
FROM (
  SELECT 
    city,
    AVG(amount) AS avg_revenue
  FROM orders
  GROUP BY city
) AS city_stats
WHERE avg_revenue > 50000;

Real-World Example

Goal: Find top 5 customers by total spend in 2026.

query.sqlSQL
SELECT 
  c.name,
  COUNT(o.order_id) AS total_orders,
  SUM(o.amount) AS total_spent
FROM customers c
INNER JOIN orders o ON c.customer_id = o.customer_id
WHERE YEAR(o.order_date) = 2026
GROUP BY c.customer_id, c.name
ORDER BY total_spent DESC
LIMIT 5;

Summary

โœ… Aggregate functions: COUNT, SUM, AVG, MIN, MAX โœ… GROUP BY groups data before aggregation โœ… HAVING filters groups after aggregation โœ… INNER JOIN returns matching rows from both tables โœ… LEFT JOIN keeps all left table rows โœ… Multiple joins combine 3+ tables โœ… Subqueries nest queries for complex logic

Next Topic: Python Basics for Analysts ๐Ÿš€