Topic 4 of 12

Introduction to SQL

SQL is the #1 skill employers look for in data analysts. Master these basics and unlock access to billions of rows of data.

📚Beginner
⏱️18 min
6 quizzes

What is SQL?

SQL (Structured Query Language) is the universal language for talking to databases. Think of it as asking questions to your data warehouse.

Real-world analogy: Imagine a library with millions of books. SQL is how you ask the librarian: "Show me all sci-fi books published after 2020, sorted by rating."

Why SQL Matters for Data Analysts

Most requested skill: 80% of data analyst jobs require SQL ✅ Handles massive data: Query millions of rows in seconds ✅ Industry standard: Works with MySQL, PostgreSQL, SQL Server, Oracle, BigQuery ✅ No UI limits: Excel crashes at 1M rows; SQL doesn't

Industry Insight: Knowing SQL increases your starting salary by ₹2-4 LPA on average.


Database Basics

What is a Database?

A database is an organized collection of data stored in tables (like Excel sheets, but way more powerful).

Example: E-commerce Database

Table: customers | customer_id | name | email | city | |-------------|------|-------|------| | 1 | Rahul | r@ex.com | Mumbai | | 2 | Priya | p@ex.com | Delhi |

Table: orders | order_id | customer_id | product | amount | |----------|-------------|---------|--------| | 101 | 1 | Laptop | 45000 | | 102 | 2 | Mouse | 500 |

Key Concepts

  • Table: Like an Excel sheet (rows = records, columns = fields)
  • Row: A single record (one customer, one order)
  • Column: A field (name, email, amount)
  • Primary Key: Unique identifier (customer_id, order_id)
  • Foreign Key: Links tables together (customer_id in orders table)

Your First SQL Query: SELECT

The most important SQL command. It's how you retrieve data.

Basic Syntax

query.sqlSQL
SELECT column1, column2
FROM table_name;

Example 1: Select Specific Columns

query.sqlSQL
SELECT name, email
FROM customers;

Result: | name | email | |-------|----------| | Rahul | r@ex.com | | Priya | p@ex.com |

Example 2: Select All Columns

query.sqlSQL
SELECT *
FROM customers;

* means "all columns"

When to use:

  • Exploring new data → Use SELECT *
  • Production queries → List specific columns (faster, clearer)

Filtering with WHERE

WHERE lets you filter rows based on conditions.

Syntax

query.sqlSQL
SELECT column1, column2
FROM table_name
WHERE condition;

Example: Filter by City

query.sqlSQL
SELECT name, city
FROM customers
WHERE city = 'Mumbai';

Result: Only customers from Mumbai

Comparison Operators

| Operator | Meaning | Example | |----------|---------|---------| | = | Equals | WHERE city = 'Delhi' | | != or <> | Not equals | WHERE city != 'Mumbai' | | > | Greater than | WHERE amount > 1000 | | < | Less than | WHERE amount < 500 | | >= | Greater or equal | WHERE amount >= 1000 | | <= | Less or equal | WHERE amount <= 500 |

Example: Numeric Filters

query.sqlSQL
SELECT product, amount
FROM orders
WHERE amount > 10000;

Result: Only orders above ₹10,000


Combining Conditions: AND, OR, NOT

AND — Both conditions must be true

query.sqlSQL
SELECT name, city
FROM customers
WHERE city = 'Mumbai' AND age > 25;

OR — At least one condition must be true

query.sqlSQL
SELECT name, city
FROM customers
WHERE city = 'Mumbai' OR city = 'Delhi';

NOT — Opposite of the condition

query.sqlSQL
SELECT name, city
FROM customers
WHERE NOT city = 'Mumbai';
// Same as: WHERE city != 'Mumbai'

Complex Example

query.sqlSQL
SELECT product, amount
FROM orders
WHERE (city = 'Mumbai' OR city = 'Delhi')
  AND amount > 5000
  AND status = 'Delivered';

Pattern Matching with LIKE

LIKE searches for patterns in text.

Wildcards

  • % = any number of characters
  • _ = exactly one character

Examples

Names starting with 'R':

query.sqlSQL
SELECT name FROM customers WHERE name LIKE 'R%';
// Returns: Rahul, Rohan, Riya

Names ending with 'a':

query.sqlSQL
SELECT name FROM customers WHERE name LIKE '%a';
// Returns: Priya, Neha

Email from Gmail:

query.sqlSQL
SELECT email FROM customers WHERE email LIKE '%@gmail.com';

Product codes with 'LAP' anywhere:

query.sqlSQL
SELECT product FROM orders WHERE product LIKE '%LAP%';

Sorting Results: ORDER BY

ORDER BY sorts your results.

Syntax

query.sqlSQL
SELECT column1, column2
FROM table_name
ORDER BY column1 ASC;  -- Ascending (default)
ORDER BY column1 DESC; -- Descending

Example: Sort by Amount (Highest First)

query.sqlSQL
SELECT product, amount
FROM orders
ORDER BY amount DESC;

Multiple Sort Columns

query.sqlSQL
SELECT name, city, age
FROM customers
ORDER BY city ASC, age DESC;

Result: Sorted by city (A-Z), then within each city by age (highest first)


Limiting Results: LIMIT

LIMIT restricts the number of rows returned.

Syntax

query.sqlSQL
SELECT column1, column2
FROM table_name
LIMIT number;

Example: Top 10 Orders

query.sqlSQL
SELECT product, amount
FROM orders
ORDER BY amount DESC
LIMIT 10;

Pro Tip: Always use LIMIT when exploring large tables to avoid crashing your system.

Pagination (OFFSET)

query.sqlSQL
SELECT name, email
FROM customers
LIMIT 10 OFFSET 20;

Result: Skip first 20 rows, show next 10 (rows 21-30)


Handling Missing Data: NULL

NULL means "no value" (not zero, not empty string — literally nothing).

Check for NULL

query.sqlSQL
SELECT name, phone
FROM customers
WHERE phone IS NULL;

Check for NOT NULL

query.sqlSQL
SELECT name, phone
FROM customers
WHERE phone IS NOT NULL;

Common Mistake: Don't use = NULL — it won't work! Always use IS NULL or IS NOT NULL.


Filtering with IN

IN checks if a value matches any item in a list.

Example: Multiple Cities

Instead of:

query.sqlSQL
WHERE city = 'Mumbai' OR city = 'Delhi' OR city = 'Bangalore'

Use:

query.sqlSQL
WHERE city IN ('Mumbai', 'Delhi', 'Bangalore')

Example: Exclude Certain Values

query.sqlSQL
SELECT name, status
FROM orders
WHERE status NOT IN ('Cancelled', 'Returned');

BETWEEN — Range Filtering

BETWEEN checks if a value falls within a range (inclusive).

Example: Orders Between ₹5K and ₹20K

query.sqlSQL
SELECT product, amount
FROM orders
WHERE amount BETWEEN 5000 AND 20000;

Equivalent to:

query.sqlSQL
WHERE amount >= 5000 AND amount <= 20000

Example: Date Ranges

query.sqlSQL
SELECT order_id, order_date
FROM orders
WHERE order_date BETWEEN '2026-01-01' AND '2026-01-31';

DISTINCT — Remove Duplicates

DISTINCT returns only unique values.

Example: All Unique Cities

query.sqlSQL
SELECT DISTINCT city
FROM customers;

Result: Each city appears only once

Count Unique Values

query.sqlSQL
SELECT COUNT(DISTINCT city) AS unique_cities
FROM customers;

Aliasing with AS

AS renames columns or tables (makes results clearer).

Column Aliases

query.sqlSQL
SELECT
  name AS customer_name,
  email AS contact_email,
  amount * 1.18 AS amount_with_gst
FROM orders;

Result Columns: customer_name, contact_email, amount_with_gst

Calculations in SELECT

query.sqlSQL
SELECT
  product,
  price,
  quantity,
  price * quantity AS total_value
FROM orders;

Putting It All Together: Real Query

Goal: Find high-value orders from Mumbai in January 2026, sorted by amount.

query.sqlSQL
SELECT
  order_id,
  customer_name,
  product,
  amount,
  order_date
FROM orders
WHERE city = 'Mumbai'
  AND order_date BETWEEN '2026-01-01' AND '2026-01-31'
  AND amount > 10000
  AND status = 'Delivered'
ORDER BY amount DESC
LIMIT 20;

This query:

  1. Selects specific columns
  2. Filters by city, date range, amount, and status
  3. Sorts by amount (highest first)
  4. Limits to top 20 results

SQL Best Practices

1. Use Comments

query.sqlSQL
-- This query finds top customers
SELECT name, SUM(amount) AS total_spent
FROM orders
WHERE status = 'Delivered' -- Only completed orders
GROUP BY name;

2. Format for Readability

Bad:

query.sqlSQL
SELECT name,email,city FROM customers WHERE city='Mumbai' ORDER BY name;

Good:

query.sqlSQL
SELECT
  name,
  email,
  city
FROM customers
WHERE city = 'Mumbai'
ORDER BY name;

3. Use Meaningful Aliases

Bad: SELECT SUM(a) AS x Good: SELECT SUM(amount) AS total_revenue

4. Always LIMIT When Testing

query.sqlSQL
SELECT * FROM huge_table LIMIT 100; -- Safe exploration

Common Beginner Mistakes

❌ Mistake 1: Forgetting Quotes Around Text

Wrong:

query.sqlSQL
WHERE city = Mumbai

Right:

query.sqlSQL
WHERE city = 'Mumbai'

❌ Mistake 2: Using = NULL

Wrong:

query.sqlSQL
WHERE phone = NULL

Right:

query.sqlSQL
WHERE phone IS NULL

❌ Mistake 3: Case Sensitivity in Strings

Some databases are case-sensitive:

query.sqlSQL
WHERE name = 'rahul'  -- Won't match 'Rahul'

Solution: Use LOWER() or UPPER()

query.sqlSQL
WHERE LOWER(name) = 'rahul'

Summary

SELECT retrieves data from tables ✅ WHERE filters rows based on conditions ✅ AND/OR/NOT combine multiple conditions ✅ LIKE searches text patterns ✅ ORDER BY sorts results (ASC/DESC) ✅ LIMIT restricts number of rows ✅ DISTINCT removes duplicates ✅ NULL handling with IS NULL / IS NOT NULL ✅ IN checks against multiple values ✅ BETWEEN filters ranges ✅ AS creates aliases for readability

Next Topic: SQL: Joins, Aggregations & Subqueries

Ready for the advanced stuff? Let's combine tables and calculate metrics! 🚀