Hypothesis Testing

Learn to test claims with data

What is Hypothesis Testing?

A statistical way to decide whether sample data give enough evidence to support a claim.

Example: "Is this new drug effective?" or "Do customers prefer design A?"

The Two Hypotheses

Null Hypothesis (H₀)

  • The "nothing special" claim
  • What we assume is true until the data give strong evidence against it
  • Example: "The drug has no effect"

Alternative Hypothesis (H₁)

  • What we want to find evidence for (a test can support it, not prove it)
  • Example: "The drug works"

Steps of Hypothesis Testing

  1. State H₀ and H₁
  2. Collect data
  3. Calculate test statistic
  4. Find p-value
  5. Make decision
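
The five steps can be sketched by hand for a one-sample test. This is only an illustration: the box weights and the 500 g label are hypothetical data, and the formula used is the standard one-sample t statistic, t = (sample mean − claimed mean) / (s / √n).

```python
import numpy as np
from scipy import stats

# Step 1: State H0 and H1 (hypothetical claim: boxes weigh 500 g on average)
# H0: mean = 500, H1: mean != 500
mu0 = 500.0
alpha = 0.05

# Step 2: Collect data (made-up weights in grams, for illustration only)
sample = np.array([498.2, 501.1, 499.5, 497.8, 500.4, 498.9, 499.1, 498.0])
n = len(sample)

# Step 3: Calculate the test statistic
# ddof=1 gives the sample standard deviation
standard_error = sample.std(ddof=1) / np.sqrt(n)
t_stat = (sample.mean() - mu0) / standard_error

# Step 4: Find the two-tailed p-value from the t distribution (n - 1 degrees of freedom)
p_value = 2 * stats.t.sf(abs(t_stat), df=n - 1)

# Step 5: Make a decision
decision = "Reject H0" if p_value < alpha else "Cannot reject H0"

print(f"T-statistic: {t_stat:.3f}")
print(f"P-value: {p_value:.4f}")
print(decision)
```

Steps 3 and 4 are what `stats.ttest_1samp(sample, mu0)` does in a single call.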

P-value

The probability of getting a result at least as extreme as ours, assuming H₀ is true.

  • Low p-value (< 0.05): Reject H₀, evidence for H₁
  • High p-value (≥ 0.05): Can't reject H₀ (this does not prove H₀ is true)
code.py
# If p-value < 0.05, we say result is "statistically significant"
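
One way to see what a p-value measures is to simulate a world where H₀ is true and count how often the result is as extreme as the one observed. The sketch below assumes a hypothetical coin that landed heads 60 times in 100 flips, with H₀ being "the coin is fair". The numbers are made up for illustration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

n_flips = 100
observed_heads = 60  # hypothetical observation

# Simulate 100,000 experiments in which H0 (fair coin) really is true
sim_heads = rng.binomial(n_flips, 0.5, size=100_000)

# "At least as extreme" = at least as far from 50 heads as our result, in either direction
observed_gap = abs(observed_heads - n_flips / 2)
sim_p = np.mean(np.abs(sim_heads - n_flips / 2) >= observed_gap)

# Exact two-tailed p-value from the binomial distribution, for comparison
# sf(k) is P(X > k), so sf(59) is P(X >= 60); doubling works because a fair coin is symmetric
exact_p = 2 * stats.binom.sf(observed_heads - 1, n_flips, 0.5)

print(f"Simulated p-value: {sim_p:.4f}")
print(f"Exact p-value:     {exact_p:.4f}")
```

The two numbers should agree closely, because the p-value is the long-run share of results this extreme when H₀ holds.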

One-Sample T-Test

Test whether a population mean equals a specific value:

code.py
from scipy import stats
import numpy as np

# Company claims average battery lasts 10 hours
# Our sample of 30 batteries:
sample = [9.8, 10.2, 9.5, 10.1, 9.9, 10.3, 9.7, 10.0, 9.6, 10.1,
          9.9, 10.0, 9.8, 10.2, 9.7, 10.1, 9.9, 9.6, 10.0, 9.8,
          10.1, 9.7, 10.0, 9.9, 10.2, 9.8, 10.1, 9.6, 10.0, 9.9]

# H₀: mean = 10
# H₁: mean ≠ 10

t_stat, p_value = stats.ttest_1samp(sample, 10)

print(f"T-statistic: {t_stat:.3f}")
print(f"P-value: {p_value:.3f}")

if p_value < 0.05:
    print("Reject H₀: Mean is different from 10")
else:
    print("Cannot reject H₀: No evidence mean differs from 10")

Two-Sample T-Test

Compare means of two groups:

code.py
# Test scores: Class A vs Class B
class_a = [85, 90, 78, 92, 88, 76, 95, 89, 82, 91]
class_b = [78, 82, 75, 80, 77, 83, 79, 81, 76, 84]

# H₀: means are equal
# H₁: means are different

t_stat, p_value = stats.ttest_ind(class_a, class_b)

print(f"Class A mean: {np.mean(class_a):.1f}")
print(f"Class B mean: {np.mean(class_b):.1f}")
print(f"P-value: {p_value:.4f}")

if p_value < 0.05:
    print("Significant difference between classes")
else:
    print("No significant difference")
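
By default, `ttest_ind` assumes both groups have the same variance. When that assumption is doubtful, Welch's t-test is a common alternative, and SciPy runs it when `equal_var=False` is passed. A minimal sketch on the same class scores:

```python
from scipy import stats

class_a = [85, 90, 78, 92, 88, 76, 95, 89, 82, 91]
class_b = [78, 82, 75, 80, 77, 83, 79, 81, 76, 84]

# Standard (Student's) t-test: assumes equal variances
student = stats.ttest_ind(class_a, class_b)

# Welch's t-test: does not assume equal variances
welch = stats.ttest_ind(class_a, class_b, equal_var=False)

print(f"Student's t-test p-value: {student.pvalue:.4f}")
print(f"Welch's t-test p-value:   {welch.pvalue:.4f}")
```

With equal group sizes the two t-statistics match and only the degrees of freedom differ, so the p-values are usually close.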

Paired T-Test

Compare the same group measured twice, such as before and after a treatment:

code.py
# Weight before and after diet program
before = [180, 175, 190, 185, 170, 195, 180, 175, 185, 190]
after = [175, 172, 185, 180, 168, 188, 176, 170, 182, 185]

# H₀: no change (mean difference = 0)
# H₁: there is change

t_stat, p_value = stats.ttest_rel(before, after)

print(f"Average weight loss: {np.mean(before) - np.mean(after):.1f} lbs")
print(f"P-value: {p_value:.4f}")

if p_value < 0.05:
    print("Significant change in weight after the diet program")
else:
    print("No significant change in weight")
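
A paired t-test is the same as a one-sample t-test on the per-person differences, tested against 0. The sketch below checks this on the same weights:

```python
import numpy as np
from scipy import stats

before = np.array([180, 175, 190, 185, 170, 195, 180, 175, 185, 190])
after = np.array([175, 172, 185, 180, 168, 188, 176, 170, 182, 185])

# Difference for each person (positive = weight lost)
diff = before - after

paired = stats.ttest_rel(before, after)
one_sample = stats.ttest_1samp(diff, 0)

print(f"Paired t-test:         t = {paired.statistic:.3f}, p = {paired.pvalue:.4f}")
print(f"One-sample on diffs:   t = {one_sample.statistic:.3f}, p = {one_sample.pvalue:.4f}")
```

This is why the order of values matters: `before[i]` and `after[i]` must belong to the same person.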

One-Tailed vs Two-Tailed

  • Two-tailed: Testing if different (≠)
  • One-tailed: Testing if greater (>) or less (<)

code.py
# One-tailed: Is new method better?
# Divide p-value by 2 and check direction

t_stat, p_value = stats.ttest_ind(class_a, class_b)

# For one-tailed (Class A > Class B)
if t_stat > 0 and p_value/2 < 0.05:
    print("Class A is significantly better")
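
Recent SciPy versions (1.6 and later) also accept an `alternative` argument, which runs the one-tailed test directly. A short sketch comparing it with the halved two-tailed p-value, using the same class scores:

```python
import numpy as np
from scipy import stats

class_a = [85, 90, 78, 92, 88, 76, 95, 89, 82, 91]
class_b = [78, 82, 75, 80, 77, 83, 79, 81, 76, 84]

# Two-tailed test (the default)
two_sided = stats.ttest_ind(class_a, class_b)

# One-tailed test, H1: mean of class_a > mean of class_b (needs SciPy >= 1.6)
one_sided = stats.ttest_ind(class_a, class_b, alternative="greater")

print(f"Two-tailed p-value / 2: {two_sided.pvalue / 2:.4f}")
print(f"One-tailed p-value:     {one_sided.pvalue:.4f}")
```

The two values match only when the t-statistic points in the tested direction (here, t > 0). The `alternative` argument handles the direction check automatically.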

Type I and Type II Errors

                 H₀ True             H₀ False
Reject H₀        Type I Error (α)    Correct!
Don't Reject     Correct!            Type II Error (β)

  • Type I: False positive (see effect that isn't there)
  • Type II: False negative (miss real effect)
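
Both error types can be estimated by simulation. The sketch below runs many t-tests on made-up data: first with no real difference (every rejection is a Type I error), then with a real difference of 5 points (every non-rejection is a Type II error). The group size, means, and spread are assumptions chosen only for illustration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
alpha = 0.05
n_trials = 5000   # number of simulated experiments
n = 30            # observations per group

# Case 1: H0 is true (both groups come from the same distribution)
a = rng.normal(100, 15, size=(n_trials, n))
b = rng.normal(100, 15, size=(n_trials, n))
p_null = stats.ttest_ind(a, b, axis=1).pvalue
type1_rate = np.mean(p_null < alpha)      # rejected a true H0

# Case 2: H0 is false (group b really is 5 points higher)
a = rng.normal(100, 15, size=(n_trials, n))
b = rng.normal(105, 15, size=(n_trials, n))
p_effect = stats.ttest_ind(a, b, axis=1).pvalue
type2_rate = np.mean(p_effect >= alpha)   # failed to reject a false H0

print(f"Type I error rate:  {type1_rate:.3f}")
print(f"Type II error rate: {type2_rate:.3f}")
```

The Type I rate should land near α. The Type II rate depends on sample size and effect size, and with a small effect and only 30 observations per group it can be large.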

Significance Level (α)

  • Usually α = 0.05 (5%)
  • Lower α = harder to reject H₀
  • Common values: 0.01, 0.05, 0.10
code.py
alpha = 0.05

if p_value < alpha:
    print("Reject H₀ at 5% significance level")
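
The same p-value can lead to different decisions depending on α. A small sketch, using a hypothetical p-value of 0.03:

```python
p_value = 0.03  # hypothetical p-value, for illustration only

decisions = {}
for alpha in (0.01, 0.05, 0.10):
    # Reject H0 only when the p-value falls below the chosen threshold
    decisions[alpha] = p_value < alpha
    verdict = "Reject H0" if decisions[alpha] else "Cannot reject H0"
    print(f"alpha = {alpha:.2f}: {verdict}")
```

Pick α before looking at the data, so the threshold is not chosen to fit the result.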

Complete Example

code.py
from scipy import stats
import numpy as np

# A/B Test: Does new website design increase time on site?
# Control: old design, Treatment: new design

np.random.seed(42)
control = np.random.normal(120, 30, 100)    # seconds on site
treatment = np.random.normal(135, 35, 100)  # seconds on site

# H₀: No difference between designs
# H₁: New design increases time

print("=== A/B Test Results ===")
print(f"Control mean: {np.mean(control):.1f} seconds")
print(f"Treatment mean: {np.mean(treatment):.1f} seconds")
print(f"Difference: {np.mean(treatment) - np.mean(control):.1f} seconds")

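# One-tailed test, because H₁ is directional (treatment > control); needs SciPy >= 1.6
t_stat, p_value = stats.ttest_ind(treatment, control, alternative="greater")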

print(f"\nT-statistic: {t_stat:.3f}")
print(f"P-value: {p_value:.4f}")

if p_value < 0.05:
    print("\n✓ New design significantly increases time on site!")
else:
    print("\n✗ No significant difference found")

Key Points

  • H₀: Null hypothesis (no effect)
  • H₁: Alternative hypothesis (effect exists)
  • p-value < α (usually 0.05): Reject H₀
  • Use ttest_1samp for one sample vs value
  • Use ttest_ind for two independent groups
  • Use ttest_rel for paired/before-after
  • Watch out for Type I and Type II errors

What's Next?

Learn about T-tests and Chi-Square tests in detail.