Zomato: Company Context
Zomato started in 2008 as a restaurant discovery platform (menu listings, reviews) and evolved into India's leading food delivery marketplace. After acquiring Blinkit (quick commerce) and expanding into dining-out services, Zomato now operates across the food ecosystem, from discovery to delivery.
Key Metrics (2026)
- 80+ million monthly active users (MAU)
- 250,000+ restaurant partners across 1,000+ cities
- 180,000+ delivery partners
- 1 million+ orders/day (365M annually)
- ₹9,000+ crore annual revenue (2025)
- Zomato Gold: 5M+ paid members (dining + delivery subscriptions)
Data Infrastructure
Zomato's analytics runs on:
- User behavior tracking: Clickstream data capturing every search, filter, restaurant view, order
- Geospatial database: Real-time location tracking of delivery partners + restaurant proximity
- ML platform: Recommendation engine, delivery time prediction, churn prediction models
- A/B testing framework: 50+ experiments running on app UI, pricing, notifications
- Operational dashboards: Real-time monitoring of order flow, delivery SLAs, restaurant performance
Analytics Team Structure
- Growth Analytics: User acquisition, activation, retention, monetization (AARRR funnel)
- Restaurant Analytics: Partner onboarding, menu optimization, demand forecasting for restaurants
- Delivery Analytics: Route optimization, ETA prediction, delivery partner allocation
- Product Analytics: Feature adoption, app engagement, search relevance
- Finance Analytics: Unit economics, profitability by city/restaurant, CAC/LTV modeling
Zomato's analytics system is like a matchmaking platform for food — understanding what you crave (Mexican at 2 PM vs comfort food at 11 PM), which restaurants can fulfill it fastest, and how to keep you coming back (personalized offers, loyalty rewards) — all optimized through millions of data points collected daily.
The Business Problems
Zomato faces three core analytics challenges:
1. Restaurant Discovery: Paradox of Choice
Problem: With 250K+ restaurants, users get overwhelmed and abandon the app without ordering.
Challenge:
- Search ambiguity: User searches "pizza" → 5,000 results in Bangalore (which to show first?)
- Preference diversity: Same user orders sushi (healthy) and biryani (comfort food) on different days
- Context matters: Lunch searches prioritize speed (fast delivery), dinner prioritizes quality (ratings)
- Cold start: New users have no order history (what to recommend?)
Traditional approach: Rank restaurants by ratings + popularity → Result: 40% of users don't find what they want (high bounce rate)
Data-driven approach: Personalized ranking using collaborative filtering + contextual signals → Result: 25% bounce rate (38% relative improvement) + order conversion up from 8% to 12%
2. Delivery Time Accuracy: The 30-Minute Promise
Problem: Inaccurate ETAs lead to refunds, bad reviews, and customer churn.
Challenge:
- Restaurant delays: Food prep time varies (15-45 min depending on kitchen load)
- Traffic variability: Same route takes 8 min (midnight) vs 25 min (rush hour)
- Weather impact: Rain increases delivery time by 35-40%
- Last-mile complexity: Apartment complex deliveries take 5-10 min longer (parking, security, finding flat)
Traditional approach: Fixed 30-40 min ETA for all orders → Result: 20% late deliveries (refund costs ₹25 crore/year)
Data-driven approach: ML-powered ETA prediction with real-time adjustments → Result: 8% late deliveries (60% reduction) + ±3 min accuracy
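The delay factors listed above can be combined into a rough additive baseline, which helps show why a single fixed ETA cannot fit every order. This is a minimal sketch and not Zomato's ML model: the function name `estimate_eta_minutes`, its parameters, the 1.35 rain multiplier (lower end of the 35-40% range), the 5-minute apartment overhead (lower end of 5-10 minutes), and the choice to apply the rain slowdown to the travel leg only are all assumptions made for illustration.

```python
def estimate_eta_minutes(prep_min, travel_min, is_raining=False, is_apartment=False):
    """Rough ETA baseline: kitchen prep + travel, adjusted for rain and last-mile overhead."""
    RAIN_MULTIPLIER = 1.35     # assumption: lower end of the 35-40% rain slowdown
    APARTMENT_EXTRA_MIN = 5    # assumption: lower end of the 5-10 min last-mile overhead

    # Assumption: rain slows the travel leg only, not kitchen prep
    travel = travel_min * (RAIN_MULTIPLIER if is_raining else 1.0)
    last_mile = APARTMENT_EXTRA_MIN if is_apartment else 0
    return round(prep_min + travel + last_mile)


# Best case: quiet kitchen, midnight traffic, clear weather
print(estimate_eta_minutes(prep_min=15, travel_min=8))
# Worst case: busy kitchen, rush hour, rain, apartment complex
print(estimate_eta_minutes(prep_min=45, travel_min=25, is_raining=True, is_apartment=True))
```

The two calls give 23 and 84 minutes, so a fixed 30-40 minute promise is too pessimistic for the first order and far too optimistic for the second. A learned model would replace the hard-coded constants with values fitted to historical deliveries.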
3. Customer Retention: High Churn in Food Delivery
Problem: Only 25% of first-time users place a second order within 30 days (75% churn).
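Typical customer journey:
| Stage | Users | Drop-off vs previous stage |
|-------|-------|----------------------------|
| App Install | 100 | - |
| First Order | 100 | 0% |
| Second Order (30 days) | 25 | 75% |
| Active User (3+ orders/month) | 12 | 52% |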
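Each drop-off percentage in the funnel above is measured against the previous stage, not against the original 100 users. A short sketch of that calculation, using the stage counts from the funnel (the function name `stage_drop_offs` is illustrative):

```python
funnel = [
    ("App Install", 100),
    ("First Order", 100),
    ("Second Order (30 days)", 25),
    ("Active User (3+ orders/month)", 12),
]


def stage_drop_offs(stages):
    """Return (stage name, % of users lost relative to the previous stage)."""
    drops = []
    for (_, prev_users), (name, users) in zip(stages, stages[1:]):
        drops.append((name, round((prev_users - users) / prev_users * 100)))
    return drops


for name, pct in stage_drop_offs(funnel):
    print(f"{name}: {pct}% drop-off")
```

Measuring against the previous stage is what makes the last step read as 52% (13 of 25 users lost) rather than 13%.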
Key insights from analytics:
- Poor first experience: 40% of first orders have issues (late delivery, wrong order, quality complaints)
- Price sensitivity: 60% of churned users cite "too expensive" (vs cooking at home)
- Lack of variety: Users try 1-2 cuisines, then forget about the app
- No habit formation: Food delivery is episodic (weekend treat), not daily habit like groceries
Data-driven solutions:
- Personalized win-back campaigns (discount offers to churned users)
- Zomato Gold (subscription for frequent users)
- Push notifications at meal times (lunch 12-1 PM, dinner 7-9 PM)
Scale context: Reducing first-month churn from 75% → 60% retains 15 more of every 100 first-time users, i.e. 15,000 additional retained users/month on a base of 100,000 first-time users/month. Over 12 months, that's 180,000 users × ₹500 avg LTV = ₹9 crore additional revenue.
Data They Used & Analytics Approach
1. Restaurant Recommendations: Collaborative Filtering + Contextual Ranking
Data sources:
# User order history
{
    "user_id": "U12345",
    "order_history": [
        {"restaurant_id": "R001", "cuisine": "North Indian", "order_time": "13:15", "rating": 4},
        {"restaurant_id": "R045", "cuisine": "Chinese", "order_time": "20:30", "rating": 5},
        {"restaurant_id": "R122", "cuisine": "Italian", "order_time": "21:00", "rating": 4}
    ],
    "search_queries": ["pizza near me", "biryani", "healthy salad"],
    "filters_used": ["veg_only", "rating_4_plus", "delivery_time_30min"]
}
# Restaurant metadata
{
    "restaurant_id": "R001",
    "name": "Punjab Grill",
    "cuisine": ["North Indian", "Mughlai"],
    "avg_rating": 4.2,
    "avg_delivery_time": 35,
    "price_for_two": 800,
    "location_lat_lon": (12.9716, 77.5946),
    "popular_dishes": ["Butter Chicken", "Dal Makhani", "Naan"]
}

SQL: Find restaurants similar to user's past orders
-- Collaborative filtering: Users who ordered from Restaurant A also ordered from Restaurant B
WITH user_restaurant_pairs AS (
    SELECT
        o1.user_id,
        o1.restaurant_id AS restaurant_a,
        o2.restaurant_id AS restaurant_b,
        o1.order_date
    FROM orders o1
    JOIN orders o2
        ON o1.user_id = o2.user_id
        AND o1.restaurant_id != o2.restaurant_id
    WHERE o1.order_date >= CURRENT_DATE - INTERVAL '90 days'
),
restaurant_similarity AS (
    SELECT
        urp.restaurant_a,
        urp.restaurant_b,
        COUNT(DISTINCT urp.user_id) AS users_ordered_both,
        -- Confidence: If user orders from A, probability they also order from B
        -- (denominator restricted to the same 90-day window as the pairs above)
        COUNT(DISTINCT urp.user_id) * 100.0 /
            (SELECT COUNT(DISTINCT o.user_id)
             FROM orders o
             WHERE o.restaurant_id = urp.restaurant_a
               AND o.order_date >= CURRENT_DATE - INTERVAL '90 days')
            AS confidence_pct
    FROM user_restaurant_pairs urp
    GROUP BY urp.restaurant_a, urp.restaurant_b
    HAVING COUNT(DISTINCT urp.user_id) >= 20 -- Minimum support
)
SELECT
    ra.name AS restaurant_a_name,
    rb.name AS restaurant_b_name,
    rs.confidence_pct,
    rb.avg_rating,
    rb.avg_delivery_time,
    rb.price_for_two
FROM restaurant_similarity rs
JOIN restaurants ra ON rs.restaurant_a = ra.restaurant_id
JOIN restaurants rb ON rs.restaurant_b = rb.restaurant_id
WHERE rs.restaurant_a = 'R001' -- Punjab Grill (user's favorite)
ORDER BY rs.confidence_pct DESC
LIMIT 10;

Python: Contextual ranking (time of day, cuisine preference)
def rank_restaurants(user_id, user_context, restaurant_list):
    """
    Rank restaurants based on collaborative filtering + contextual factors

    Args:
        user_id: User identifier
        user_context: {'hour': 13, 'day_of_week': 'Friday', 'weather': 'rainy'}
        restaurant_list: List of candidate restaurants (from collaborative filtering)

    Returns:
        Ranked restaurant list with scores
    """
    results = []
    for restaurant in restaurant_list:
        score = 0
        # Base score: Collaborative filtering confidence (0-100)
        score += restaurant['cf_confidence']

        # Contextual adjustments
        # Time of day preference (lunch: quick delivery, dinner: quality)
        if 12 <= user_context['hour'] <= 14:  # Lunch
            if restaurant['avg_delivery_time'] <= 30:
                score += 20  # Prioritize fast delivery
        elif 19 <= user_context['hour'] <= 22:  # Dinner
            if restaurant['avg_rating'] >= 4.0:
                score += 20  # Prioritize high-rated

        # Weather context (rainy day: comfort food)
        if user_context['weather'] == 'rainy':
            if restaurant['cuisine'] in ['North Indian', 'Chinese', 'Italian']:
                score += 15  # Comfort food cuisines

        # Price sensitivity (Friday/weekend: less price-sensitive)
        if user_context['day_of_week'] in ['Friday', 'Saturday', 'Sunday']:
            score += 5  # All restaurants benefit (users order more expensive on weekends)
        else:
            if restaurant['price_for_two'] <= 400:
                score += 10  # Prioritize budget options on weekdays

        # Distance penalty (farther = longer delivery = lower score)
        distance_km = restaurant['distance_km']
        if distance_km <= 2:
            score += 10
        elif distance_km <= 4:
            score += 5
        else:
            score -= 5  # Penalize distant restaurants

        results.append({
            'restaurant_id': restaurant['restaurant_id'],
            'name': restaurant['name'],
            'final_score': score,
            'avg_rating': restaurant['avg_rating'],
            'delivery_time': restaurant['avg_delivery_time'],
            'price_for_two': restaurant['price_for_two']
        })

    # Sort by score descending
    results_sorted = sorted(results, key=lambda x: x['final_score'], reverse=True)
    return results_sorted

# Example usage
user_context = {
    'hour': 13,
    'day_of_week': 'Wednesday',
    'weather': 'clear'
}
candidate_restaurants = [
    {'restaurant_id': 'R001', 'name': 'Punjab Grill', 'cf_confidence': 85, 'cuisine': 'North Indian',
     'avg_rating': 4.2, 'avg_delivery_time': 35, 'price_for_two': 800, 'distance_km': 3.2},
    {'restaurant_id': 'R002', 'name': 'Chinese Wok', 'cf_confidence': 75, 'cuisine': 'Chinese',
     'avg_rating': 3.9, 'avg_delivery_time': 25, 'price_for_two': 350, 'distance_km': 1.8},
    {'restaurant_id': 'R003', 'name': 'Wow! Momo', 'cf_confidence': 70, 'cuisine': 'Tibetan',
     'avg_rating': 4.0, 'avg_delivery_time': 20, 'price_for_two': 300, 'distance_km': 2.5}
]
ranked = rank_restaurants('U12345', user_context, candidate_restaurants)
print("Ranked Restaurants for Lunch (Wednesday):")
for i, r in enumerate(ranked, 1):
    print(f"{i}. {r['name']} (Score: {r['final_score']}, Rating: {r['avg_rating']}, "
          f"Delivery: {r['delivery_time']}min, Price: ₹{r['price_for_two']})")
# Output:
# 1. Chinese Wok (Score: 115, Rating: 3.9, Delivery: 25min, Price: ₹350)
# 2. Wow! Momo (Score: 105, Rating: 4.0, Delivery: 20min, Price: ₹300)
# 3. Punjab Grill (Score: 90, Rating: 4.2, Delivery: 35min, Price: ₹800)
#
# Reason: Lunchtime prioritizes fast delivery + budget-friendly (Chinese Wok, Wow! Momo win)
# Punjab Grill ranked lower despite higher CF confidence because slower delivery + expensive

Result: Personalized ranking increased order conversion from 8% → 12% (+50% lift)
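The `cf_confidence` input to the ranking function comes from the co-occurrence logic in the SQL query above: of the users who ordered from restaurant A, what share also ordered from restaurant B. The sketch below reproduces that metric on a handful of hypothetical orders so the numbers can be checked by hand. The toy data and the function name `cooccurrence_confidence` are illustrative, and the SQL query's minimum-support threshold (at least 20 users) is left out because the sample is tiny.

```python
from collections import defaultdict
from itertools import permutations

# Hypothetical toy data: (user_id, restaurant_id) pairs
orders = [
    ("U1", "R001"), ("U1", "R045"),
    ("U2", "R001"), ("U2", "R045"), ("U2", "R122"),
    ("U3", "R001"), ("U3", "R122"),
    ("U4", "R045"),
]


def cooccurrence_confidence(order_pairs):
    """confidence[(a, b)] = % of users who ordered from a that also ordered from b."""
    restaurants_by_user = defaultdict(set)
    for user_id, restaurant_id in order_pairs:
        restaurants_by_user[user_id].add(restaurant_id)

    users_per_restaurant = defaultdict(int)
    users_ordered_both = defaultdict(int)
    for restaurants in restaurants_by_user.values():
        for r in restaurants:
            users_per_restaurant[r] += 1
        # Every ordered pair (a, b) with a != b that this user ordered from
        for a, b in permutations(restaurants, 2):
            users_ordered_both[(a, b)] += 1

    return {
        pair: 100.0 * count / users_per_restaurant[pair[0]]
        for pair, count in users_ordered_both.items()
    }


confidence = cooccurrence_confidence(orders)
# Share of R001 customers who also ordered from R045
print(round(confidence[("R001", "R045")], 1))
```

Three users ordered from R001 and two of them also ordered from R045, so the confidence is about 66.7%. The metric is asymmetric: every R122 customer also ordered from R001 (100%), while only two of three R001 customers ordered from R122.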
2. Churn Prediction & Win-Back Campaigns
SQL: Identify at-risk users (no order in 30 days)
-- Cohort analysis: Users who ordered in Jan 2026, retention over next 3 months
WITH jan_cohort AS (
    SELECT DISTINCT user_id
    FROM orders
    -- Half-open range so orders late on Jan 31 are included if order_date is a timestamp
    WHERE order_date >= '2026-01-01'
      AND order_date < '2026-02-01'
),
monthly_activity AS (
    SELECT
        jc.user_id,
        DATE_TRUNC('month', o.order_date) AS order_month,
        COUNT(o.order_id) AS orders_count
    FROM jan_cohort jc
    LEFT JOIN orders o ON jc.user_id = o.user_id
        AND o.order_date >= '2026-01-01'
        AND o.order_date < '2026-05-01'
    GROUP BY jc.user_id, DATE_TRUNC('month', o.order_date)
)
SELECT
    order_month,
    COUNT(DISTINCT user_id) AS active_users,
    SUM(orders_count) AS total_orders,
    SUM(orders_count) * 1.0 / COUNT(DISTINCT user_id) AS avg_orders_per_user
FROM monthly_activity
GROUP BY order_month
ORDER BY order_month;

-- Churn prediction: Users likely to churn (ML feature engineering)
SELECT
    u.user_id,
    u.email,
    u.signup_date,
    CURRENT_DATE - MAX(o.order_date) AS days_since_last_order,
    COUNT(o.order_id) AS total_orders,
    AVG(o.order_value) AS avg_order_value,
    AVG(o.delivery_rating) AS avg_delivery_rating,
    -- Churn risk flag
    CASE
        WHEN CURRENT_DATE - MAX(o.order_date) > 30 AND COUNT(o.order_id) >= 3 THEN 'HIGH_RISK'
        WHEN CURRENT_DATE - MAX(o.order_date) > 45 THEN 'MEDIUM_RISK'
        ELSE 'ACTIVE'
    END AS churn_risk_segment
FROM users u
LEFT JOIN orders o ON u.user_id = o.user_id
GROUP BY u.user_id, u.email, u.signup_date
HAVING COUNT(o.order_id) > 0 -- Exclude never-ordered users
ORDER BY days_since_last_order DESC;

Python: Personalized win-back offer
# Win-back campaign: Personalized discount based on user value
def generate_winback_offer(user_segment, user_ltv):
    """
    Generate personalized discount offer to win back churned users

    Args:
        user_segment: 'HIGH_RISK', 'MEDIUM_RISK', 'ACTIVE'
        user_ltv: Lifetime value (total spent) ₹

    Returns:
        Discount offer dictionary, or None for segments with no offer (e.g. 'ACTIVE')
    """
    offers = {
        'HIGH_RISK': {
            'ltv_0_1000': {'discount': 100, 'min_order': 199, 'message': 'We miss you! ₹100 OFF on your next order'},
            'ltv_1000_5000': {'discount': 150, 'min_order': 299, 'message': 'Come back! ₹150 OFF on ₹299+'},
            'ltv_5000_plus': {'discount': 250, 'min_order': 499, 'message': 'Special offer! ₹250 OFF on ₹499+'}
        },
        'MEDIUM_RISK': {
            'ltv_0_1000': {'discount': 50, 'min_order': 199, 'message': '₹50 OFF your next order'},
            'ltv_1000_5000': {'discount': 75, 'min_order': 249, 'message': '₹75 OFF on ₹249+'},
            'ltv_5000_plus': {'discount': 100, 'min_order': 299, 'message': '₹100 OFF on ₹299+'}
        }
    }
    # Active users are not at risk, so they get no win-back offer
    if user_segment not in offers:
        return None

    # Determine LTV bucket
    if user_ltv < 1000:
        ltv_bucket = 'ltv_0_1000'
    elif user_ltv < 5000:
        ltv_bucket = 'ltv_1000_5000'
    else:
        ltv_bucket = 'ltv_5000_plus'

    # Return personalized offer
    return offers[user_segment][ltv_bucket]

# Example
offer = generate_winback_offer('HIGH_RISK', 3500)
print(offer)
# Output: {'discount': 150, 'min_order': 299, 'message': 'Come back! ₹150 OFF on ₹299+'}

Result: Win-back campaigns with personalized offers recovered 18% of at-risk users (vs 5% with generic offers)
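The segment passed into the win-back function comes from the CASE expression in the churn query above. For cases where the segmentation needs to run in application code rather than in the warehouse, the same rules can be mirrored in Python. This is a sketch under that assumption: the function name `classify_churn_risk` and the example dates are hypothetical, while the thresholds (30 days with 3+ orders, 45 days otherwise) are copied from the SQL.

```python
from datetime import date


def classify_churn_risk(last_order_date, total_orders, today):
    """Mirror the SQL CASE expression: 'HIGH_RISK', 'MEDIUM_RISK' or 'ACTIVE'."""
    days_since_last_order = (today - last_order_date).days
    if days_since_last_order > 30 and total_orders >= 3:
        return "HIGH_RISK"      # previously regular user who has gone quiet
    if days_since_last_order > 45:
        return "MEDIUM_RISK"    # occasional user, inactive for longer
    return "ACTIVE"


today = date(2026, 4, 30)  # hypothetical run date
print(classify_churn_risk(date(2026, 3, 20), total_orders=5, today=today))  # 41 days inactive
print(classify_churn_risk(date(2026, 3, 1), total_orders=1, today=today))   # 60 days inactive
print(classify_churn_risk(date(2026, 4, 20), total_orders=2, today=today))  # 10 days inactive
```

Order matters here, as it does in the SQL: a user with 5 orders who has been inactive for 60 days matches the first rule and is labeled HIGH_RISK, not MEDIUM_RISK.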
Key Results & Impact
1. Restaurant Discovery Improvements
Before personalization (generic ranking):
- Order conversion rate: 8% (92% of users browsed without ordering)
- Avg time to order: 12 minutes (high friction)
- New user activation: 35% (placed first order within 7 days)
After personalization (collaborative filtering + contextual ranking):
- Order conversion rate: 12% (+50% lift)
- Avg time to order: 8 minutes (33% faster)
- New user activation: 48% (+37% improvement)
Revenue impact: ₹800+ crore additional GMV from improved discovery
2. Delivery Time Prediction Accuracy
Metric improvements:
- ETA accuracy: ±3 minutes (vs ±8 minutes with fixed ETAs)
- Late delivery rate: 8% (down from 20%)
- Refund costs: ₹10 crore/year (down from ₹25 crore)
- Customer satisfaction: 4.2/5 (up from 3.6/5)
3. Customer Retention & LTV
Cohort analysis results (Jan 2026 cohort):
| Month | Active Users | Retention % | Avg Orders/User | Revenue/User |
|-------|--------------|-------------|-----------------|--------------|
| Jan (M0) | 100,000 | 100% | 1.0 | ₹350 |
| Feb (M1) | 32,000 | 32% | 1.5 | ₹525 |
| Mar (M2) | 22,000 | 22% | 2.1 | ₹735 |
| Apr (M3) | 18,000 | 18% | 2.5 | ₹875 |
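The Retention % column is each month's active users divided by the size of the cohort in its first month (M0). A minimal sketch of that calculation, using the active-user counts from the table (the function name `retention_pct` is illustrative):

```python
cohort_active_users = {
    "Jan (M0)": 100_000,
    "Feb (M1)": 32_000,
    "Mar (M2)": 22_000,
    "Apr (M3)": 18_000,
}


def retention_pct(active_by_month):
    """Retention % for each month relative to the first (M0) entry."""
    months = list(active_by_month)
    cohort_size = active_by_month[months[0]]
    return {m: round(active_by_month[m] / cohort_size * 100) for m in months}


retention = retention_pct(cohort_active_users)
for month, pct in retention.items():
    print(f"{month}: {pct}%")
```

Because the denominator is always the M0 cohort size, the curve flattens as it goes: the cohort loses 68 points in the first month but only 4 points between M2 and M3.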
Impact of win-back campaigns:
- Without campaigns: Month 1 retention = 25%
- With personalized offers: Month 1 retention = 32% (+28% lift)
- Recovered users: 7,000 per 100K cohort × ₹500 avg LTV = ₹35 lakh per cohort
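The lift and recovered-value figures above follow from three inputs: the two retention rates, the cohort size, and the average LTV. The sketch below spells out the arithmetic, including the rupee-to-lakh conversion (1 lakh = 100,000); the variable names are illustrative.

```python
COHORT_SIZE = 100_000
AVG_LTV_RUPEES = 500
RUPEES_PER_LAKH = 100_000

baseline_retention = 0.25   # Month 1 retention without campaigns
campaign_retention = 0.32   # Month 1 retention with personalized offers

# Relative lift compares the gain to the baseline, not to the whole cohort
relative_lift_pct = (campaign_retention - baseline_retention) / baseline_retention * 100
recovered_users = round((campaign_retention - baseline_retention) * COHORT_SIZE)
recovered_value_lakh = recovered_users * AVG_LTV_RUPEES / RUPEES_PER_LAKH

print(f"Lift: +{relative_lift_pct:.0f}%")
print(f"Recovered users: {recovered_users:,}")
print(f"Recovered value: ₹{recovered_value_lakh:.0f} lakh")
```

A 7-point absolute gain in retention is a 28% relative lift because the baseline is only 25%.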
Zomato Gold impact: 5M+ paid members (₹149/month subscription). Members order 3× more frequently than non-members (4.5 orders/month vs 1.5). Gold membership drives ₹2,000+ crore annual GMV (25% of total revenue).
What You Can Learn from Zomato
1. Context Matters as Much as Patterns
Key insight: Collaborative filtering finds patterns (what users generally like), but context determines relevance (what users want RIGHT NOW).
How to apply this:
- When building recommendation systems, always add contextual features:
- Time of day (breakfast vs dinner preferences)
- Day of week (weekday budget vs weekend splurge)
- Weather (rainy day comfort food vs sunny day salads)
- Location (home vs office vs traveling)
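The contextual features listed above can be derived from data most apps already log. The sketch below shows one way to turn a timestamp, a weather label, and a location label into model-ready features. Everything in it is an assumption made for illustration: the function name `contextual_features`, the meal-period hour cut-offs, and the category labels.

```python
from datetime import datetime


def contextual_features(order_time, weather, location_type):
    """Turn raw context into model-ready features (all category names are illustrative)."""
    hour = order_time.hour
    # Assumed cut-offs; tune these to the ordering patterns in your own data
    if 6 <= hour < 11:
        meal_period = "breakfast"
    elif 11 <= hour < 16:
        meal_period = "lunch"
    elif 18 <= hour < 23:
        meal_period = "dinner"
    else:
        meal_period = "snack_or_late_night"

    return {
        "meal_period": meal_period,
        "is_weekend": order_time.weekday() >= 5,   # Saturday = 5, Sunday = 6
        "is_rainy": weather == "rainy",
        "location_type": location_type,            # e.g. "home", "office", "traveling"
    }


# A Wednesday lunchtime order placed from the office
features = contextual_features(datetime(2026, 1, 14, 13, 15), "clear", "office")
print(features)
```

These features can be fed to a scoring function alongside the collaborative filtering score, or used as inputs to a learned ranking model.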
Portfolio project idea: "Food delivery recommendation system with collaborative filtering + contextual ranking using Zomato/Swiggy public data"
2. Retention > Acquisition (Fix Churn Before Scaling Ads)
Key insight: Zomato's biggest problem isn't getting new users (app installs are cheap) — it's keeping them (75% churn after first order).
The math:
# Scenario A: Focus on acquisition (no retention fix)
new_users_per_month = 100000
month_1_retention = 0.25 # 75% churn
month_3_retention = 0.15
cac = 150 # Cost to acquire one user (ads)
ltv = 500 # Lifetime value per user
total_cost = new_users_per_month * cac # ₹1.5 crore
total_revenue = new_users_per_month * month_3_retention * ltv # ₹75 lakh
roi = (total_revenue - total_cost) / total_cost * 100 # -50% (losing money!)
# Scenario B: Fix retention first, then scale acquisition
new_users_per_month = 100000
month_1_retention = 0.40 # Improved from 25% → 40% (win-back campaigns)
month_3_retention = 0.25 # Improved from 15% → 25%
cac = 150
ltv = 800 # Higher LTV (retained users order more)
total_cost = new_users_per_month * cac # ₹1.5 crore
total_revenue = new_users_per_month * month_3_retention * ltv # ₹2 crore
roi = (total_revenue - total_cost) / total_cost * 100 # +33% (profitable!)

Lesson: Fix the leaky bucket (churn) before pouring more water (acquisition). Use cohort analysis to measure retention.
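Both scenarios reduce to one formula, which also gives the break-even point: under this simplified model, acquisition pays for itself only when the retention rate exceeds CAC divided by LTV. The sketch below wraps the scenario math in two functions. The function names are illustrative, and the model keeps the same simplification as the scenarios above (revenue is counted only from users retained at month 3).

```python
def acquisition_roi_pct(new_users, retention, cac, ltv):
    """ROI % of an acquisition campaign, counting revenue only from retained users."""
    total_cost = new_users * cac
    total_revenue = new_users * retention * ltv
    return (total_revenue - total_cost) / total_cost * 100


def break_even_retention(cac, ltv):
    """Retention rate at which revenue from retained users exactly covers acquisition cost."""
    return cac / ltv


print(f"Scenario A ROI: {acquisition_roi_pct(100_000, 0.15, 150, 500):.0f}%")
print(f"Scenario B ROI: {acquisition_roi_pct(100_000, 0.25, 150, 800):.0f}%")
print(f"Break-even retention at ₹500 LTV: {break_even_retention(150, 500):.0%}")
print(f"Break-even retention at ₹800 LTV: {break_even_retention(150, 800):.2%}")
```

At a ₹500 LTV the break-even retention is 30%, double Scenario A's 15%. Raising LTV to ₹800 lowers the bar to 18.75%, which Scenario B's 25% retention clears. Note that user count cancels out of the ROI, so scaling acquisition does not fix a negative ROI.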
3. Personalization Works at All Stages (Not Just Recommendations)
Key insight: Zomato personalizes everything — recommendations, offers, notifications, email subject lines.
Examples:
- Recommendations: Contextual ranking (lunch vs dinner)
- Offers: Win-back discounts based on LTV (high-value users get better offers)
- Notifications: Sent at user's typical order time (12:30 PM lunch, 8 PM dinner)
- Email subject lines: A/B tested ("Order your favorite biryani" vs "10% OFF today")
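Sending notifications at a user's typical order time requires estimating that time from order history. One simple approach, sketched below, is to take the most frequent order hours. The function name `typical_order_hours`, the "HH:MM" input format, and the sample history are assumptions for illustration, not a description of Zomato's scheduler.

```python
from collections import Counter


def typical_order_hours(order_times, top_n=2):
    """Return the user's most frequent order hours (24h clock), most common first."""
    hours = [int(t.split(":")[0]) for t in order_times]
    return [hour for hour, _ in Counter(hours).most_common(top_n)]


# Hypothetical order timestamps ("HH:MM") for one user
history = ["13:15", "20:30", "21:00", "13:05", "20:45", "13:40"]
print(typical_order_hours(history))
```

For this user the result is `[13, 20]`, so a lunch reminder shortly before 1 PM and a dinner reminder shortly before 8 PM would match their habits better than a fixed schedule.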
How to apply this to job search:
- Generic cover letter: "I'm a data analyst with SQL and Python skills" → Recruiter thinks: "Like 100 other applicants"
- Personalized cover letter: "I noticed Zomato is hiring for a Growth Analyst role focused on retention. I built a churn prediction model using cohort analysis and win-back campaigns (see portfolio project), which aligns with your need for reducing Month 1 churn from 30% → 20%." → Recruiter thinks: "This person understands our problem!"
The best analysts personalize everything — just like Zomato personalizes for each user.