Swiggy Data Analytics: How India's Food Delivery Giant Uses Data

Swiggy delivers 1.5 million orders daily across 600+ cities. Every delivery time promise, every restaurant recommendation, every surge price is backed by real-time analytics running on petabytes of location, weather, and transaction data.

🏢 Swiggy: Company Context

Swiggy is India's largest food delivery platform, founded in 2014 by Sriharsha Majety, Nandan Reddy, and Rahul Jaimini. Operating in 600+ cities, Swiggy has transformed how Indians order food.

Key Metrics (2026)

  • 1.5 million orders/day (550+ million annually)
  • 350,000+ restaurant partners
  • 300,000+ delivery executives
  • 600+ cities covered
  • 30-minute average delivery time
  • ₹8,000+ crore annual revenue

Data Infrastructure

Swiggy's analytics runs on:

  • Geospatial database: Real-time location tracking of 300K delivery partners
  • Streaming pipeline: Apache Kafka processing 50K events/second (orders, GPS pings, restaurant updates)
  • ML platform: Demand forecasting, ETA prediction, dynamic pricing models
  • Time-series database: Historical order patterns, weather data, traffic conditions
  • A/B testing framework: 100+ live experiments on pricing, UI, recommendations
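
Swiggy's internal consumer code is not public, but the per-event work a streaming pipeline does can be sketched without a broker. Below, `handle_gps_ping` is a hypothetical stand-in for a Kafka consumer's handler, using the GPS-ping fields from the sample event shown later in this lesson:

```python
import json

# In-memory view of partner state, keyed by partner_id.
# In production this would live in a low-latency store such as Redis.
partner_state = {}

def handle_gps_ping(raw_event: str) -> dict:
    """Apply one GPS ping event to the in-memory partner state."""
    event = json.loads(raw_event)
    state = partner_state.setdefault(event["partner_id"], {})
    state.update({
        "lat": event["lat"],
        "lon": event["lon"],
        "status": event["status"],
        "last_seen": event["timestamp"],
    })
    return state

ping = ('{"partner_id": "DP12345", "lat": 12.9716, "lon": 77.5946, '
        '"status": "idle", "timestamp": "2026-03-24 19:45:23"}')
print(handle_gps_ping(ping)["status"])  # idle
```

At 50K events/second, the real handler must stay allocation-light and idempotent; the sketch only shows the shape of the state update.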

Analytics Team Structure

  • Delivery Analytics: Route optimization, ETA prediction, fleet management
  • Demand Analytics: Order forecasting, restaurant capacity planning, surge pricing
  • Growth Analytics: Customer acquisition, retention, LTV modeling
  • Operations Analytics: Restaurant onboarding, quality control, fraud detection
  • Product Analytics: App usage funnels, feature adoption, personalization

Think of it this way...

Swiggy's analytics system is like air traffic control for food delivery — tracking 300,000 delivery partners in real-time, predicting where demand will spike in the next 30 minutes, and dynamically rerouting orders to minimize delivery time. Every second of delay costs money; every optimization saves lakhs.

🎯 The Business Problem

Swiggy faces three critical analytics challenges:

1. Delivery Time Optimization

Problem: Late deliveries = refunds + bad reviews + customer churn.

Challenge:

  • Traffic variability: Same route takes 10 min (midnight) vs 35 min (rush hour)
  • Weather impact: Rain increases delivery time by 40%+
  • Restaurant delays: Food not ready when delivery partner arrives
  • Distance vs speed trade-off: Assign nearest delivery partner or fastest available?

Traditional approach: Fixed 30-min delivery promise for all orders → Result: 25% late deliveries (refunds cost ₹50 crore/year)

Data-driven approach: Dynamic ETA prediction using ML → Result: 8% late deliveries (67% reduction in refund costs)
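
The nearest-vs-fastest trade-off above can be made concrete: the closest partner may still be finishing another drop, so ranking by estimated pickup time often beats ranking by raw distance. A toy sketch (speeds, distances, and the scoring itself are illustrative assumptions, not Swiggy's actual logic):

```python
def pickup_eta_minutes(partner: dict, avg_speed_kmph: float = 20.0) -> float:
    """Estimated time for a partner to reach the restaurant:
    travel time plus any time left on their current drop-off."""
    travel = partner["distance_km"] / avg_speed_kmph * 60
    return travel + partner["busy_for_min"]

partners = [
    {"id": "DP1", "distance_km": 0.5, "busy_for_min": 12},  # nearest, but busy
    {"id": "DP2", "distance_km": 2.0, "busy_for_min": 0},   # farther, but idle
]

best = min(partners, key=pickup_eta_minutes)
print(best["id"])  # DP2: a 6-min pickup ETA beats DP1's 13.5 min
```

The "fastest available" policy wins here even though DP1 is four times closer.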


2. Demand Forecasting for Peak Hours

Problem: Too few delivery partners = long wait times; too many = idle partners (wasted cost).

Challenge:

  • Lunch rush: 12-2 PM sees 3× normal demand
  • Dinner peak: 7-10 PM sees 5× normal demand
  • Weekend spikes: Saturday dinner 2× higher than weekday
  • Event-driven surges: India vs Pakistan cricket match = 10× spike in specific zones
  • Weather dependency: Rainy days see 60% more orders (people avoid going out)

Traditional approach: Fixed fleet size throughout the day → Result: 15-min wait times during peak + 40% idle capacity during off-peak

Data-driven approach: Hourly demand forecasting with dynamic fleet allocation → Result: 5-min average wait time + 15% idle capacity (60% cost savings)
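
Once an hourly forecast exists, the fleet-sizing arithmetic is simple: divide forecast orders by per-partner capacity (the 2 orders/hour capacity figure appears in Swiggy's own partner-sizing query later in this lesson). A sketch with illustrative zone-level numbers:

```python
import math

def required_partners(forecast_orders_per_hour: int, capacity: int = 2) -> int:
    """Partners needed if each can handle `capacity` orders per hour."""
    return math.ceil(forecast_orders_per_hour / capacity)

# Illustrative hourly forecasts: off-peak, 3x lunch rush, 5x dinner peak
for hour, orders in [(10, 110), (13, 330), (20, 550)]:
    print(hour, required_partners(orders))
# 10 55
# 13 165
# 20 275
```

The real problem adds repositioning time and partner incentives on top, but the core ratio drives the 180K-off-peak vs 300K-peak allocation described later.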


3. Restaurant Recommendations

Problem: Users spend 8-10 minutes browsing before ordering (high friction).

Typical user journey:

  • App Open: 100K users
  • Search/Browse: 85K (15% drop-off)
  • Restaurant Page: 35K (59% drop-off)
  • Menu: 28K (20% drop-off)
  • Checkout: 22K (21% drop-off)
  • Payment: 20K (9% drop-off)

Key insights from analytics:

  1. Paradox of choice: Showing 100 restaurants overwhelms users (60% exit without ordering)
  2. Search mismatch: Users search "biryani" but see irrelevant results (Chinese, pizza)
  3. Price sensitivity: 70% of users filter by "under ₹300" but default shows ₹500+ restaurants first
  4. Delivery time: 45% prefer "fastest delivery" over "best rated"

Data-driven solutions: Personalized restaurant ranking using collaborative filtering + contextual factors (time of day, past orders, weather).
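
A sketch of how a precomputed collaborative-filtering affinity score might be blended with contextual factors into one ranking (the weights, feature set, and numbers are illustrative assumptions, not Swiggy's actual model):

```python
def rank_restaurants(candidates, user_ctx):
    """Blend a collaborative-filtering affinity score with contextual
    boosts for delivery speed and budget fit. Weights are illustrative."""
    def score(r):
        s = 0.6 * r["cf_score"]                   # taste affinity, 0..1
        s += 0.3 * (1 - r["eta_min"] / 60)        # faster delivery ranks higher
        if r["avg_price"] <= user_ctx["budget"]:  # respect the user's price filter
            s += 0.1
        return s
    return sorted(candidates, key=score, reverse=True)

candidates = [
    {"name": "Biryani House", "cf_score": 0.9, "eta_min": 45, "avg_price": 350},
    {"name": "Quick Bites",   "cf_score": 0.6, "eta_min": 20, "avg_price": 250},
]
ranked = rank_restaurants(candidates, {"budget": 300})
print(ranked[0]["name"])  # Quick Bites: speed + budget fit outweigh affinity
```

Note how the contextual terms encode the funnel insights above: delivery speed and price fit can outrank pure taste affinity.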

Info

Scale context: Reducing average browse time by 1 minute = 500,000 hours saved daily for users + 5% higher conversion (₹400 crore additional revenue/year).

🔬 Data They Used & Analytics Approach

1. Delivery Optimization: Geospatial Analytics

Data sources:

```python
# Real-time delivery partner location (GPS pings every 10 seconds)
{
  "partner_id": "DP12345",
  "lat": 12.9716,
  "lon": 77.5946,
  "timestamp": "2026-03-24 19:45:23",
  "status": "idle",  # idle | en_route_to_restaurant | picked_up | delivering
  "current_order": None
}

# Historical delivery data
{
  "order_id": "O987654",
  "restaurant_lat_lon": (12.9352, 77.6245),
  "customer_lat_lon": (12.9698, 77.6450),
  "actual_delivery_time": 28,  # minutes
  "predicted_delivery_time": 25,
  "traffic_level": "medium",
  "weather": "clear",
  "hour_of_day": 19,
  "day_of_week": "Friday"
}
```

Analytics approach: ETA Prediction Model

Swiggy uses a gradient boosting model (XGBoost) to predict delivery time:

```python
import xgboost as xgb
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split

# Feature engineering
def extract_features(order_data):
    """Convert raw order data to ML features"""
    features = pd.DataFrame()

    # Distance features
    features['haversine_distance'] = calculate_distance(
        order_data['restaurant_lat_lon'],
        order_data['customer_lat_lon']
    )

    # Time features
    features['hour'] = order_data['timestamp'].dt.hour
    features['day_of_week'] = order_data['timestamp'].dt.dayofweek
    features['is_weekend'] = (features['day_of_week'] >= 5).astype(int)
    features['is_peak_hour'] = ((features['hour'] >= 12) & (features['hour'] <= 14) |
                                (features['hour'] >= 19) & (features['hour'] <= 21)).astype(int)

    # Weather features
    features['is_raining'] = (order_data['weather'] == 'rain').astype(int)
    features['temperature'] = order_data['temperature']

    # Traffic features (from Google Maps API or custom traffic model)
    features['traffic_level'] = order_data['traffic_level'].map({'low': 1, 'medium': 2, 'high': 3})

    # Historical features (restaurant average prep time)
    features['avg_restaurant_prep_time'] = order_data['restaurant_id'].map(
        historical_prep_times  # precomputed lookup table
    )

    return features

# Train model
X = extract_features(historical_orders)
y = historical_orders['actual_delivery_time']

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

model = xgb.XGBRegressor(
    n_estimators=200,
    max_depth=6,
    learning_rate=0.1,
    objective='reg:squarederror'
)

model.fit(X_train, y_train)

# Predict ETA for held-out orders
predicted_eta = model.predict(X_test)
print(f"Mean Absolute Error: {np.mean(np.abs(y_test - predicted_eta)):.1f} minutes")
# Output: Mean Absolute Error: 2.3 minutes (industry benchmark: 3-4 minutes)
```
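
The `calculate_distance` helper above is left undefined; the standard choice for straight-line distance between coordinates is the Haversine (great-circle) formula, which a later section also recommends learning. A minimal implementation:

```python
import math

def haversine_km(a: tuple, b: tuple) -> float:
    """Great-circle distance in km between two (lat, lon) pairs in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    dlat, dlon = lat2 - lat1, lon2 - lon1
    h = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * 6371 * math.asin(math.sqrt(h))  # Earth radius ~6371 km

# Restaurant -> customer pair from the sample order above: ~4.44 km
print(round(haversine_km((12.9352, 77.6245), (12.9698, 77.6450)), 2))
```

Straight-line distance is a lower bound; road distance and traffic are what the other features correct for.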

Real-world impact:

  • Prediction accuracy: ±2 minutes for 80% of orders
  • Assignment optimization: Reduced average delivery time from 35 min → 28 min
  • Customer satisfaction: Late delivery rate dropped from 25% → 8%

2. Demand Forecasting: Time-Series Analysis

SQL: Analyzing historical demand patterns

```sql
-- Hourly order volume by zone (used for forecasting)
WITH hourly_orders AS (
  SELECT
    zone_id,
    zone_name,
    DATE_TRUNC('hour', order_time) as hour,
    DATE_PART('dow', order_time) as day_of_week,  -- 0=Sunday, 6=Saturday
    DATE_PART('hour', order_time) as hour_of_day,
    COUNT(*) as order_count,
    COUNT(DISTINCT customer_id) as unique_customers,
    AVG(order_value) as avg_order_value
  FROM orders
  WHERE order_time >= CURRENT_DATE - INTERVAL '90 days'
  GROUP BY 1,2,3,4,5
)

SELECT
  zone_name,
  hour_of_day,
  day_of_week,
  AVG(order_count) as avg_orders,
  STDDEV(order_count) as stddev_orders,
  MAX(order_count) as peak_orders,
  -- Predict required delivery partners (1 partner can handle 2 orders/hour)
  CEIL(AVG(order_count) / 2.0) as required_partners
FROM hourly_orders
WHERE day_of_week IN (0, 6)  -- Weekends only
GROUP BY 1,2,3
ORDER BY zone_name, day_of_week, hour_of_day;
```

Python: Demand forecasting with Prophet

```python
from prophet import Prophet
import pandas as pd

# Load historical order data
df = pd.read_sql("""
  SELECT
    DATE_TRUNC('hour', order_time) as ds,
    COUNT(*) as y
  FROM orders
  WHERE zone_id = 'BLR_KORAMANGALA'
    AND order_time >= CURRENT_DATE - INTERVAL '180 days'
  GROUP BY 1
  ORDER BY 1
""", connection)

# Add external regressors (weather, holidays)
df['is_raining'] = df['ds'].map(weather_data)  # 1 if raining, 0 otherwise
df['is_cricket_match'] = df['ds'].map(cricket_schedule)  # 1 if India match, 0 otherwise

# Train forecasting model
model = Prophet(
    yearly_seasonality=True,
    weekly_seasonality=True,
    daily_seasonality=True
)
model.add_regressor('is_raining')
model.add_regressor('is_cricket_match')
model.fit(df)

# Forecast next 7 days
future = model.make_future_dataframe(periods=24*7, freq='H')  # Hourly for next 7 days
future['is_raining'] = get_weather_forecast(future['ds'])  # From weather API
future['is_cricket_match'] = get_cricket_schedule(future['ds'])

forecast = model.predict(future)
print(forecast[['ds', 'yhat', 'yhat_lower', 'yhat_upper']].tail(24))  # Next 24 hours
```

Business impact:

  • Fleet optimization: Reduced idle time by 60% (saves ₹200 crore/year in partner payouts)
  • Customer wait time: Reduced from 15 min → 5 min during peak hours
  • Surge pricing accuracy: Predicts demand spikes 2 hours in advance (enables dynamic pricing)
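
The same forecast feeds pricing as well as staffing. A sketch of how a demand/capacity ratio might map into the 1.2x-1.5x surge band described in the results section (the mapping function itself is an illustrative assumption, not Swiggy's actual pricing rule):

```python
def surge_multiplier(forecast_orders: int, available_partners: int,
                     capacity: int = 2, lo: float = 1.0, hi: float = 1.5) -> float:
    """Map forecast demand vs fleet capacity to a bounded price multiplier."""
    ratio = forecast_orders / (available_partners * capacity)
    if ratio <= 1.0:
        return lo  # supply covers demand: no surge
    return round(min(hi, lo + 0.5 * (ratio - 1.0)), 2)  # ramp up, capped at hi

print(surge_multiplier(400, 250))   # 1.0  (ratio 0.8: no surge)
print(surge_multiplier(700, 250))   # 1.2  (ratio 1.4: moderate surge)
print(surge_multiplier(1500, 250))  # 1.5  (ratio 3.0: capped)
```

Bounding the multiplier matters: an uncapped surge maximizes short-term revenue but drives churn, which is why the band stays narrow.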

📈 Key Results & Impact

1. Delivery Time Reduction

Before analytics:

  • Average delivery time: 35 minutes
  • Late delivery rate: 25%
  • Refund costs: ₹50 crore/year
  • Customer satisfaction (CSAT): 3.2/5

After analytics:

  • Average delivery time: 28 minutes (20% improvement)
  • Late delivery rate: 8% (67% reduction)
  • Refund costs: ₹16 crore/year (68% savings)
  • Customer satisfaction (CSAT): 4.1/5 (28% improvement)

2. Fleet Optimization

Before analytics:

  • Fixed fleet size: 250K delivery partners active throughout the day
  • Idle time: 40% (10-11 AM, 3-6 PM)
  • Peak-hour wait time: 15 minutes (insufficient partners during lunch/dinner)
  • Annual partner payouts: ₹500 crore (including idle time)

After analytics:

  • Dynamic fleet allocation: 180K partners during off-peak, 300K during peak
  • Idle time: 15% (optimal utilization)
  • Peak-hour wait time: 5 minutes (right-sized fleet)
  • Annual partner payouts: ₹380 crore (24% savings while improving service)

3. Revenue Impact

Personalized restaurant recommendations:

  • Browse time reduced from 8 min → 5 min (faster decision-making)
  • Conversion rate improved from 20% → 25% (5 percentage points)
  • Order frequency increased from 2.5 → 3.2 orders/month/user (better recommendations)
  • Net revenue impact: ₹400 crore additional GMV/year

Demand-based surge pricing:

  • Implemented dynamic pricing during peak hours (1.2× - 1.5× normal price)
  • Reduced customer wait time (incentivizes more partners to come online)
  • Net revenue impact: ₹150 crore additional revenue/year

Info

ROI of analytics team: Swiggy's 200-person analytics team costs ~₹100 crore/year in salaries + infrastructure. Combined savings + revenue impact = ₹700+ crore/year. 7× return on investment.

💡 What You Can Learn from Swiggy

1. Geospatial Analytics is a Superpower for Location-Based Businesses

Key insight: Swiggy's competitive advantage isn't just technology — it's their ability to model the chaotic Indian traffic/weather/restaurant ecosystem with data.

How to apply this:

  • If you work in logistics/delivery: Learn geospatial SQL (PostGIS), distance calculations (Haversine formula), and map visualization (Folium, Mapbox)
  • Build a sample project: Analyze Uber trip data, optimize delivery routes for an imaginary food delivery app, or predict taxi demand by zone
  • Portfolio value: Geospatial skills are rare in India — showcasing a project with maps + route optimization instantly stands out

2. Real-Time Analytics Requires Different Tools Than Batch Analytics

Key insight: Swiggy can't wait 24 hours for a daily report to assign delivery partners — they need second-by-second decision-making.

The two types of analytics:

| Batch Analytics | Real-Time Analytics |
|---------------------|-------------------------|
| SQL query runs overnight, results ready next morning | Query runs in <100ms, results used immediately |
| Tools: SQL, Python, dbt, Airflow | Tools: Kafka, Flink, Redis, Elasticsearch |
| Use case: Monthly revenue report, cohort analysis | Use case: Fraud detection, dynamic pricing, ETA prediction |
| Example: "Which restaurants had highest sales last month?" | Example: "Which delivery partner should I assign to this order right now?" |

How to learn real-time analytics:

  • Start with SQL window functions (running totals, moving averages)
  • Learn event-driven architecture (how Kafka works)
  • Build a streaming project (real-time stock price tracker, live cricket score dashboard)
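
The first item on that list, running totals and moving averages, can be prototyped in pandas before writing the SQL equivalent. A sketch on toy hourly order counts (the numbers are made up):

```python
import pandas as pd

orders = pd.DataFrame({
    "hour": range(6),
    "order_count": [40, 60, 120, 300, 280, 90],
})

# Moving average over a 3-hour window
# SQL: AVG(order_count) OVER (ORDER BY hour ROWS 2 PRECEDING)
orders["ma_3h"] = orders["order_count"].rolling(window=3, min_periods=1).mean()

# Running total
# SQL: SUM(order_count) OVER (ORDER BY hour)
orders["running_total"] = orders["order_count"].cumsum()

print(round(orders["ma_3h"].iloc[-1], 2), int(orders["running_total"].iloc[-1]))
# 223.33 890
```

Once the logic is clear in pandas, translating it to window functions over a real orders table is mostly syntax.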

3. Domain Knowledge > Fancy Algorithms

Key insight: Swiggy's ETA model isn't state-of-the-art AI — it's XGBoost (a 10-year-old algorithm). The magic is in feature engineering:

  • Knowing that rain increases delivery time by 40% (domain knowledge)
  • Knowing that restaurant X always takes 12 minutes to prepare food (historical data)
  • Knowing that Koramangala traffic is 3× worse at 8 PM than 3 PM (local knowledge)

How to build domain knowledge:

  • When analyzing Zomato/Swiggy data, order food yourself and note patterns (delivery time, restaurant ratings, surge pricing)
  • When analyzing e-commerce data, browse Flipkart/Amazon and observe their recommendation logic
  • When building a portfolio project, pick an industry you understand (cricket, food, movies) rather than generic datasets

The best analysts aren't just good at SQL/Python — they understand the business deeply.
