Data Analytics Knowledge Quiz
What This Quiz Covers:
- SQL (15 questions): SELECT, JOINs, GROUP BY, window functions, CTEs
- Python/Pandas (15 questions): Data manipulation, groupby, merge, cleaning
- Statistics (10 questions): Mean/median, correlation, hypothesis testing, distributions
- Excel/BI Tools (5 questions): Pivot tables, VLOOKUP, Power BI DAX
- Business Analytics (5 questions): RFM, cohorts, KPIs, metrics
Scoring:
- 40-50 correct (80-100%): Excellent — ready for interviews
- 35-39 correct (70-79%): Good — review weak areas
- 30-34 correct (60-69%): Fair — more practice needed
- <30 correct (<60%): Keep learning — review fundamentals
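The bands above reduce to a simple threshold lookup. A minimal sketch, assuming the 50-question total listed earlier; the function name `score_band` and the short labels are illustrative, not part of the quiz itself:

```python
def score_band(correct: int, total: int = 50) -> str:
    """Map a raw score to the quiz's four scoring bands."""
    if not 0 <= correct <= total:
        raise ValueError("correct must be between 0 and total")
    percent = 100 * correct / total
    if percent >= 80:
        return "Excellent"
    if percent >= 70:
        return "Good"
    if percent >= 60:
        return "Fair"
    return "Keep learning"


for raw in (42, 36, 31, 25):
    print(raw, score_band(raw))
```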
How to Use This Quiz:
- Take honestly: Don't Google answers (simulates interview pressure)
- Note wrong answers: Review explanations for questions you missed
- Identify patterns: SQL weak? Python strong? Focus study accordingly
- Retake monthly: Track progress over time
Sample Quiz Questions
SQL Questions (Sample):
Q1: What does LEFT JOIN return?
- A) Only matching rows from both tables
- B) All rows from left table + matching from right ✓
- C) All rows from both tables
- D) Only rows where join condition is true
Explanation: LEFT JOIN keeps all rows from left table, adds matching data from right table (NULL if no match). Use for "find all customers and their orders (including customers with 0 orders)".
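To see the "customers with 0 orders" case concretely, here is a small sketch using Python's built-in sqlite3 module. The `customers` and `orders` tables and their rows are made up for illustration:

```python
import sqlite3

# In-memory database with two tiny, made-up tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'Asha'), (2, 'Ben'), (3, 'Chen');
    INSERT INTO orders VALUES (10, 1, 250.0), (11, 1, 100.0), (12, 2, 75.0);
""")

# LEFT JOIN keeps every customer. COUNT(o.order_id) skips the NULLs that
# the join produces for customers with no matching order.
left_join_rows = conn.execute("""
    SELECT c.name, COUNT(o.order_id) AS order_count
    FROM customers AS c
    LEFT JOIN orders AS o ON o.customer_id = c.customer_id
    GROUP BY c.customer_id, c.name
    ORDER BY c.customer_id
""").fetchall()

print(left_join_rows)  # [('Asha', 2), ('Ben', 1), ('Chen', 0)]
```

With an INNER JOIN in place of the LEFT JOIN, Chen would disappear from the result entirely.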
Q2: Which SQL function calculates running total?
- A) SUM() GROUP BY
- B) SUM() OVER (ORDER BY date) ✓
- C) COUNT(*) PARTITION BY
- D) AVG() HAVING
Explanation: Window function SUM() OVER (ORDER BY...) calculates cumulative sum. GROUP BY gives total per group (not running). ORDER BY in window creates cumulative range.
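A running total can be tried out with sqlite3 from Python, assuming the bundled SQLite is version 3.25 or newer (the first release with window functions). The `daily_sales` table and its values are hypothetical:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE daily_sales (sale_date TEXT, amount INTEGER);
    INSERT INTO daily_sales VALUES
        ('2024-01-01', 100), ('2024-01-02', 50), ('2024-01-03', 200);
""")

# SUM(...) OVER (ORDER BY ...) keeps one output row per input row and
# accumulates the amounts in date order.
running_rows = conn.execute("""
    SELECT sale_date,
           amount,
           SUM(amount) OVER (ORDER BY sale_date) AS running_total
    FROM daily_sales
    ORDER BY sale_date
""").fetchall()

for row in running_rows:
    print(row)
# ('2024-01-01', 100, 100)
# ('2024-01-02', 50, 150)
# ('2024-01-03', 200, 350)
```

If several rows share the same date, the default window frame treats them as peers and gives each of them the combined total; adding ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW makes the sum advance strictly row by row.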
Python Questions (Sample):
Q3: Fastest way to multiply two columns?
- A) for loop with iterrows()
- B) df.apply(lambda row: row['a'] * row['b'], axis=1)
- C) df['result'] = df['a'] * df['b'] ✓
- D) SQL query
Explanation: Vectorized operations (df['a'] * df['b']) are typically 10-100× faster than apply or loops, because they run over whole arrays in compiled C code instead of going row by row in Python.
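A short sketch of the two approaches side by side, assuming pandas is installed. The column names `a` and `b` follow the question; the data is made up:

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})

# Vectorized: one operation over the whole columns.
df["result"] = df["a"] * df["b"]

# Row-wise equivalent, shown only for comparison. axis=1 is required so
# the lambda receives rows rather than columns.
row_wise = df.apply(lambda row: row["a"] * row["b"], axis=1)

print(df["result"].tolist())  # [10, 40, 90]
print(row_wise.tolist())      # [10, 40, 90]
```

Both give the same values; the difference only shows up in run time as the DataFrame grows.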
Q4: How to handle 30% missing values in column?
- A) Always fill with 0
- B) Drop column or impute with median, depends on criticality ✓
- C) Ignore and analyze anyway
- D) Fill with mean always
Explanation: 30% missing is borderline. If column is critical (like customer_id), collect more data. If non-critical (like optional field), drop or impute with median. Never auto-fill with 0 (false data).
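One way the median-imputation branch might look in pandas. This is a sketch under assumptions: the `income` column, its values, and the extra flag column are invented for illustration:

```python
import pandas as pd

df = pd.DataFrame({
    "customer_id": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    "income": [40.0, None, 55.0, 60.0, None, 45.0, 500.0, None, 50.0, 65.0],
})

# Share of missing values in the column: 3 of 10 rows.
missing_share = df["income"].isna().mean()

# median() skips NaN by default, and the 500.0 outlier barely moves it.
median_income = df["income"].median()

# Keep a flag so later analysis can tell imputed values from real ones.
df["income_was_missing"] = df["income"].isna()
df["income"] = df["income"].fillna(median_income)

print(missing_share, median_income)  # 0.3 55.0
```

Recording which rows were imputed is a design choice, not a requirement; it keeps the imputation reversible and lets you check whether missingness itself carries signal.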
Statistics Questions (Sample):
Q5: When to use median instead of mean?
- A) When data has outliers or is skewed ✓
- B) When sample size is small
- C) Only for categorical data
- D) Never, mean is always better
Explanation: Median resists outliers. Example: Salaries [₹5L, ₹6L, ₹5.5L, ₹50L] → Mean ₹16.6L (misleading), Median ₹5.75L (typical). Use median for skewed distributions.
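The salary figures above can be checked with Python's standard library (values in lakhs of rupees, taken from the example):

```python
from statistics import mean, median

salaries_lakh = [5, 6, 5.5, 50]

mean_salary = mean(salaries_lakh)      # pulled upward by the single 50
median_salary = median(salaries_lakh)  # average of the two middle values

print(mean_salary)    # 16.625
print(median_salary)  # 5.75
```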
Q6: What does p-value < 0.05 mean?
- A) Less than 5% probability of seeing results this extreme if there is no real effect (statistically significant) ✓
- B) 95% probability hypothesis is true
- C) Effect size is 5%
- D) Sample size is too small
Explanation: p-value = probability of seeing results at least this extreme IF null hypothesis (no effect) is true. p < 0.05 = such results would be unlikely under H₀, so reject H₀. It is not the probability that H₀ is true, and it doesn't mean the effect is large, just statistically significant.
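To make the definition concrete, here is a sketch of an exact two-sided p-value for a coin-flip experiment. The scenario (60 heads in 100 flips, H₀: the coin is fair) is an assumed example, not one of the quiz questions:

```python
from math import comb


def two_sided_binomial_p(heads: int, flips: int) -> float:
    """Exact two-sided p-value under H0: the coin is fair (p = 0.5)."""
    expected = flips / 2
    observed_gap = abs(heads - expected)
    # Count every outcome at least as far from the expected number of
    # heads as the one observed, then divide by all 2**flips sequences.
    extreme = sum(
        comb(flips, k)
        for k in range(flips + 1)
        if abs(k - expected) >= observed_gap
    )
    return extreme / 2 ** flips


p_value = two_sided_binomial_p(heads=60, flips=100)
print(f"p = {p_value:.4f}")
```

The result lands just above 0.05, so 60 heads in 100 flips narrowly misses the conventional threshold. Note that the calculation never asks how likely it is that the coin is fair; it only asks how surprising the data would be if it were.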
Excel/BI Questions (Sample):
Q7: What does VLOOKUP do?
- A) Validates data quality
- B) Looks up value in table by matching first column ✓
- C) Creates pivot tables
- D) Calculates variance
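Explanation: VLOOKUP(lookup_value, table, col_num, FALSE) finds lookup_value in the first column of table and returns the value from the specified column of that row. FALSE forces an exact match. Similar to a SQL LEFT JOIN on a single key, except that only the first match is returned.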
Q8: In Power BI, what's DAX used for?
- A) Data import
- B) Creating calculated columns and measures ✓
- C) Visualization design
- D) Data transformation (that's Power Query)
Explanation: DAX = Data Analysis Expressions. Creates measures (Total Revenue = SUM(Sales[Revenue])) and calculated columns (Profit = Sales[Revenue] - Sales[Cost]). Power Query (M language) handles data transformation.
Business Analytics Questions (Sample):
Q9: What does RFM stand for?
- A) Revenue, Frequency, Margin
- B) Recency, Frequency, Monetary ✓
- C) Retention, Funnel, Metrics
- D) Return, Forecast, Model
Explanation: RFM segments customers by: Recency (days since last purchase), Frequency (# orders), Monetary (total spend). High RFM = best customers (Champions).
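The three raw RFM values can be computed with a single pandas groupby. This is a minimal sketch: the `orders` data, the column names, and the `snapshot_date` standing in for "today" are all assumptions, and it stops short of the scoring and segment-naming step:

```python
import pandas as pd

orders = pd.DataFrame({
    "customer_id": ["A", "A", "B", "C", "C", "C"],
    "order_date": pd.to_datetime([
        "2024-01-05", "2024-03-01", "2023-11-20",
        "2024-02-10", "2024-02-25", "2024-03-10",
    ]),
    "amount": [120.0, 80.0, 300.0, 40.0, 60.0, 50.0],
})
snapshot_date = pd.Timestamp("2024-03-15")  # assumed analysis date

rfm = orders.groupby("customer_id").agg(
    # Recency: days between the snapshot and the customer's latest order.
    recency_days=("order_date", lambda d: (snapshot_date - d.max()).days),
    # Frequency: number of orders.
    frequency=("order_date", "count"),
    # Monetary: total spend.
    monetary=("amount", "sum"),
)

print(rfm)
```

Here customer C is the most recent and most frequent buyer, while B spent the most but has not ordered in months, which is why RFM looks at all three dimensions rather than revenue alone.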
Q10: What's a good engagement rate for social media posts?
- A) 50% is minimum
- B) 1-5% is average; >5% is excellent ✓
- C) 100 likes per post
- D) Higher is always suspicious
Explanation: Engagement rate = (Likes + Comments + Shares) / Followers × 100. 1-5% is a common rule-of-thumb average (it varies by account size: smaller accounts often see 5-10%, large accounts 1-3%). Absolute numbers mean little without follower context.
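The formula is simple enough to sketch directly. The two accounts below are invented to show why raw counts mislead:

```python
def engagement_rate(likes: int, comments: int, shares: int, followers: int) -> float:
    """Engagement rate as a percentage of followers."""
    if followers <= 0:
        raise ValueError("followers must be positive")
    return 100 * (likes + comments + shares) / followers


# The larger account gets 10x the interactions but a far lower rate.
small_account = engagement_rate(likes=80, comments=15, shares=5, followers=1_000)
large_account = engagement_rate(likes=800, comments=150, shares=50, followers=100_000)

print(small_account)  # 10.0
print(large_account)  # 1.0
```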
Quiz-Taking Strategy
During the Quiz:
✅ Read questions carefully: "Which is NOT true" vs "Which is true"
✅ Eliminate wrong answers: Narrow to 2 choices, then pick best
✅ Skip and return: Stuck? Mark for review, move on (don't waste time)
✅ Trust first instinct: Changing answers usually makes score worse
✅ No penalty for guessing: Eliminate obvious wrong choices, guess from remaining
After the Quiz:
✅ Review ALL explanations: Even correct answers (confirm understanding)
✅ Note patterns: SQL JOINs weak? Business metrics strong?
✅ Create study plan: Focus on categories <70%
✅ Retake in 2 weeks: Measure improvement after focused study
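Finding the categories below 70% is a per-category percentage check. A rough sketch, using the question counts from the quiz outline; the function name and the sample scores are hypothetical:

```python
# Questions per category, as listed in the quiz outline.
CATEGORY_SIZES = {
    "SQL": 15,
    "Python/Pandas": 15,
    "Statistics": 10,
    "Excel/BI Tools": 5,
    "Business Analytics": 5,
}


def weak_categories(correct_by_category: dict, threshold: float = 70.0) -> list:
    """Return the categories whose score is strictly below the threshold."""
    weak = []
    for category, size in CATEGORY_SIZES.items():
        percent = 100 * correct_by_category.get(category, 0) / size
        if percent < threshold:
            weak.append(category)
    return weak


my_scores = {
    "SQL": 9,
    "Python/Pandas": 13,
    "Statistics": 7,
    "Excel/BI Tools": 3,
    "Business Analytics": 4,
}
study_first = weak_categories(my_scores)
print(study_first)  # ['SQL', 'Excel/BI Tools']
```

Statistics sits at exactly 70% in this example, so it is not flagged; lower the bar for flagging by passing a higher `threshold` if you want borderline categories included.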
Converting Quiz Score to Real Skills:
| Quiz Score | Interview Readiness | Action Plan |
|------------|---------------------|-------------|
| ≥80% | Ready | Practice behavioral questions, portfolio review |
| 70-79% | Almost ready | Review weak areas (1-2 weeks), then apply |
| 60-69% | Need more prep | Focused study on weak topics (2-4 weeks) |
| <60% | Keep learning | Review fundamentals (4-8 weeks), build projects |
If You Score Low, Study These
SQL (<70% on SQL questions):
- Topics to review: JOINs, GROUP BY + HAVING, window functions, CTEs
- Practice: LeetCode SQL (50 easy + 25 medium)
- Resource: SQL SELECT, WHERE, ORDER BY
Python (<70% on Python questions):
- Topics to review: pandas groupby, merge, apply, vectorization
- Practice: Python Practice Problems
- Resource: Pandas DataFrames Tutorial
Statistics (<70% on stats questions):
- Topics to review: Mean/median, correlation, p-values, distributions
- Practice: Calculate metrics by hand (don't just read)
- Resource: Statistics for Data Analysts
Business Analytics (<70% on business questions):
- Topics to review: RFM, cohorts, KPIs, metrics definitions
- Practice: Build RFM analysis project
- Resource: Customer Lifetime Value
Overall Strategy:
If you score <70% overall:
- Week 1-2: Review fundamentals (SQL basics, pandas basics)
- Week 3-4: Practice problems (LeetCode SQL, pandas exercises)
- Week 5-6: Build 2-3 portfolio projects
- Week 7-8: Mock interviews, retake quiz (target 80%+)
Don't rush interviews with <70% quiz score — fix gaps first, apply confidently later.