Data Analyst Interview Questions & Answers

Data Analyst Interview Questions & Answers (2026) – Top 30 Questions with Expert Answers

Table of Contents

⚡ AI Overview (Quick Answer)

To crack a data analyst interview in 2026, you must demonstrate strong SQL skills, Excel proficiency, statistics fundamentals, data visualization (Power BI/Tableau), and business problem-solving through case studies. Recruiters test technical ability plus your skill at translating messy data into business decisions. This guide covers 30+ real data analyst interview questions with model answers, a salary table, the exact interview structure, and an FAQ section optimized for quick revision.


Introduction: The Most In-Demand “Non-Coding” Tech Job of 2026

Picture this: a retail company’s sales dipped 12% last quarter, and nobody knows why. The CEO doesn’t want opinions — she wants evidence. The person who walks into that boardroom with a clean dashboard, three validated hypotheses, and one clear recommendation is the data analyst. That’s the job. And that’s exactly what interviewers are testing when you sit across from them.

The beautiful thing about data analyst interviews? They follow a remarkably consistent pattern across companies — from MNCs like Deloitte, Accenture, and EY to product firms like Flipkart, Swiggy, and Zomato. The not-so-beautiful thing? Candidates routinely fail not on hard questions, but on fundamentals: a JOIN they couldn’t write, a p-value they couldn’t explain, a VLOOKUP scenario they fumbled.

This guide fixes that. Below are the 30 most asked data analyst interview questions and answers, organized by interview round, with tables, scenario walk-throughs, and links to deepen each skill area.

Interview RoundWhat Is TestedWeightage
Round 1 – Screening / AptitudeLogical reasoning, basic Excel & SQL MCQs15%
Round 2 – SQL & TechnicalJoins, aggregations, window functions35%
Round 3 – Case Study / BusinessMetrics thinking, root-cause analysis30%
Round 4 – Tools & VisualizationPower BI/Tableau, dashboard logic10%
HR RoundCommunication, projects, salary10%

Since SQL alone carries the biggest weight, we strongly recommend revising our dedicated Basic SQL Interview Questions & Answers guide alongside this article.


Basic Data Analyst Interview Questions (Q1–Q10)

Q1. What does a data analyst do, in your own words?

Answer: A data analyst collects, cleans, analyzes, and visualizes data to help organizations make evidence-based decisions. The job has four repeating phases: (1) understanding the business question, (2) gathering and cleaning relevant data, (3) analyzing it using SQL, Excel, or Python, and (4) communicating insights through dashboards and reports. The strongest answers end with impact: “Ultimately, my job is to convert raw data into decisions that save money or grow revenue.”

Q2. What is the difference between data analysis, data analytics, and data science?

Answer: Data analysis is examining historical data to answer “what happened and why” — descriptive and diagnostic. Data analytics is the broader umbrella including processes, tools, and governance around analysis. Data science extends into predictive and prescriptive territory using machine learning — “what will happen and what should we do.” A data analyst reports the past accurately; a data scientist models the future probabilistically.

Q3. Walk me through your data cleaning process.

Answer: A structured model answer: (1) Profile the data — row counts, column types, summary statistics; (2) Handle missing values — decide between deletion, mean/median imputation, or business-rule filling, depending on why data is missing; (3) Remove duplicates after defining what makes a row unique; (4) Fix inconsistencies — date formats, “Mumbai” vs “MUMBAI”, trailing spaces; (5) Treat outliers — investigate before deleting, since outliers are sometimes the insight; (6) Validate — totals reconcile with source systems. Data cleaning consumes 60–80% of an analyst’s real time, and interviewers know it.

Q4. What is the difference between structured and unstructured data?

Answer: Structured data fits neatly into rows and columns with a fixed schema — sales tables, customer databases — and is queried with SQL. Unstructured data has no predefined format — emails, social media posts, images, call recordings — and requires specialized processing (text mining, NLP). Semi-structured data (JSON, XML) sits in between with flexible tags. Roughly 80% of enterprise data is unstructured, which is why this distinction matters.

Q5. Which Excel functions should a data analyst master?

Answer: The interview-favorite list: VLOOKUP/XLOOKUP (joining data across sheets), INDEX-MATCH (flexible lookups), SUMIFS/COUNTIFS/AVERAGEIFS (conditional aggregation), IF with AND/OR (logic), TEXT functions (LEFT, RIGHT, TRIM, CONCAT for cleaning), Pivot Tables (instant summarization), and conditional formatting. In 2026, also mention Power Query for repeatable cleaning workflows — it signals you work like a professional, not a student.

Q6. VLOOKUP vs INDEX-MATCH vs XLOOKUP — which do you prefer?

Answer:

FunctionLookup DirectionSpeed on Big DataColumn Insert Safe?
VLOOKUPLeft to right onlySlowerNo (breaks on insert)
INDEX-MATCHAny directionFasterYes
XLOOKUPAny directionFastYes

Model answer: “XLOOKUP where available — it handles both directions, defaults to exact match, and has built-in if-not-found. INDEX-MATCH as the robust fallback on older Excel versions.”

Q7. What are the measures of central tendency, and when does each mislead?

Answer: Mean (average) — distorted by outliers: one billionaire in a room skews average income; Median (middle value) — robust to outliers, ideal for salary and house-price data; Mode (most frequent) — best for categorical data like most-sold product size. Strong candidates volunteer the misleading cases without being asked.

Q8. Explain the difference between correlation and causation.

Answer: Correlation means two variables move together; causation means one drives the other. Ice cream sales and drowning deaths correlate — both are caused by summer, not each other (a confounding variable). Analysts must resist causal language unless supported by controlled experiments like A/B tests. This question tests analytical maturity more than statistics.

Q9. What is data validation and why does it matter?

Answer: Data validation ensures data meets quality rules before analysis — type checks (dates are dates), range checks (age between 0–120), uniqueness checks (no duplicate order IDs), referential checks (every order links to a real customer), and cross-field logic (delivery date ≥ order date). One unvalidated dataset can silently corrupt every dashboard built on it — a story worth telling if you have one.

Q10. Which tools are on a modern data analyst’s toolbelt in 2026?

Answer: SQL (non-negotiable), Excel/Google Sheets (still everywhere), Power BI or Tableau (visualization), Python with Pandas (increasingly expected), and cloud data warehouses (BigQuery, Snowflake, Redshift) for large datasets. Familiarity with AI-assisted analysis tools is becoming a plus. If cloud platforms appear in the job description, skim our Cloud Computing Interview Questions guide too.


SQL Interview Questions for Data Analysts (Q11–Q18)

Q11. Explain the different types of SQL JOINs.

Answer: INNER JOIN returns only matching rows in both tables; LEFT JOIN returns all left-table rows with NULLs where the right side has no match; RIGHT JOIN mirrors that; FULL OUTER JOIN returns everything from both sides; CROSS JOIN returns the Cartesian product; SELF JOIN joins a table to itself (employees to their managers). Be ready to predict row counts: “1,000 customers LEFT JOIN 800 orders gives at least 1,000 rows” — counting questions trip up most candidates.

Q12. What is the difference between WHERE and HAVING?

Answer: WHERE filters individual rows before grouping; HAVING filters groups after GROUP BY aggregation. You cannot use aggregate functions in WHERE. Example: “show cities with more than 100 orders” requires GROUP BY city HAVING COUNT(*) > 100. Execution order to memorize: FROM → WHERE → GROUP BY → HAVING → SELECT → ORDER BY → LIMIT.

Q13. How do you find duplicate records in a table?

Answer: Group by the columns defining a duplicate and filter: SELECT email, COUNT(*) FROM customers GROUP BY email HAVING COUNT(*) > 1; To delete duplicates while keeping one, use a window function: assign ROW_NUMBER() OVER (PARTITION BY email ORDER BY created_at) and delete where the row number exceeds 1. This two-part answer (find + remove) is what separates good from average candidates.

Q14. Explain window functions with an example.

Answer: Window functions compute values across related rows without collapsing them like GROUP BY does. Essentials: ROW_NUMBER(), RANK(), DENSE_RANK() for ranking; LAG()/LEAD() for comparing with previous/next rows (month-over-month growth); SUM() OVER (PARTITION BY … ORDER BY …) for running totals. The classic test: “Find the second-highest salary per department” — DENSE_RANK() OVER (PARTITION BY dept ORDER BY salary DESC) filtered to rank 2.

Q15. RANK() vs DENSE_RANK() vs ROW_NUMBER()?

Answer: Given salaries 100, 90, 90, 80: ROW_NUMBER gives 1,2,3,4 (arbitrary tie order); RANK gives 1,2,2,4 (skips after ties); DENSE_RANK gives 1,2,2,3 (no gaps). Choose based on how the business defines “Nth highest” — and say so, because asking the clarifying question is itself a strong signal.

Q16. What are subqueries and CTEs, and which do you prefer?

Answer: A subquery is a query nested inside another; a CTE (Common Table Expression, WITH cte AS (…)) is a named temporary result set defined before the main query. CTEs win for readability, reuse within the same statement, and breaking complex logic into steps — and they enable recursion. Model preference: “CTEs for anything beyond one level of nesting; my queries read like a story: clean → aggregate → rank → select.”

Q17. How would you calculate month-over-month revenue growth in SQL?

Answer: Aggregate revenue by month in a CTE, then use LAG(revenue) OVER (ORDER BY month) to fetch the previous month and compute (revenue – prev_revenue) * 100.0 / prev_revenue. Mention edge cases — the first month has no prior (NULL handling) and division-by-zero protection — because production-minded answers win offers. Practice variants of this freely on Mode’s SQL tutorial or LeetCode’s database section.

Q18. What is the difference between DELETE, TRUNCATE, and DROP?

Answer: DELETE removes selected rows (WHERE-able, logged, rollback-able, triggers fire); TRUNCATE removes all rows instantly (minimal logging, resets identity, no WHERE); DROP removes the entire table structure itself. Quick mnemonic: DELETE = erase some writing, TRUNCATE = tear out all pages, DROP = burn the notebook.


Statistics, Visualization & Case Study Questions (Q19–Q30)

Q19. Explain p-value in simple terms.

Answer: The p-value is the probability of observing results at least this extreme if the null hypothesis were true. A p-value of 0.03 means: “If there were truly no effect, we’d see data like this only 3% of the time.” Below the significance threshold (commonly 0.05), we reject the null. Crucially, p-value is not the probability the hypothesis is true — stating this distinction earns instant credibility.

Q20. What is A/B testing and how would you design one?

Answer: A/B testing compares two versions (control vs variant) by randomly splitting users and measuring a success metric. Design steps: (1) define one primary metric (e.g., checkout conversion); (2) state hypotheses; (3) calculate required sample size for desired statistical power; (4) randomize properly and run for full business cycles (avoid weekday bias); (5) analyze significance and — importantly — practical significance: a statistically significant 0.01% lift may not justify engineering cost.

Q21. What are Type I and Type II errors?

Answer: Type I error (false positive) — rejecting a true null hypothesis: declaring the new design better when it isn’t. Type II error (false negative) — failing to detect a real effect. Memory trick: Type I is “crying wolf,” Type II is “missing the wolf.” The significance level α controls Type I risk; statistical power (1−β) controls Type II.

Q22. How do you handle outliers?

Answer: First detect (boxplots/IQR rule: outside Q1−1.5×IQR or Q3+1.5×IQR; or z-scores beyond ±3), then investigate before acting — is it a data-entry error, a system glitch, or a genuine extreme like a Diwali sales spike? Errors get corrected or removed; genuine extremes stay (perhaps analyzed separately or handled with median-based metrics). “I never delete outliers blindly” is the sentence interviewers wait for.

Q23. Which chart would you choose for different scenarios?

Answer:

GoalBest Chart
Trend over timeLine chart
Compare categoriesBar chart
Part-to-whole (few segments)Pie/donut (≤5 slices) or stacked bar
Relationship between 2 numeric variablesScatter plot
Distribution of one variableHistogram / box plot
Geographic comparisonFilled map

Bonus point: “I avoid pie charts beyond five categories and never use 3D charts — they distort perception.”

Q24. What makes a dashboard effective in Power BI or Tableau?

Answer: (1) Built around the audience’s decisions, not available data; (2) KPI summary on top, drill-downs below (inverted pyramid); (3) five-second rule — the headline insight should be graspable in five seconds; (4) consistent colors with meaning (red = bad everywhere); (5) interactive filters instead of fifty static charts; (6) performance-optimized data models (star schema, measures over calculated columns in Power BI). Walk the interviewer through one dashboard you actually built.

Q25. Explain the difference between a fact table and a dimension table.

Answer: A fact table stores measurable events — sales transactions with amounts, quantities, and foreign keys. Dimension tables store descriptive context — customer details, product attributes, dates. Together they form a star schema, the standard model for analytics because it keeps queries fast and logic simple. Knowing this signals you can work with real data warehouses, not just flat Excel files.

Q26. Case study: Daily active users dropped 15% last week. How do you investigate?

Answer: Use a structured funnel: (1) Validate the data — tracking bug? definition change? (most “drops” are instrumentation issues); (2) Segment — by platform, geography, user cohort, acquisition channel — is the drop universal or isolated?; (3) Check internal changes — releases, pricing changes, feature removals on the timeline; (4) Check external factors — seasonality, holidays, competitor launches, outages; (5) Quantify and recommend. Interviewers grade the structure, not a lucky guess.

Q27. How would you measure the success of a new feature?

Answer: Define metrics across layers: adoption (% of users trying it), engagement (frequency/depth of use), retention impact (do adopters return more?), business impact (revenue, conversion), and guardrails (did support tickets or churn rise?). Tie each metric to a target before launch — measuring after the fact invites bias.

Q28. Data analyst vs business analyst vs data engineer — what’s the difference?

Answer: The data analyst analyzes data to answer business questions (SQL, BI tools, statistics). The business analyst focuses on processes and requirements, bridging stakeholders and tech teams (less hands-on with data). The data engineer builds the pipelines and warehouses that deliver reliable data to analysts (heavy coding, ETL, infrastructure). Career-wise, analysts commonly grow into senior analyst, analytics manager, data scientist, or product analyst paths — a question that often follows in managerial rounds, much like those covered in our Product Manager Interview Questions guide.

Q29. Do data analysts need Python?

Answer: Increasingly yes, at least Pandas basics: reading files, filtering, groupby aggregations, merges, and simple plots. Python takes over where Excel struggles (millions of rows, repeatable pipelines, API data pulls). A balanced 2026 answer: “SQL is my primary tool, Excel for stakeholder collaboration, Python for automation and heavy lifting.”

Q30. Tell me about a project where your analysis changed a decision.

Answer: Use the STAR format: Situation (declining repeat purchases at an e-commerce client), Task (find the cause), Action (cohort analysis in SQL revealed users acquired via one discount campaign churned 3× faster; built a Power BI dashboard to track cohort quality), Result (“marketing reallocated 20% of budget; repeat purchase rate rose 8% in two quarters”). Always quantify the result — numbers are an analyst’s native language, and your interview story should prove it.


Data Analyst Salary in India (2026)

ExperienceRoleAverage Annual Salary (INR)
0–2 yearsJunior / Associate Data Analyst₹4 – 7 LPA
2–5 yearsData Analyst₹7 – 14 LPA
5–8 yearsSenior Analyst / Analytics Lead₹14 – 25 LPA
8+ yearsAnalytics Manager₹25 – 45+ LPA

Indicative ranges; product companies and fintech typically pay 20–40% above services firms.


7 Tips to Crack the Data Analyst Interview

  1. SQL until it’s reflex. Solve 50+ problems covering joins, aggregations, and window functions — this single skill decides most outcomes.
  2. Build 2 portfolio projects with public datasets (Kaggle) — one ending in a dashboard, one in a written insight report.
  3. Practice case studies aloud. Structure beats brilliance: validate → segment → hypothesize → quantify.
  4. Know your metrics vocabulary: DAU/MAU, retention, churn, conversion, AOV, CAC, LTV.
  5. Prepare one “data cleaning horror story” — every interviewer loves these and they prove real experience.
  6. Rehearse explaining a technical concept to a non-technical person — communication is half the job.
  7. Don’t skip HR preparation — our HR Interview Questions for Freshers guide covers the final hurdle.

For ongoing practice, the Kaggle datasets library and Google’s free Data Analytics resources are excellent companions.


📚 Recommended Resources (Affiliate Section)

ResourceBest ForLink
Google Data Analytics Certificate (Coursera)Structured beginner path[Enroll Now → Affiliate Link]
“Storytelling with Data” (Book – Cole Knaflic)Visualization & communication[Check Price on Amazon → Affiliate Link]
SQL for Data Analysis (Udemy)Interview-grade SQL practice[Enroll Now → Affiliate Link]
Power BI Complete CourseDashboard skills[Enroll Now → Affiliate Link]

Affiliate Disclosure: Some of the links above are affiliate links, meaning Interview Questions Hub may earn a small commission if you purchase through them — at no extra cost to you. We only recommend resources we genuinely believe accelerate your preparation.


Frequently Asked Questions (FAQ)

Q1. Can a fresher become a data analyst without a computer science degree? Absolutely. Data analytics is one of the most degree-flexible tech careers — commerce, statistics, and engineering graduates all succeed. What matters is demonstrable SQL + Excel + visualization skill, proven through portfolio projects.

Q2. Is coding required for data analyst jobs? SQL is essential (and yes, it counts as coding). Python is increasingly preferred but rarely mandatory for entry roles. Heavy software engineering is not required — that’s the data engineer’s domain.

Q3. How long does it take to prepare for data analyst interviews? From a basic foundation, 8–12 weeks of structured preparation: four weeks SQL, two weeks Excel + statistics, two weeks Power BI/Tableau with a portfolio project, and two weeks of case study and mock interview practice.

Q4. Which certification helps most for data analyst roles? Certificates open screening doors but don’t replace skills. The Google Data Analytics Certificate and Microsoft PL-300 (Power BI) are the most recognized in 2026 — pair either with two strong portfolio projects.

Q5. What is the most common reason candidates fail data analyst interviews? Weak SQL under pressure, followed by unstructured case-study answers. Both are fixable with deliberate practice — which is precisely what this guide is designed for.


Conclusion

Data analyst interviews reward structured thinkers who can write clean SQL, question their own data, and translate numbers into decisions. Master the 30 questions above, build the two portfolio projects, and practice narrating your analysis out loud. Combine this guide with our SQL interview questions deep-dive and HR round preparation, and you’ll walk into your interview with the one thing every analyst needs most: confidence backed by evidence. Your dashboard-driven career starts now!


Disclaimer: The interview questions, answers, salary figures, and preparation timelines in this article are compiled for educational purposes from commonly reported interview experiences and public market data. Actual interview content, processes, and compensation vary by company, role, and location. Interview Questions Hub does not guarantee job placement or specific outcomes. All trademarks and product names belong to their respective owners.

Leave a Reply

Your email address will not be published. Required fields are marked *