
The job description for “data scientist” has never been particularly stable. The role has spent years absorbing responsibilities from adjacent fields, including analytics, ML engineering, and data engineering, while the industry debates what a data scientist actually is. In 2026, the job postings are starting to settle the argument.
A new analysis published this week by Ask Data Dawn examined 101 data science job postings from early 2026 and compared their skill requirements with those from 2025. The results show a market that has moved decisively in one direction: toward ownership of the full data stack, not just the modeling layer. SQL appeared in 79% of 2026 postings, up from 61% in 2025. ETL and pipeline experience rose 18 percentage points. dbt and Snowflake each grew by 9-10 points.
The shift signals something structural. Companies aren’t looking for people who can build models only after someone else cleans the data. They want data scientists who can own the infrastructure: pull data from upstream sources, pipeline it, and deliver it without waiting on a data engineer. That expectation is now showing up in postings at scale.
According to the analysis, the biggest movers in data science job postings are all infrastructure-adjacent: SQL (79%, up from 61%), ETL and pipeline experience (+18 points), and dbt and Snowflake (+9-10 points each).
Legacy tools moved in the opposite direction. SAS dropped from 15% to 2%. MATLAB is near zero. Scala fell from 9% to 1%, reflecting the shift away from Hadoop-era architectures toward SQL-based cloud warehouses. R held at 41%, still significant but declining, with its strongest remaining foothold in biostatistics and research-adjacent roles.
The pattern is consistent with separate research from Harvard University and labor analytics firm Lightcast, which found that demand for data scientists is not declining, but the roles increasingly favor candidates with AI and infrastructure fluency, particularly at the senior level.
The era of the data scientist who waits for clean, pre-processed data is fading. Companies expect candidates to demonstrate fluency with the tools data engineers use, not just the tools analysts use.
That has direct implications for technical interviews. SQL questions in 2026 don’t stop at JOINs and window functions. Based on patterns seen across thousands of Interview Query users, companies at mid-to-senior levels now test for query optimization, warehouse-specific SQL dialects, and an understanding of how data lands in the systems candidates will query. A candidate who writes clean analytical SQL but can’t reason about table partitioning or incremental loads is leaving points on the table.
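To make that concrete, here is a minimal sketch of the two query patterns described above, using Python’s built-in `sqlite3` module and a hypothetical `events` table (the table and data are illustrative, not from the posting analysis): a window-function query, and the high-water-mark filter that underlies incremental loads.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE events (user_id INTEGER, event_ts TEXT, amount REAL);
    INSERT INTO events VALUES
        (1, '2026-01-01', 10.0),
        (1, '2026-01-02', 20.0),
        (2, '2026-01-01', 5.0),
        (2, '2026-01-03', 15.0);
""")

# Window function: running total per user, ordered by event time.
running = conn.execute("""
    SELECT user_id, event_ts,
           SUM(amount) OVER (
               PARTITION BY user_id ORDER BY event_ts
           ) AS running_total
    FROM events
    ORDER BY user_id, event_ts
""").fetchall()
for row in running:
    print(row)

# Incremental-load pattern: pull only rows past the last loaded
# watermark instead of rescanning the full table on every run.
watermark = "2026-01-01"
new_rows = conn.execute(
    "SELECT user_id, event_ts, amount FROM events WHERE event_ts > ?",
    (watermark,),
).fetchall()
print(len(new_rows))  # prints 2: rows newer than the watermark
```

The first query is the analyst skill; reasoning about the second, and why it beats a full-table reload as data volume grows, is the infrastructure-adjacent skill interviewers are probing.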
Practicing with questions that include schema design and pipeline reasoning, not just pure analytical queries, is increasingly the differentiator. Browse IQ’s SQL question bank to find questions organized by topic and difficulty, or work through data engineer interview questions to understand how infrastructure-adjacent thinking is tested.
The surge in A/B testing (+14 points) and causal inference (+17 points) is a different kind of signal. It suggests that data science teams at more companies have matured past descriptive work and are now expected to measure the causal impact of decisions.
Causal inference has moved from an academic specialty to a practical interview topic at companies with serious experimentation programs. Candidates who can explain experiment design, run a power analysis, and defend a quasi-experimental method when a true A/B test isn’t feasible are now competitive in a way they weren’t two or three years ago.
This matches what coaches in IQ’s network have observed firsthand. Candidates preparing for roles at companies like Amazon and DoorDash report that experimentation design questions are where technical screens increasingly live. IQ’s statistics and A/B testing question bank covers this territory in depth, from power analysis to quasi-experimental design.
The decline of SAS, MATLAB, and Scala is less a surprise than a confirmation. Python’s ecosystem has absorbed most of what these tools did well. Pandas, scikit-learn, and PyTorch handle the modeling and analysis workloads SAS once monopolized. Cloud SQL warehouses eliminated the use cases that kept Scala-based Spark architectures relevant.
R’s position at 41% of postings is worth watching. It hasn’t collapsed, and in biotech, pharma, and research-adjacent roles, it remains standard. But for candidates targeting industry data science roles, R fluency is a nice-to-have. Python is the default, and the gap is widening.
The practical takeaway from this data is about calibration, not alarm. Companies aren’t expecting data scientists to become data engineers. They’re expecting candidates to be uncomfortable with dependency on others for data access.
If your current preparation is heavy on modeling (regression, classification, evaluation metrics) and light on infrastructure and experimentation, 2026 posting trends suggest you’re leaving a gap. The companies hiring right now expect you to demonstrate that you can close it.
A strong candidate in 2026 can write production-quality SQL, explain how their data landed in the warehouse, and speak to how they’ve designed or evaluated an experiment. If you’re not sure whether your preparation covers that ground, a coaching session can surface specific gaps before they cost you the offer. Book a coaching session to pressure-test your readiness against where the market is.
While the data science job market isn’t disappearing, the job description is changing faster than many candidates realize. New posting analysis from early 2026 shows SQL now in 79% of data science roles, data engineering tools surging, and causal inference jumping 17 points in a single year.
The data scientist who can navigate infrastructure, not just model clean data, is the one the market is pricing at a premium. The question isn’t whether this shift is real; the job postings confirm it is. The question is whether your interview preparation reflects it. Practice with IQ’s AI Interviewer to find out where you actually stand.