Microsoft has become a major player in the data science industry as Azure and its machine learning tools have steadily grown into one of the largest services in the cloud-computing market. As a result, Microsoft has been building out its data science team over the past five years and is now one of the biggest companies hiring for the role.

The Data Scientist Role

The role of a data scientist at Microsoft varies widely depending on the team you interview with. Each Microsoft data science job is different, ranging from analytics-focused roles to machine-learning-heavy ones. As a large corporation with many product lines, Microsoft has teams that work on speech and language, artificial intelligence, machine learning infrastructure on Azure, data science consulting for cloud computing, and much more.

Required Skills

Microsoft generally prefers to hire experienced candidates, with at least two years of experience working in data science for a mid-level role. General qualifications are a Ph.D. in a quantitative field and some years of experience in at least one of these areas: deep neural networks (DNN), natural language processing (NLP), time series, reinforcement learning, network analysis, or causal inference.

  • Previous experience in DNN, NLP, time series, reinforcement learning, network analysis, causal inference, or any related area
  • Proficiency in any of the following languages and tools for numerical programming: Python (NumPy/SciPy), R, SQL, C#, or Spark
  • Experience with cloud-based architectures such as AWS or Azure

What are the types of data scientists?

Microsoft has a department under engineering called Data and Applied Science. Employees in this department are placed in teams and go by three main titles: data scientist, applied scientist, and machine learning engineer. Depending on the team, their functions may include:

  • Writing code to ship models to production.
  • Writing code for machine learning algorithms to be used by other data scientists.
  • Working with customers directly or indirectly to resolve technical issues.
  • Working on metrics and experimentation.
  • Working on product features.

The ideal candidate for the Microsoft Data and Applied Scientist role is expected to be able to apply a breadth of machine learning tools and analytical techniques to answer a wide range of high-impact business questions and present the insights in a concise and effective manner.

The Microsoft Data Scientist Interview

Initial Screen

After you submit your application, the first phone interview may or may not be with a recruiter, depending on the seniority level of the role. Often the hiring manager will conduct a 30-minute interview first to understand your past experience.

Expect this phone interview to come in two parts: questions about your background and projects, followed by a few technical interview questions. The technical questions tend to be theoretical, such as explaining how a machine learning concept works or solving a quick probability or statistics problem.


  • What’s the difference between lasso and ridge regression?
  • How would you explain how a deep learning model works to a business person?
  • How would you define a p-value to someone who’s non-technical?
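
One way to reason about the lasso versus ridge question is the textbook special case of an orthonormal design, where both penalties have closed-form effects on each ordinary least squares (OLS) coefficient. The sketch below assumes that setting, with objectives (1/2)·RSS + (λ/2)·‖β‖² for ridge and (1/2)·RSS + λ·‖β‖₁ for lasso. The function names and the example coefficients are hypothetical, chosen only for illustration.

```python
import math


def ridge_shrink(ols_coef: float, lam: float) -> float:
    """Ridge under an orthonormal design: scale the OLS coefficient toward zero."""
    return ols_coef / (1.0 + lam)


def lasso_shrink(ols_coef: float, lam: float) -> float:
    """Lasso under an orthonormal design: soft-threshold the OLS coefficient."""
    magnitude = max(abs(ols_coef) - lam, 0.0)
    if magnitude == 0.0:
        return 0.0  # small coefficients are set exactly to zero
    return math.copysign(magnitude, ols_coef)


# Hypothetical OLS coefficients and penalty strength (illustration only).
ols_coefs = [3.0, -0.4, 0.1]
lam = 0.5

ridge = [ridge_shrink(b, lam) for b in ols_coefs]
lasso = [lasso_shrink(b, lam) for b in ols_coefs]

print("ridge:", ridge)  # every coefficient shrinks, none becomes exactly zero
print("lasso:", lasso)  # coefficients smaller than lam in magnitude become zero
```

In this simplified setting, ridge shrinks every coefficient proportionally while lasso subtracts a fixed amount and zeroes out the small ones, which is why lasso is often described as performing variable selection.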

The Technical Screen

After the hiring manager screen, the recruiter will schedule a second, more technical screen with a Microsoft data scientist. This screen generally lasts 45 minutes to an hour and is designed to test pure technical skills: how well you can code and explain your thought process.

The technical screen consists of around three questions covering algorithms, SQL coding, and probability and statistics. Expect data structures and algorithms questions in Python along with data processing questions.


  • Given an array of words and a max width parameter, format the text such that each line has exactly X characters.
  • Write a query to randomly sample a row from a table with 100 million rows.
  • What’s the probability that you roll at least two 3s when rolling three dice?
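
The dice question above can be checked in two ways: with the binomial formula and by enumerating all 216 outcomes. The following is a minimal sketch assuming fair six-sided dice; the function names are hypothetical.

```python
from fractions import Fraction
from itertools import product
from math import comb


def prob_at_least_k_threes(num_dice: int = 3, k: int = 2) -> Fraction:
    """Binomial formula: sum P(exactly j threes) for j = k..num_dice."""
    p = Fraction(1, 6)  # chance a single fair die shows a 3
    return sum(
        comb(num_dice, j) * p**j * (1 - p) ** (num_dice - j)
        for j in range(k, num_dice + 1)
    )


def brute_force(num_dice: int = 3, k: int = 2) -> Fraction:
    """Enumerate every equally likely roll and count those with >= k threes."""
    outcomes = list(product(range(1, 7), repeat=num_dice))
    hits = sum(1 for roll in outcomes if roll.count(3) >= k)
    return Fraction(hits, len(outcomes))


print(prob_at_least_k_threes())  # 2/27
print(brute_force())             # 2/27
```

By hand: exactly two 3s occurs with probability 3 · (1/6)² · (5/6) = 15/216, and three 3s with probability 1/216, for a total of 16/216 = 2/27, or roughly 7.4%.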

Prepare yourself for the technical interview by practicing real interview questions from Microsoft on Interview Query.

Raining in Seattle — Interview Query probability problem
You are about to get on a plane to Seattle. You want to know if you should bring an umbrella. You call 3 random friends of yours who live there and ask each independently if it's raining.

The Onsite Interview

The onsite interview is a full-day event from 9 am to 4 pm. You will meet with five different data scientists and have a lunch interview as well.

Here’s what the interview panel generally looks like:

  • Probability and statistics
  • Data structures and algorithms
  • Modeling and machine learning systems
  • Hiring manager and behavioral interview
  • Data manipulation
  • Lunch: you’ll also spend 1:1 time with one or two data scientists over a one-hour lunch to learn more about Microsoft and the team. They will usually either let you take a break or talk through what they work on.

The onsite interview will mostly be a combination of all the different technical concepts. Remember to study which model assessment metrics fit which circumstances, the bias/variance tradeoff of coefficients under collinearity, open-ended questions about sampling schemes, experimental and A/B testing design, explaining p-values to a five-year-old, different applications of Bayes’ theorem, and teaching the interviewer a statistical learning technique of your choice.
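
As one illustration of why the choice of model assessment metric depends on the circumstances, the sketch below compares accuracy, sensitivity, and specificity on an imbalanced dataset. The labels are made up for illustration (5 positives, 95 negatives), and the helper names are hypothetical.

```python
from collections import Counter
from typing import Dict, List


def confusion_counts(y_true: List[int], y_pred: List[int]) -> Dict[str, int]:
    """Count true/false positives and negatives for binary labels (1 = positive)."""
    counts = Counter(zip(y_true, y_pred))
    return {
        "tp": counts[(1, 1)],
        "fn": counts[(1, 0)],
        "fp": counts[(0, 1)],
        "tn": counts[(0, 0)],
    }


def summarize(y_true: List[int], y_pred: List[int]) -> Dict[str, float]:
    """Compute accuracy, sensitivity (recall), and specificity."""
    c = confusion_counts(y_true, y_pred)
    total = sum(c.values())
    positives = c["tp"] + c["fn"]
    negatives = c["tn"] + c["fp"]
    return {
        "accuracy": (c["tp"] + c["tn"]) / total,
        "sensitivity": c["tp"] / positives if positives else 0.0,
        "specificity": c["tn"] / negatives if negatives else 0.0,
    }


# Hypothetical imbalanced data: 5 positives, 95 negatives.
y_true = [1] * 5 + [0] * 95
# A trivial "model" that always predicts the negative class.
always_negative = [0] * 100

metrics = summarize(y_true, always_negative)
print(metrics)  # high accuracy, but zero sensitivity
```

The always-negative predictor scores 95% accuracy while catching none of the positives, which is the kind of tradeoff interviewers tend to probe when they ask about metrics for rare-event problems.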

Another big focus for Microsoft is on communication, since the data science team at Microsoft has partnerships throughout the organization to ensure the team is doing useful work.

You can find many of the data structures and algorithms questions on Interview Query or LeetCode. It’s also advisable to get a whiteboard to practice on, given how different writing code on a whiteboard is from writing it on a computer.

Sample Microsoft Data Science Interview Questions

  • How would you select a representative sample of search queries from six million?
  • Find the maximum-sum subsequence in an integer list.
  • Give an example of a scenario where you would use Naive Bayes over another classifier.
  • How would you explain what MapReduce does as concisely as possible?
  • What is the ROC curve, and what do sensitivity, specificity, and the confusion matrix mean?
  • The autocomplete feature: How would you implement it and can you highlight the flaws in this tool today?
  • Describe efficient ways to merge a given k sorted arrays of size n each.
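
For the last question in the list, one common approach is a k-way merge with a min-heap, which runs in O(nk log k) time for k arrays of size n. The sketch below is one possible answer rather than the expected one; the function name is hypothetical, and the standard library's heapq.merge offers similar behavior out of the box.

```python
import heapq
from typing import List


def merge_k_sorted(arrays: List[List[int]]) -> List[int]:
    """Merge k sorted lists by repeatedly popping the smallest head element."""
    # Each heap entry is (value, index of source array, index within that array).
    heap = [(arr[0], i, 0) for i, arr in enumerate(arrays) if arr]
    heapq.heapify(heap)

    merged: List[int] = []
    while heap:
        value, arr_idx, elem_idx = heapq.heappop(heap)
        merged.append(value)
        next_idx = elem_idx + 1
        if next_idx < len(arrays[arr_idx]):
            # Push the next element from the array we just consumed from.
            heapq.heappush(heap, (arrays[arr_idx][next_idx], arr_idx, next_idx))
    return merged


print(merge_k_sorted([[1, 4, 7], [2, 5, 8], [3, 6, 9]]))
# [1, 2, 3, 4, 5, 6, 7, 8, 9]
```

The heap never holds more than k entries, so each of the nk elements costs O(log k) to process. Concatenating everything and sorting would cost O(nk log(nk)) instead.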

Check out Interview Query for more data scientist interview questions.