Table of Contents

  1. Technical Questions
  2. Behavioral Questions
  3. Onsite Interview: Tips & Tricks
  4. Example Interview Question + Solution

The Amazon Machine Learning Interview Process

Amazon's interview process consists of three stages: the initial screen, the technical interview, and finally the onsite interview.

For the machine learning engineer role, the interview and the questions you receive will differ slightly depending on the career track: machine learning scientist, research scientist, applied scientist, or engineer working in some machine learning capacity.

Technical Interview Questions

The technical interview questions for the machine learning role at Amazon combine theoretical ML concepts and programming. The interviewer will ask you a series of questions on fundamental machine learning concepts, such as explaining different machine learning models, the bias-variance tradeoff, and overfitting.

Example Technical Interview Questions:

  • (Coding) Given an array of numbers and a target value, return the indexes of two numbers whose absolute difference is equal to the target.
  • (Coding) Given two dates D1 and D2, count the number of days (or months) between them.
  • (Coding) Find the first missing positive number (must be done in O(1) memory and O(n) time).
  • (Machine Learning) How do you find thresholds for a classifier?
  • (Machine Learning) What’s the difference between logistic regression and support vector machines? What's an example of a situation where you would use one over the other?
  • (Machine Learning) Explain ICA and CCA. How do you get a CCA objective function from PCA?
  • (Machine Learning) What is the relationship between PCA with a polynomial kernel and a single layer autoencoder? What if it is a deep autoencoder?
  • (Machine Learning) What is "random" in random forest? If you use logistic regression instead of a decision tree in random forest, how will your results change?
  • (Modeling) What is the interpretation of an ROC area under the curve as an integral?
  • (Coding) Given an array a, return the indices i, j that minimize |a_i - a_j|.
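As an illustration of the kind of answer the coding questions above look for, here is a minimal sketch of one common approach to the first-missing-positive question. It treats the input list itself as the lookup table to stay within O(1) extra memory. The function name is hypothetical, and the sketch assumes the interviewer allows the input to be reordered in place.

```python
def first_missing_positive(nums):
    """Return the smallest positive integer not present in nums.

    Runs in O(n) time and O(1) extra memory by reordering nums in place
    (assumption: mutating the input is acceptable).
    """
    n = len(nums)
    for i in range(n):
        # Move the value v = nums[i] to its "home" index v - 1, repeating
        # until slot i holds an out-of-range value or a duplicate.
        while 1 <= nums[i] <= n and nums[nums[i] - 1] != nums[i]:
            home = nums[i] - 1
            nums[i], nums[home] = nums[home], nums[i]
    # The first index whose value is not index + 1 marks the gap.
    for i in range(n):
        if nums[i] != i + 1:
            return i + 1
    return n + 1
```

Each swap puts at least one value into its final position, so the total number of swaps is bounded by n even though the loop is nested.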

Behavioral Interview Questions

In the onsite interview, you should expect behavioral questions built around Amazon's leadership principles. Make sure you know Amazon’s 14 leadership principles!

Example Behavioral Interview Questions:

  • (LP Question): How do you ensure good quality when delivering to customers?
  • Why are you leaving your current job?
  • How do you handle conflict with team members?

Onsite Interview: Tips and Tricks

The onsite consists of five rounds of interviews. These interviews are composed of a mixture of behavioral, software engineering, and machine learning questions.

The interview panel will typically look like this:

  • Behavioral and leadership question interview with a hiring manager.
  • Whiteboard coding interview with a software engineer.
  • Technical machine learning system design question with a data/applied scientist. An example question: design a computer vision algorithm to improve image search.
  • Technical interview with a machine learning scientist on modeling + machine learning algorithms.
  • Technical discussion about past work with a data/applied scientist.

Tips & Tricks

  • Make sure you review both machine learning and programming concepts. A machine learning engineer is more of a software engineer than a data scientist, so you should expect a number of coding questions in the technical rounds.
  • Amazon assesses every applicant based on its 14 leadership principles. Don’t just learn these 14 principles; embody them throughout the interview and make it evident that you’ve demonstrated them in your past experiences.

Example Amazon Machine Learning Interview Question and Solution

Amazon Machine Learning Infrastructure

Let's say you have a categorical variable with thousands of distinct values. How would you encode it?

This depends on whether the problem is a regression or a classification model.

If it's a regression model, one way would be to work backwards from the response variable and cluster the categories by it. You could sort the categories by their response values and then split them into buckets of similar response. This could be done with a shallow decision tree, which reduces the number of categories to the number of leaves.
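The sort-and-bucket idea can be sketched as follows. This is a simplified stand-in rather than the shallow decision tree itself: it cuts the sorted categories into buckets of near-equal size, whereas a tree would choose the split points from the data. The function name and the `n_buckets` parameter are hypothetical.

```python
from collections import defaultdict


def bucket_categories_by_response(categories, responses, n_buckets):
    """Map each category to a bucket index based on its mean response."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for cat, y in zip(categories, responses):
        totals[cat] += y
        counts[cat] += 1
    # Sort categories by mean response, lowest first.
    ordered = sorted(totals, key=lambda c: totals[c] / counts[c])
    # Cut the sorted list into n_buckets contiguous groups of near-equal size.
    return {cat: rank * n_buckets // len(ordered)
            for rank, cat in enumerate(ordered)}


# Toy data: four categories reduced to two buckets.
cats = ["a", "a", "b", "c", "d"]
ys = [1.0, 3.0, 10.0, 20.0, 5.0]
buckets = bucket_categories_by_response(cats, ys, n_buckets=2)
# Mean responses are a=2, d=5, b=10, c=20, so a and d share bucket 0
# and b and c share bucket 1.
```

The bucket index then replaces the original category as a much lower-cardinality feature.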

Another option for a regression model is to target encode the variable: replace each category with the mean response observed for that category. Now you have one continuous feature instead of thousands of categories.

For binary classification, you can target encode the column by finding the conditional probability that the response variable is one, given that the categorical column takes a particular value. Then replace the categorical column with this numerical value. For example, if you have a categorical column of city when predicting loan defaults, and the probability that a person who lives in San Francisco defaults is 0.4, you would replace "San Francisco" with 0.4.
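Target encoding as described above can be sketched in a few lines. With a continuous target, the per-category mean is the regression encoding. With a 0/1 target, the same mean is the conditional probability of a one. The function name is hypothetical and the loan data below is made up purely to reproduce the 0.4 figure from the example.

```python
from collections import defaultdict


def target_encode(categories, targets):
    """Return {category: mean target value observed for that category}."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for cat, y in zip(categories, targets):
        totals[cat] += y
        counts[cat] += 1
    return {cat: totals[cat] / counts[cat] for cat in totals}


# Toy loan data (invented): 1 means the borrower defaulted.
cities = ["San Francisco"] * 5 + ["Seattle"] * 4
defaulted = [1, 1, 0, 0, 0, 1, 0, 0, 0]

encoding = target_encode(cities, defaulted)
# Replace the categorical column with its encoded numeric value.
encoded_column = [encoding[city] for city in cities]
```

Here two of the five San Francisco borrowers defaulted, so every "San Francisco" row is replaced by 0.4.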

Additionally, if you are working with a classification model, you could try grouping the categories by frequency. The most frequent categories may dominate the data, while the least frequent form a long tail with only a few samples each. By looking at the frequency distribution of the categories, you can find the drop-off point, leave the top X categories alone, and group the rest into an "other" bucket, giving you X+1 categories.

If you want to be more precise, keep the most frequent categories that together account for 90 percent of the cumulative frequency and put the rest into the "other" bucket.
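A minimal sketch of the cumulative-frequency cut-off, under the assumption that "90 percent" means 90 percent of all rows. The function names, the `coverage` parameter, and the "other" label are hypothetical.

```python
from collections import Counter


def top_categories_by_coverage(values, coverage=0.9):
    """Return the most frequent categories that together cover
    at least `coverage` of all rows."""
    counts = Counter(values)
    total = len(values)
    kept, running = set(), 0
    for cat, n in counts.most_common():
        if running >= coverage * total:
            break  # the categories kept so far already cover enough rows
        kept.add(cat)
        running += n
    return kept


def group_rare(values, keep, other="other"):
    """Replace every category outside `keep` with a single bucket label."""
    return [v if v in keep else other for v in values]


# Toy column: three common categories and a two-category long tail.
column = ["a"] * 50 + ["b"] * 30 + ["c"] * 15 + ["d"] * 3 + ["e"] * 2
keep = top_categories_by_coverage(column, coverage=0.9)
grouped = group_rare(column, keep)
```

In this toy column, "a", "b", and "c" cover 95 of the 100 rows, so "d" and "e" fall into the "other" bucket and the column ends up with four distinct values.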

Lastly, we could also try a Louvain community detection algorithm. Louvain is a method for extracting communities from large networks that, unlike K-means, does not require setting the number of clusters in advance.

Check out the Amazon machine learning engineer and data scientist interview guide.