Getting ready for a Machine Learning Engineer interview at Amazon? The Amazon Machine Learning Engineer interview spans 10 to 12 different question topics. In preparing for the interview:
Interview Query regularly analyzes interview experience data, and we’ve used that data to produce this guide, with sample interview questions and an overview of the Amazon Machine Learning Engineer interview.
Interviews at Amazon vary by role and team, but Machine Learning Engineer interviews typically follow a fairly standardized process across these question topics.
Amazon’s interview process consists of three stages: the initial screen, the technical interview, and then finally the onsite interview.
For the machine learning engineer role, the interview and the questions you receive will differ slightly depending on your career track: machine learning scientist, research scientist, applied scientist, or engineer working in some machine learning capacity.
The technical interview for the machine learning role at Amazon combines theoretical ML concepts and programming. The interviewer will ask you a series of questions on fundamental machine learning concepts, such as how different machine learning models work, the bias-variance tradeoff, and overfitting.
In the onsite interview, you should expect behavioral questions built around Amazon’s leadership principles. Make sure you know Amazon’s 14 leadership principles.
The onsite consists of five rounds of interviews. These interviews are composed of a mixture of behavioral, software engineering, and machine learning questions.
The interview panel will look like:
Make sure you review both machine learning and programming concepts. A machine learning engineer is more of a software engineer than a data scientist, so you should expect a number of coding questions in the technical rounds.
Practice for the Amazon Machine Learning Engineer interview with these recently asked interview questions.
Let’s say you have a categorical variable with thousands of distinct values. How would you encode it?
This depends on whether the problem is a regression or a classification model.
If it’s a regression model, one way would be to cluster the categories based on the response variable, working backwards from the target. You could sort the categories by their response values, and then split them into buckets based on how the response groups. A shallow decision tree fit on the response can do this splitting for you and reduce the number of categories.
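The sort-and-bucket idea above can be sketched in a few lines of plain Python. This is a simplified illustration, not the decision-tree approach itself: it ranks categories by mean response and cuts the ranking into equal-sized groups, whereas a shallow tree would choose the cut points to minimize error. The function name `bucket_by_response`, the `n_buckets` parameter, and the toy data are assumptions made for the example.

```python
from collections import defaultdict
from statistics import mean


def bucket_by_response(categories, responses, n_buckets=3):
    """Map each category to a bucket id based on its mean response."""
    by_cat = defaultdict(list)
    for cat, y in zip(categories, responses):
        by_cat[cat].append(y)

    # Rank categories by their mean response, lowest first.
    ranked = sorted(by_cat, key=lambda c: mean(by_cat[c]))

    # Cut the ranking into n_buckets roughly equal-sized groups.
    return {cat: i * n_buckets // len(ranked) for i, cat in enumerate(ranked)}


# Hypothetical toy data: four categories with two observations each.
cats = ["a", "a", "b", "b", "c", "c", "d", "d"]
ys = [1.0, 3.0, 10.0, 12.0, 2.0, 2.0, 11.0, 15.0]
buckets = bucket_by_response(cats, ys, n_buckets=2)
```

Here the low-response categories `a` and `c` land in bucket 0 and the high-response categories `b` and `d` land in bucket 1, so thousands of raw values would collapse into `n_buckets` levels.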
Another option for a regression model is target encoding: replace each category in the variable with the mean response for that category. Now you have one continuous feature instead of a bunch of categories.
For binary classification, you can target encode the column by finding the conditional probability of the response variable being one, given that the categorical column takes a particular value. Then replace the categorical value with this number. For example, if you have a categorical column of city when predicting loan defaults, and the probability that a person who lives in San Francisco defaults is 0.4, you would replace “San Francisco” with 0.4.
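As a minimal sketch of target encoding, the snippet below computes the mean target per category, which for a 0/1 target equals the conditional probability described above. The helper name `target_encode` and the city data are made up for illustration; in practice the mapping would normally be fit on training data only so the target does not leak into validation rows.

```python
from collections import defaultdict


def target_encode(categories, targets):
    """Return {category: mean target}; for 0/1 targets this is P(y=1 | category)."""
    total = defaultdict(float)
    count = defaultdict(int)
    for cat, y in zip(categories, targets):
        total[cat] += y
        count[cat] += 1
    return {cat: total[cat] / count[cat] for cat in total}


# Hypothetical loan data: 2 of 5 San Francisco borrowers default.
cities = ["San Francisco"] * 5 + ["Seattle"] * 4
defaulted = [1, 1, 0, 0, 0, 1, 0, 0, 0]

encoding = target_encode(cities, defaulted)
encoded_column = [encoding[c] for c in cities]
```

With this toy data, “San Francisco” is replaced by 0.4 and “Seattle” by 0.25. The same function covers the regression case, since it simply averages whatever numeric target it is given.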
Additionally, if you are working with a classification model, you could try grouping categories by their frequency. The most frequent categories may dominate the total make-up, while the least frequent form a long tail with a few samples each. By looking at the frequency distribution of the categories, you can find the drop-off point, leave the top X categories alone, and put the rest into an “other” bucket, giving you X+1 categories.
If you want to be more precise, keep the most frequent categories that together account for 90 percent of the cumulative frequency and dump the rest into the “other” bucket.
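The cumulative-frequency cutoff can be sketched as follows. The function name `keep_by_coverage`, its parameters, and the sample values are assumptions for the example; the 0.9 default mirrors the 90 percent threshold mentioned above.

```python
from collections import Counter


def keep_by_coverage(values, coverage=0.9, other="other"):
    """Keep the most frequent categories until they cover `coverage` of all rows."""
    counts = Counter(values)
    total = len(values)
    keep, running = set(), 0
    for cat, n in counts.most_common():
        if running / total >= coverage:
            break  # the categories kept so far already reach the threshold
        keep.add(cat)
        running += n
    # Everything outside the kept set is collapsed into one bucket.
    return [v if v in keep else other for v in values]


# Hypothetical column: three dominant categories and a short tail.
values = ["a"] * 50 + ["b"] * 30 + ["c"] * 12 + ["d"] * 5 + ["e"] * 3
grouped = keep_by_coverage(values, coverage=0.9)
```

In this example `a`, `b`, and `c` together cover 92 percent of the rows, so they are kept and the eight rows from `d` and `e` become “other”. For the fixed top-X variant, `Counter.most_common(X)` gives the categories to keep directly.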
Lastly, we could also try the Louvain community detection algorithm. Louvain is a method for extracting communities from large networks and, unlike K-means, it does not require a predetermined number of clusters.
See more Amazon machine learning interview questions from Interview Query:
Read interview experiences and salary posts in preparation for your next interview.