By Vincent Yabor
At Amazon, one of the most well-known, highly sought-after FAANG companies, you have the chance to use your skills to help improve one of the most successful businesses in the world. Acing the Amazon interview is no easy feat. In this article, we'll be focusing on getting you through the SQL portion of the Amazon interview.
Amazon’s interviews are not unlike other tech companies. The overall process is as follows:
- Initial Phone Screen
- Technical Interview
- Onsite Interview
The initial screen is a phone interview where your resume and the position will be discussed. This is a lower stakes interview, more of a 'getting to know you' than an actual test of your abilities.
During the technical interview, you should expect at least a few questions centered around the specifics of the job. For example, a data science technical screen consists of coding, algorithms, and machine learning. For most data science positions, at least one of these questions will be SQL-based.
Finally, the onsite interview will consist of several steps with interviewers. Here, you will be tested on more thorough and difficult concepts and coding questions. In addition, the interviewers will be looking for your understanding of Amazon's 14 leadership principles.
Let's dig into a few common roles where you might be asked SQL questions during an Amazon interview.
A business analyst is primarily responsible for helping a business achieve its goals by analyzing data to generate business-related insights. A business analyst will run analyses, maintain reports, and build dashboards. The following infographic compares the skills which are tested in the Amazon interview with skills tested in a typical business analyst interview.
As you can see, SQL is tested almost twice as heavily in the Amazon interview versus the average interview.
According to an Amazon business analyst from Glassdoor, the Amazon interview consists of a technical phone screen, a behavioral phone screen, and an exhaustive third round which consists of several interviews throughout the day. Expect to encounter SQL-related questions during the first and third rounds.
In particular, during the phone screen, one should expect basic SQL questions on
GROUP BY, and
JOIN. These typically start out simple in nature and are followed by basic queries. The final round of interviews includes advanced SQL questions using
JOIN, sub-queries, and complex queries.
You should be able to verbally explain the different kinds of joins as well as the difference between the
WHERE statement and
HAVING statement. Further, you should be familiar with deriving insights from given tables. This includes exploring the data by finding various metrics such as totals and metrics over a given time. As far as particular SQL questions go, Amazon really likes to ask their business analysts about joins.
Business Intelligence (BI) Engineer
At Amazon, BI engineers work in teams to form business decisions based on collected data with the intention of improving the customer experience. BI engineers work to improve Prime, Alexa, Twitch, and more. Here is a breakdown of skills tested in an Amazon BI engineer interview.
Compared to all other domains, SQL is by far the most thoroughly emphasized skill in an Amazon BI engineer interview.
The interview process involves two phone screenings and several onsite interviews. SQL queries have been known to come up during the first phone screen and the first two onsite interviews. People have reported seeing SQL questions centered around
GROUP BY, and aggregate functions.
Typically, the phone-based SQL questions involve describing aggregate functions and finding the approximate number of rows in an output table after a
JOIN. You shouldn't expect to write more advanced queries until the onsite interview.
An example question that one particular BI engineer faced is:
An Amazon data analyst performs important functions such as data interpretation, the building and maintaining of dashboards, system design for data collection and compiling, and more.
The specifics of the role can vary depending on what team you are working with. Data analysts who work with Twitch data may perform vastly different functions than those working with Alexa data. The graph below shows how Amazon tests data analyst skills like SQL compared to other companies.
The interview process for a data analyst position involves three levels: behavioral, analytical, and technical interviews. The behavioral and technical interviews are conducted over the phone.
For the technical interview, you must also share your screen with the interviewer as you answer some questions. To start, you have your typical conceptual SQL questions such as differentiating between
HAVING clauses. Later in the process, you can expect more query writing questions.
Much like the other positions, data scientists at Amazon perform different roles depending on which branch of the company their work is focused on. A data scientist works with data to produce models and solutions in machine learning and natural language applications in order to make predictions and provide forecasting insights. Here is the comparison graph of Amazon data scientists’ tested skills versus average data scientists’ tested skills.
Since data science as a profession is conceptually deep and diverse, more topics are tested overall. This leaves less room for SQL, although it is still a major skill in data science. Amazon tests you in SQL slightly more than the average employer.
Hopefully you are seeing a pattern with these interview questions. The interviewers always start out with basic SQL syntax. Then they hit you with a few solvable queries. Here is an example of one:
Given a table with three columns (id, category, value), with each id having three or fewer categories (price, size, color), find all ids for which two or more category values match.
To learn more about the Twitch Data Scientist Interview, check out this article on Interview Query.
Example Amazon SQL Interview Questions
- Types of Joins
Explain the difference between the types of joins.
Hint: Try thinking of the question as a series of Venn diagrams.
2. Purchase History
Consider the following two tables:
+------------+---------------+------------+------------+---------------+ |customer_id | purchase_date | product_id | unit_price | unit_purchased| +------------+---------------+------------+------------+---------------+ | 10001 | 2021-02-01 | 35525 | 25.00 | 4 | | 10002 | 2021-02-02 | 30321 | 10.00 | 8 | | 10003 | 2021-02-14 | 35525 | 25.00 | 3 | | 10004 | 2021-02-05 | 34826 | 300.00 | 1 | | 10005 | 2021-02-23 | 30321 | 10.00 | 5 | .....
+------------+-------------------+ |customer_id | registration_date | +------------+-------------------+ | 10001 | 2020-10-10 | | 10002 | 2021-01-31 | | 10003 | 2021-02-10 | | 10004 | 2021-01-15 | | 10005 | 2021-02-15 | .....
a. Write a query to retrieve data with unique customer ids that made over $100 in purchases during the first week of February.
b. Write a query to retrieve data with unique customer ids that made over $100 in purchases within ten days of registering.
3. Comments Histogram
Consider the following tables:
Write a SQL query to create a histogram of the number of comments per user in the month of January 2020. Assume bin buckets class intervals of one.
'users' table +-----------------+----------+ | columns | type | +-----------------+----------+ | id | integer | | name | string | | created_at | datetime | | neighborhood_id | integer | | mail | string | +-----------------+----------+ 'comments' table +------------+----------+ | columns | type | +------------+----------+ | user_id | integer | | body | text | | created_at | datetime | +------------+----------+
Here's a hint:
What does a histogram represent? In this case we're interested in using a histogram to represent the distribution of comments each user has made in January 2020.
A histogram with bin buckets of size one means that we can avoid the logical overhead of grouping frequencies into specific intervals.
For example, if we wanted a histogram of size five, we would have to run a SELECT statement like so:
SELECT CASE WHEN frequency BETWEEN 0 AND 5 THEN 5 WHEN frequency BETWEEN 5 AND 10 THEN 10 etc.
Test your abilities on this question on Interview Query.
Write a query to create a new table, named flight routes, that displays unique pairs of two locations.
Duplicate pairs from the flights table, such as Dallas to Seattle and Seattle to Dallas, should have one entry in the flight routes table.
'flights' table +----------------------+---------+ | column | type | +----------------------+---------+ | id | integer | | source_location | string | | destination_location | string | +----------------------+---------+
5. Cumulative Reset
Given a users table, write a query to get the cumulative number of new users added by day, with the total reset every month.
'users' table +------------+----------+ | columns | type | +------------+----------+ | id | int | | name | varchar | | created_at | datetime | +------------+----------+
Here's a hint:
This question first seems like it could be solved by just running a COUNT(*) and grouping by date. Or maybe it's just a regular cumulative distribution function?
But we have to notice that we are actually grouping by a specific interval of month and date. And that when the next month comes around, we want to the reset the count of the number of users.
Tangentially aside - the practical benefit for a query like this is that we can get a retention graph that compares the cumulative number of users from one month to another. If we have a goal to acquire 10% more users each month, how do we know if we're on track for this goal on February 15th without having the same number to compare it to for January 15th?
Therefore how can we make sure that the total amount of users in January 31st rolls over back to 0 on February 1st?
You can see the solution to this question on Interview Query.
Regardless of the position, SQL is a common feature of the Amazon interview. It’s best to be familiar with basic syntax all the way through complex queries and sub-queries. Doing well in this portion of the interview could make the difference between getting the job or being asked to look elsewhere.
The interview process can feel long and intimidating for many. But it doesn’t need to be like that! We offer the guidance and tips that are needed to ensure you ace your interview with little stress and get started with your dream job.