Written by Austin Gorsuch


If you’re applying for a job in the field of data science, data engineering, or business analytics, there’s a good chance that you’ll have to work with SQL at one point or another. Knowing how to manage and query your own data is incredibly important in these fields. If you’re new to SQL, or need to brush off a rusty set of skills to address the current marketplace, it can be difficult to figure out what you need to know about SQL in order to be competitive in your field.

That being said, here are the basics:

Invented in 1974, SQL is a programming language designed to interface with Relational Database Management Systems (RDBMS). An RDBMS structures data into tables made up of rows (called “records”) and columns (called “fields”), which can then be queried, combined, and manipulated in various ways to achieve a desired outcome.

MySQL, Microsoft SQL Server, Oracle, Amazon Redshift, and IBM DB2 are all examples of RDBMS-based software. The chances that you will encounter one (or more) of these technologies in the course of your career or interview is pretty high. So, let’s make sure you’re prepared.

Subsets of SQL

There are three different subsets of SQL:

  • Data Definition Language (or DDL), which allows you to CREATE, ALTER, or DROP tables entirely
  • Data Manipulation Language (or DML), which allows you to SELECT or UPDATE existing data, or INSERT new data into an existing table
  • Data Control Language (or DCL), which allows you to GRANT or REVOKE access to a database

The most commonly used subset of SQL is Data Manipulation Language. The most commonly used command in SQL is SELECT.  For the most part, unless you are a data engineer, you will use SELECT to make your queries and perform basic manipulations on a database without having to worry about the actual creation of a database or the task of managing permissions.

Basic Queries

Unfortunately, the ability to SELECT all the data from a given table won’t get us very far in SQL. Luckily, SQL has lots of built-in tools to help us narrow and structure our queries so that we can get only the data that is relevant to our purposes. Such tools include the following four clauses:

  • WHERE (used to filter records based on specific criteria)
  • GROUP BY (used to group records with identical data)
  • ORDER BY (used to sort records according to a specific field in ASCending or DESCending order)
  • HAVING (used in combination with the GROUP BY clause in order to filter records).

It’s important to familiarize yourself with these basic querying clauses and understand when it’s appropriate to use them in order to structure your queries to return useful results.

Sample Question:

  • What is the difference between the WHERE clause and the HAVING clause?

Looking for more advanced SQL problems? Check out our Ultimate SQL Guide for more practice.

Common Operators

SQL also contains three types of basic operators that allow you to further narrow down your queries: arithmetic operators, comparison operators, and logical operators. Arithmetic operators should be familiar to you if you have basic knowledge of mathematics.

  • + (for addition)
  • - (for subtraction)
  • * (for multiplication)
  • / (for division)

Comparison operators should also be familiar. They include:

  • = (equal to)
  • != (not equal)
  • <> (not equal)
  • < (less than)
  • > (greater than)
  • <= (less than or equal to)
  • >= (greater than or equal to)
  • !< (not less than)
  • !> (not greater than)

Logical operators can be a bit trickier to get a handle on, but SQL is prized for its intuitive syntax, which is closer to ordinary speech compared to other programming languages. Logical operators include:

  • ISNULL
  • ALL
  • ANY
  • AND
  • BETWEEN
  • NOT
  • LIKE
  • IN
  • UNIQUE

Since these operators are highly contextual, it’s difficult to explain them in abstract terms, but if you familiarize yourself with all of them, you’ll be on solid footing when it comes to your basic SQL interview.

Sample Question:

+---------+--------------+-----------------+
| user_id | date_visited | purchase_amount |
+---------+--------------+-----------------+
| 10001   | 2021-01-20   | 30.00           |
| 10002   | 2021-01-25   | 300.00          |
| 10003   | 2021-02-01   | 3000.00         |
| 10004   | 2021-01-14   | 15.00           |
| 10005   | 2021-01-13   | 150.00          |
| 10006   | 2021-01-01   | 1500.00         |
.....
  • Given the above table, how would you structure a query to return all users who visited the site on or before January 15th?

Set Operators

Sometimes, it is useful to be able to compare the results of two separate SELECT queries, often across different tables. In these cases, SQL has built-in set operators that help us compare the results of our queries with as little effort as possible. These set operators include:

  • UNION (which combines the results of two separate queries without duplicates)
  • UNION ALL (which combines the results of two separate queries, including duplicates)
  • MINUS (which returns the result of one query with the common results of the second query “subtracted” from it)
  • INTERSECT (which returns only the results that appeared in both queries)  

It’s important to familiarize yourself with these operators, but even more important to develop a sense of which operator is the appropriate solution for a given situation.

Sample Question:

  • What is the difference between UNION and UNION ALL? When might you prefer using one over the other?

Joins

Image from Google Developers

Often, rather than querying two separate tables with a set operator, we’ll want to actually merge the two tables into one for ease of comparison. In this case, we’ll find ourselves performing a JOIN. There are four commonly used JOINs in SQL:

  • INNER JOIN (which returns records with matching values from both tables)
  • LEFT JOIN (which returns all the records of the first table and any matching records from the second table)
  • RIGHT JOIN (which returns all the records from the second table and any matching records from the first table; the “opposite” of a LEFT JOIN)
  • FULL JOIN (which returns all records with matching values from either the first or the second table)

It’s important to be able to know, when looking at two different tables, what sort of data becomes available when you perform a given JOIN. Developing a sense of when each JOIN is appropriate, and what sort of analysis each makes available, will help you when using your SQL toolkit in a real world setting.  

Sample Question:

employees    
+---------------+---------+     
| id            | int     |
| first_name    | varchar |     
| last_name     | varchar |     
| salary        | int     |     
| department_id | int     |--+  
+---------------+---------+  |  
                             |
departments                  |
+---------------+---------+  |  
| id            | int     |<-+  
| name          | varchar |     
+---------------+---------+   
Write a SQL query to select the 2nd highest salary in the engineering department.
If more than one person shares the highest salary, the query should select the next highest salary.

Here's a hint:

First, we need the name of the department to be associated with each employee in the employees table, to understand which department each employee is a part of.
The “department_id” field in the employees table is associated with the “id” field in the departments table. We call the “department_id” a foreign key because it is a column that references the primary key of another table, which in this case is the “id” field in the departments table.
Based on this common field, we can join both tables, using INNER JOIN, to associate the name of the department name to the employees that are a part of those departments.

Try solving this problem in our interactive SQL editor on Interview Query.

Aggregate Functions

Sometimes, rather than retrieving an individual record or set of records, you’ll be asked to retrieve specific features of a table’s given field. In this case, you’ll want to apply one of SQL’s built-in aggregate functions. They’re called aggregate functions because they deal with data in “aggregate,” or as a group, rather than individually. Examples of aggregate functions include:

  • COUNT (or the total number of records in a specific table)
  • COUNT DISTINCT (or the total number of unique records in a specific table)
  • AVG (or the average of a set of values)
  • MIN (or the lowest of a set of values)
  • MAX (or the highest of a set of values)
  • SUM (or the sum of a set of values)
  • FIRST (or the first of a set of values)
  • LAST (or the last of a set of values)

Being able to recognize situations where aggregate functions can help you narrow down a query is a useful skill that comes with time and practice.

Sample Question:

  • You're given a table that represents search results from searches on Facebook. The query column is the search term, position column represents each position the search result came in, and the rating column represents the human rating of the search result from 1 to 5 where 5 is high relevance and 1 is low relevance.

1. Write a query to compute a metric to measure the quality of the search results for each query.

2. You want to be able to compute a metric that measures the precision of the ranking system based on position. For example, if the results for dog and cat are....

...we would rank 'cat' as having a better search result ranking precision than 'dog' based on the correct sorting by rating.

Write a query to create a metric that can validate and rank the queries by their search result precision.

Search Ratings — Interview Query sql problem
`search_results` table columntype queryvarchar result_idinteger positioninteger ratinginteger You&#39;re given a table that represents
Try it on Interview Query

The intention of this article is to outline in broad strokes some of the basic knowledge required to pass a SQL interview. But the reality is that abstract knowledge can only go so far. It doesn’t matter if you know how to perform an INNER JOIN if you don’t know when you should, or the difference between a WHERE and a HAVING clause without knowing how to apply that information to a real world situation.

For more practice, visit us at Interview Query to try our built-in SQL editor and real interview questions from top companies, like Google, Facebook, Netflix, and more.

Example Easy SQL Interview Questions

+---------+-------------+-------------+
| user_id | date_joined | date_posted |
+---------+-------------+-------------+
| 10001   | 2021-01-01  | 2021-01-01  |
| 10002   | 2021-01-01  | 2021-01-05  |
| 10003   | 2021-01-02  | 2021-01-02  |
| 10004   | 2021-01-01  | 2021-01-20  |
..... 
  • Given the above table, what command would you use to add a new record to a database?
  • Given the above table, how would you structure a query to retrieve all of the entries from a single field?
  • Given the above table, what command would you use to eliminate the existing table?
  • Given the above table, how would you structure a query to return only users that joined on January 1st?
  • Given the above table, how would you structure a query to return the last date any user joined the website?
+---------+--------------+-----------------+
| user_id | date_visited | purchase_amount |
+---------+--------------+-----------------+
| 10001   | 2021-01-20   | 30.00           |
| 10002   | 2021-01-25   | 300.00          |
| 10003   | 2021-02-01   | 3000.00         |
| 10004   | 2021-01-14   | 15.00           |
| 10005   | 2021-01-13   | 150.00          |
| 10006   | 2021-01-01   | 1500.00         |
.....
  • Given the above table, how would you structure a query to return all users who visited after a certain date AND spent more than 100 dollars?
  • Given the above table, how would you structure a query to return all users who spent exactly 15 dollars on the website?
  • Given the above table, how would you structure a query to determine the total amount spent by all visitors to your website?
  • Given the above table, how would you structure a query to determine the highest amount spent by any individual visitor to your website? The lowest amount?
  • What is the difference between COUNT and COUNT DISTINCT? When might you want to use one and not the other?
'users' table

+-------------+----------+
| column      | type     |
+-------------+----------+
| user_id     | integer  |
| name        | varchar  |
| date_joined | datetime |
+-------------+----------+

'subscribers' table

+-----------------+----------+
| column          | type     |
+-----------------+----------+
| user_id         | integer  |
| name            | varchar  |
| date_subscribed | datetime |
+-----------------+----------+
  • Given the above tables, how would you structure a query to return the list of names of people who are users but not subscribers?
  • Given the above tables, how would you structure a query to return the list of names of people who are both users and subscribers?
  • Given the above tables, how would you structure a query to return matching records from the users table while preserving ALL the records from the subscribers table?
  • Given the above tables, how would you structure a query to return records with matching values from either table?
  • What’s the difference between a LEFT JOIN and an INNER JOIN? When might you prefer one to the other?