VLOOKUP in SQL

VLOOKUP in SQL

Throughout the course of any data-based career, upskilling becomes a necessity but isn’t the easiest of tasks. When transitioning from tools like Excel to more complex and less visual tools like SQL, learning to recognize familiar patterns can help translate your current skills into new techniques and technologies.

One of the most-used functions by Excel specialists is the lookup function VLOOKUP. In SQL, however, there’s not quite a direct equivalent. How are you then supposed to look up values?

In this article, we’ll explore other options and discuss why VLOOKUP isn’t possible– or even a necessity– in SQL.

VLOOKUP Equivalent in SQL

Let’s start by breaking down the logic of VLOOKUP and translating these operations to SQL.

Your typical VLOOKUP function has the following arguments:

  • lookup_value: The value you want to search for in the leftmost column of the range or table.
  • table_array: The range of cells containing the data you want to search.
  • col_index_num: The index of where we’ll search for our value.
  • range_lookup: (optional) If set to TRUE or if not specified, an approximate match will be performed. If you set it to FALSE, an exact match will be required.

Starting with just the required values, let’s look at an example:

VLOOKUP Equivalent in SQL

Our VLOOKUP function can be expressed as the following:

“In the table A3:C8, find the name (represented by col_index_num = 2) of the employee with EmpID = 3(from the value of the A6 cell).”

This would return the string “Arianna”.

? Note: in VLOOKUP, lookup_value can either be a cell reference or a value. For example, we could change it to VLOOKUP(4, A3:C8, 2), and it will return “Racquel”.

Using SQL, solving this problem is a bit simpler. Let’s consider the following table:

Employees

EmpID Fname Age
1 Erick 24
2 Rosa 26
3 Arianna 30
4 Racquel 32
5 Farhan 19

Considering “Arianna” for our example again, we can use the following query:

SELECT Fname from employees where EmpId = 3;

Unlike Excel, there are no well-defined positions in SQL tables, so it’s not necessary to specify the table array or the cell reference. EmpID won’t necessarily be the leftmost column, and “Farhan” may not be the last Fname in the column. Instead, we’ll deal with direct values.

Let’s break down this query:

  • SELECT determines what data to return. In this query, we’re returning a Fname.
  • FROM determines what table to use. In Excel, this may be a sheet or a table_array. Here, we’re referring to Fname from the Employees table.
  • WHERE is a conditional statement. It’s more flexible than VLOOKUP but essentially returns all the rows that make the condition return TRUE.

In VLOOKUP, the condition is that the lookup_value of the data at col_index_num should be the same. Given the table above, the lookup value is 3, so we should retrieve the Fname with an EmpID equal to 3. The WHERE clause here does the same.

Dealing with Multiple Sheets or Tables

In Excel, VLOOKUP is commonly used to retrieve data from different sheets. With SQL, we instead use the JOIN operation, which allows you to access and combine data from multiple tables.

The JOIN Operation

Joins can be a challenging concept to grasp if you’re new to SQL. Put into simple terms, JOINs allow you to access data from other tables.

From the JOIN section in Interview Query’s SQL learning path, JOINs can be described as:

An operation that matches rows of two different tables and appends them.

Just like VLOOKUP and WHERE, JOIN also has a matching process that allows it to return the appropriate data (or record) based on a condition.

Let’s look at the following for an example:

Employees

A B C D
1 EmpID Fname Age Department_ID
2 1 Erick 24 3
3 2 Rosa 26 3
4 3 Arianna 30 1
5 4 Racquel 32 2
6 5 Farhan 19 1

Departments

A B
1 Department_ID Department_Name
2 1 Human Resources
3 2 Accounting
4 3 Analytics

Suppose that, given an employee name, you want to retrieve their department. How would you do this in VLOOKUP and SQL?

VLOOKUP Solution

Using VLOOKUP requires a two-step process. We’d need to find the Department_ID of the employee from the Employees table first and then use this to find the department name in the Departments table.

Our full VLOOKUP solution should look like this:

VLOOKUP(VLOOKUP("name", Employees!B2:D6, 3, FALSE), Departments!A2:B4, 2, FALSE)

Our solution starts with looking up the Department_ID. Then, we’ll conduct a secondary VLOOKUP to find the corresponding department name using the returned id.

Step 1:

VLOOKUP("Farhan", Employees!B2:D6, 3, FALSE) ' returns Farhan's department_id, '1'

Step 2:

VLOOKUP(1, Departments!A2:B4, 2, FALSE) ' the '1' here is from the value in STEP 1.

This would return “Human Resources”.

SQL Solution

SQL Solution

To solve this problem using SQL, we’d use a JOIN operation between the Employees table and the Departments table based on a condition. In this case, our condition is where the Department_ID of the two tables matches.

Our solution would then look like the following:

SELECT Departments.Department_Name
FROM Employees
JOIN Departments
ON Employees.Department_ID = Departments.Department_ID
WHERE Employees.Fname = 'name';

Broken down into sections, we have:

  • SELECT: We’re selecting the department_name value from the Departments table. Unlike our first SQL example above, we need to be explicit about which table we’re taking our values from. While the SQL server might be able to infer this, it’s good practice to include them in the query for readability.
  • FROM: While we’re working with two tables in this query, there’s only one table specified in the FROM clause because our base query is deriving data from the Employees table. In this instance, we’re getting the department name from an employee name. Fname is a field of the table Employees, so our FROM clause is based on this table. FROM can take in more than one table, but, in this case, it’s best to stick to one for efficiency.
  • JOIN: This operation joins the table Departments to the Employee table.

Before this point, our query was operating on the following set with the FROM Employees clause:

EmpID Fname Age Department_ID
1 Erick 24 3
2 Rosa 26 3
3 Arianna 30 1
4 Racquel 32 2
5 Farhan 19 1

Now that we’ve added the JOIN operation, our set now includes the following:

EmpID Fname Age Department_ID Department_ID Department_Name
1 Erick 24 3 1 Human Resources
1 Erick 24 3 2 Accounting
1 Erick 24 3 3 Analytics
2 Rosa 26 3 1 Human Resources
2 Rosa 26 3 2 Accounting
2 Rosa 26 3 3 Analytics
3 Arianna 30 1 1 Human Resources
3 Arianna 30 1 2 Accounting
3 Arianna 30 1 3 Analytics
4 Racquel 32 2 1 Human Resources
4 Racquel 32 2 2 Accounting
4 Racquel 32 2 3 Analytics
5 Farhan 19 1 1 Human Resources
5 Farhan 19 1 2 Accounting
5 Farhan 19 1 3 Analytics

This set is the Cartesian product of the two tables. A Cartesian product of two sets is the combination of all possible pairs formed by taking one element from each set.

However, our solution isn’t quite finished at this point. This is when we perform the matching process, using the ON clause.

ON Employees.Department_ID = Departments.Department_ID

This clause eliminates all the rows in the Cartesian product that don’t meet the match requirement, which is for the department id of the first set to match with the second set.

Using the last column as an example, Employees.Department_ID is ‘1’, while the Departments.Department_ID is ‘3’. This would not be included in the joined set.

EmpID Fname Age Department_ID Department_ID Department_Name
5 Farhan 19 1 3 Analytics

After applying the ON clause, we should get the following set:

EmpID Fname Age Department_ID Department_ID Department_Name
1 Erick 24 3 3 Analytics
2 Rosa 26 3 3 Analytics
3 Arianna 30 1 1 Human Resources
4 Racquel 32 2 2 Accounting
5 Farhan 19 1 1 Human Resources

With all of the unnecessary data eliminated, we can now clearly see the relationship between the employees and their departments.

The final WHERE clause narrows down the set to a specific employee name. If we’re using **"Farhan"** as an example, this would select the row with the statement WHERE employees.Fname = "Farhan".

EmpID Fname Age Department_ID Department_ID Department_Name
5 Farhan 19 1 1 Human Resources

With our SELECT clause, only the department name (”Human Resources”) will be returned, just like VLOOKUP.

Conclusion

While Excel’s VLOOKUP function and SQL’s JOIN operation are both capable of retrieving and matching data across multiple tables or sheets, their syntax and usage vary depending on the context. SQL provides more powerful and flexible operations, while Excel offers a more intuitive interface for non-programmers. To continue your SQL journey, learn more using our SQL learning path.

Learn SQL with Interview Query

The best way to get better in SQL is practice. Interview Query offers a variety of SQL learning resources to help you practice and improve your SQL skills, including: