Throughout the course of any data-based career, upskilling becomes a necessity but isn’t the easiest of tasks. When transitioning from tools like Excel to more complex and less visual tools like SQL, learning to recognize familiar patterns can help translate your current skills into new techniques and technologies.
One of the most-used functions by Excel specialists is the lookup function VLOOKUP. In SQL, however, there’s not quite a direct equivalent. How are you then supposed to look up values?
In this article, we’ll explore other options and discuss why VLOOKUP isn’t possible– or even a necessity– in SQL.
Let’s start by breaking down the logic of VLOOKUP and translating these operations to SQL.
Your typical VLOOKUP function has the following arguments:
lookup_value: The value you want to search for in the leftmost column of the range or table.
table_array: The range of cells containing the data you want to search.
col_index_num: The index of where we’ll search for our value.
range_lookup: (optional) If set to TRUE or if not specified, an approximate match will be performed. If you set it to FALSE, an exact match will be required.
Starting with just the required values, let’s look at an example:
Our VLOOKUP function can be expressed as the following:
“In the table
A3:C8, find the name (represented by
col_index_num = 2) of the employee with
EmpID = 3(from the value of the
This would return the string “Arianna”.
💡 Note: in VLOOKUP,
lookup_value can either be a cell reference or a value. For example, we could change it to
VLOOKUP(4, A3:C8, 2), and it will return “Racquel”.
Using SQL, solving this problem is a bit simpler. Let’s consider the following table:
Considering “Arianna” for our example again, we can use the following query:
SELECT Fname from employees where EmpId = 3;
Unlike Excel, there are no well-defined positions in SQL tables, so it’s not necessary to specify the table array or the cell reference.
EmpID won’t necessarily be the leftmost column, and “Farhan” may not be the last
Fname in the column. Instead, we’ll deal with direct values.
Let’s break down this query:
table_array. Here, we’re referring to
Fnamefrom the Employees table.
In VLOOKUP, the condition is that the
lookup_value of the data at
col_index_num should be the same. Given the table above, the lookup value is
3, so we should retrieve the
Fname with an
EmpID equal to 3. The WHERE clause here does the same.
In Excel, VLOOKUP is commonly used to retrieve data from different sheets. With SQL, we instead use the
JOIN operation, which allows you to access and combine data from multiple tables.
Joins can be a challenging concept to grasp if you’re new to SQL. Put into simple terms, JOINs allow you to access data from other tables.
An operation that matches rows of two different tables and appends them.
JOIN also has a matching process that allows it to return the appropriate data (or record) based on a condition.
Let’s look at the following for an example:
Suppose that, given an employee name, you want to retrieve their department. How would you do this in VLOOKUP and SQL?
Using VLOOKUP requires a two-step process. We’d need to find the
Department_ID of the employee from the Employees table first and then use this to find the department name in the Departments table.
Our full VLOOKUP solution should look like this:
VLOOKUP(VLOOKUP("name", Employees!B2:D6, 3, FALSE), Departments!A2:B4, 2, FALSE)
Our solution starts with looking up the
Department_ID. Then, we’ll conduct a secondary VLOOKUP to find the corresponding department name using the returned id.
VLOOKUP("Farhan", Employees!B2:D6, 3, FALSE) ' returns Farhan's department_id, '1'
VLOOKUP(1, Departments!A2:B4, 2, FALSE) ' the '1' here is from the value in STEP 1.
This would return “Human Resources”.
To solve this problem using SQL, we’d use a JOIN operation between the Employees table and the Departments table based on a condition. In this case, our condition is where the
Department_ID of the two tables matches.
Our solution would then look like the following:
SELECT Departments.Department_Name FROM Employees JOIN Departments ON Employees.Department_ID = Departments.Department_ID WHERE Employees.Fname = 'name';
Broken down into sections, we have:
department_namevalue from the
Departmentstable. Unlike our first SQL example above, we need to be explicit about which table we’re taking our values from. While the SQL server might be able to infer this, it’s good practice to include them in the query for readability.
Employeestable. In this instance, we’re getting the department name from an employee name.
Fnameis a field of the table
Employees, so our FROM clause is based on this table.
FROMcan take in more than one table, but, in this case, it’s best to stick to one for efficiency.
Before this point, our query was operating on the following set with the
FROM Employees clause:
Now that we’ve added the JOIN operation, our set now includes the following:
This set is the Cartesian product of the two tables. A Cartesian product of two sets is the combination of all possible pairs formed by taking one element from each set.
However, our solution isn’t quite finished at this point. This is when we perform the matching process, using the ON clause.
ON Employees.Department_ID = Departments.Department_ID
This clause eliminates all the rows in the Cartesian product that don’t meet the match requirement, which is for the department id of the first set to match with the second set.
Using the last column as an example,
Employees.Department_ID is ‘1’, while the
Departments.Department_ID is ‘3’. This would not be included in the joined set.
After applying the ON clause, we should get the following set:
With all of the unnecessary data eliminated, we can now clearly see the relationship between the employees and their departments.
The final WHERE clause narrows down the set to a specific employee name. If we’re using
**"Farhan"** as an example, this would select the row with the statement
WHERE employees.Fname = "Farhan".
With our SELECT clause, only the department name (”Human Resources”) will be returned, just like VLOOKUP.
While Excel’s VLOOKUP function and SQL’s JOIN operation are both capable of retrieving and matching data across multiple tables or sheets, their syntax and usage vary depending on the context. SQL provides more powerful and flexible operations, while Excel offers a more intuitive interface for non-programmers. To continue your SQL journey, learn more using our SQL learning path.
The best way to get better in SQL is practice. Interview Query offers a variety of SQL learning resources to help you practice and improve your SQL skills, including: