It’s no secret that businesses today are driven by data. Companies all over the world are collecting their customers’ data, analyzing that data, and using the insights derived from it to drive their business decisions. These decisions include what their product should look like, how that product should evolve over time, and how the company that makes the product should be structured.
But how does this data get from the consumer to businesses, or from the heart of a business to its head? How does data transform over time from raw input to actionable insight? That’s where database design and architecture enters the picture.
A professional involved in database design takes a database from its conceptual stages to the actual on-the-ground processing of vast amounts of data.
However, just because you don’t necessarily expect database design to be a dominant feature of your profession doesn’t mean that it won’t show up in your interview. We’ve broken down the database design interview into its most basic components: the basic concepts, the skills and qualifications you’ll be expected to have, and the types of database design questions you’re most likely to see during the interview.
What is Database Design?
In the simplest terms, database design is the process of creating a database from scratch.
Database design interview questions show up when applying for a role as a business intelligence engineer, data engineer, research scientist, or data scientist.
A professional involved in database design determines what data is going to be stored and, more importantly, where and how. They divide raw data into tables, define relationships between different entities in the data, and normalize data so that the end user (typically a business) can easily track, understand, and derive insights from the patterns contained within the data. A professional doing database design is also frequently concerned with concepts like:
- Database security
- Data replication
- Availability of data
- Database partitioning
- Data backup
A professional involved with database design therefore has a lot of responsibility as the person charged with all of the above; if she does her job poorly, the company has everything (more specifically, all of their data) to lose. Accordingly, let’s take a look at some of the skills and qualifications that accompany the role.
Database Design Skills and Qualifications
In general, someone performing database design should be familiar with SQL or a similar language for database management. For the conceptual stage of database design, where relationships are being mapped out between entities in a database, a professional involved in database design should be familiar with one of the following:
- SQL Server Database Modeler
After the conceptual design of the database has been completed, someone performing database design should be familiar with one of the following database management systems (DBMS) to implement the physical database:
- Oracle DB
- IBM DB2
- Microsoft SQL Server
You’ll recognize that the core skill, here, is SQL, since it’s the primary programming language a database design professional will use to bring the concept of a database into reality as a full-fledged physical database. A solid understanding of Data Definition Language, or DDL, is crucial. This technical skill, more than any other, is likely to be emphasized during the interview process, and sets the qualified candidate apart from their cohort.
Database Design Case Study Interview
The most common type of database design question that you’re likely to encounter is a case study question. This kind of question begins with a problem statement: how would you build a database for X feature on Y app? For instance, you might be asked how you would design a schema for song information on iTunes.
How do you handle a question like this? First, you want to figure out exactly what they’re asking.
What information, out of all possible information, do we actually care about when it comes to storing song information on iTunes? Do we care about albums, artist, song, song length, genre, or all of the above? Why? What are we looking to gain from storing song information? Are we collecting data for business purposes? Trying to build a recommendation engine? How much data do we have?
By clarifying these questions, we narrow the field of concerns we’re responsible for addressing while giving our answer. This allows us to approach the question with real depth instead of treating a great variety of possible subjects lightly. Once we have our bedrock of actionable information, we can build out the hypothetical schema based on fundamentals of database design.
You’re likely to encounter a case study question like this during the onsite interview section of the interview process, after you’ve already passed the phone and technical screens.
Other Types of Database Design Interview Questions
Explain X concept / What is the difference between X and X?
This type of interview question asks you to demonstrate your understanding of basic database design concepts, such as entity relationship modeling, the normalization forms, and similar concepts.
You may be asked about the differences between logical, conceptual, and physical database models, or the advantages of normalization compared to denormalization. Whatever the question, you’ll want to keep your answers short and simple. Demonstrate your understanding of the topic and then allow the interviewer to move on to the next question.
Scenario Driven Questions
A scenario driven question follows the form of:
- “What would you do if confronted with X?”
- “Tell me about a time when…”
With questions like these, an interviewer is often trying to assess your approach to solving problems in either a hypothetical or a past real world scenario.
Unlike technical questions, the goal here isn’t to grade you on a “right” or “wrong” answer, but to get a sense of the process you use to deal with problems without immediate solutions. The most important thing to remember when answering this sort of question is to be clear and comprehensive in your description of your approach.
For example, if an interviewer asks you to tell them about a time when you had to implement a database from scratch, start by talking about developing a logical model of the database. Then move on to the process of transforming the logical model into a conceptual model, charting relationships through an entity relationship diagram. Finally, tell the interviewer how you were able to take that conceptual model to production as a physical model.
At each step, outline your thought process, possible alternatives that you decided against (and why), and the outcomes of that process. Bonus points if you can notice places in the process where you could improve in the future.
As with any technical role, you will eventually come to a point where you are expected to demonstrate proficiency in your field through coding exercises. This will be especially true during the technical and onsite portions of the interview. For database design, that means that you’ll be expected to know your way around SQL, especially Data Definition Language.
An interviewer may also throw some querying questions your way to make sure that you’re familiar enough with SQL to understand the needs of the people who will eventually be using the database(s) you design.
In the database design interview, you may be asked more specialized questions that have to do with systems design and be expected to respond with an answer that takes the interviewer through the steps you’d follow from the conceptual level of the problem to its physical solution.
A system design question that you might encounter at a company like Dropbox is:
Click Data Schema
How would you create a schema to represent client click data on the web?
Here’s a hint:
These types of questions are more architecture based and are generally given to test experience within developing databases, setting up architectures, and in this case, representing client side tracking in the form of clicks.
What exactly does click data on the web mean? Any form of button clicks, scrolls, or action at all as an interaction with the client interface, in this case desktop, would be somehow represented into a schema form for the end user to query. This does not include client views however.
A simple but effective design schema would be to first represent each action with a specific label. In this case assigning each click event a name or label describing its specific action.
Try your hand at this real interview question on Interview Query.
Now, take a look at this database design interview question from Amazon:
Modifying a billion rows
Let's say you have a table with a billion rows.
How would you add a column inserting data from the original source without affecting the user experience?
Here’s a hint:
Before jumping into the question we should remember to clarify a few details that we can potentially get out of the interviewer. It helps to ask questions to understand and show that you can think holistically about the problem. Rushing too fast into a solution is a red flag for many interviewers.
Given the problem statement, let's clarify an important question.
1. What is the potential impact of downtime?
We always have to remember to get more context out of the question. In this scenario when we are talking about a table with a billion rows, this could be a table that is powering an entire company in which downtime would affect thousands of dollars in sales or could just be an offline analytics events table that would cause impact to internal employees.
It's crucial to then mention to the interviewer and assess, what is the potential effects of downtime for seconds, minutes, and hours. Figuring out the impact is pertinent then to determining our strategy going forward.
Be prepared for your next interview by practicing this question and others like it on Interview Query.
Some other questions that you might encounter during the database design interview process are:
- Given 2 tables of size N with the same primary key, how large would the resulting dataset be after a full join, left-join, right-join, and inner-join?
- How would you architect a distributed solution to convert a billion JPEG images to PNG?
- Write an ETL taking data from an API that performs X transformations and results in CSV files.
- Create an ETL solution for sample date data.
- Build out the architecture to go from a stream of user-level event data on their devices to time-aggregatable metrics in a dashboard.
- Explain how you would build a system to track changes in a database.
Database design is a vital part of our data-driven world and opportunities in that field will only continue to grow as we move into the future. Whatever your experience in database design, being as prepared as possible for your next interview can only help your prospects as a potential hire. On Interview Query, you can practice on real interview questions from companies like Microsoft, Amazon, Google, and Facebook to keep yourself as sharp as possible for that next great opportunity.
Be prepared for your next interview. Try Interview Query today.