Here's a joke: what do you call a data analyst that lives in California? A data scientist.
While maybe the unsexy data side-kick, the role of the data analyst has been growing steadily each year and taking on more and more of the responsibilities of a data scientist. A data analyst's typical responsibilities include assigning numerical values to crucial business functions to access and compare their performance over time.
In essence, a data analyst will analyze, visualize, and present data. Each company has their own databases that a data analyst will query from, as well as dash-boarding tools like Tableau and Looker to present and scale findings for mass consumption by different members of the organization.
How do these responsibilities translate to the interview? Let's take a look.
What shows up on the data analyst interview?
Data analyst roles are in high demand. A study by IBM highlighted that there would be more than 2.7 million employment opportunities for professionals with data skills by the end of 2020, with around 40% of them requiring a higher degree (Ph.D. or Master's degree). The average annual salary of entry-level data analyst is in the $75,000 range, while senior data analysts receive salaries exceeding $140,000.
At Interview Query, we analyzed the data from over 10K data science and data analyst interviews try to understand exactly how the data analyst interview skillsets and requirements have changed over the past few years.
Here's what we found from data analyst interviews aggregated across 450+ different tech companies.
As you can see, the data analyst interview has a huge emphasis on SQL. In fact we've found that for most data analyst roles, a SQL or data analysis / processing type question was asked 85% of the time.
You're also likely to encounter interview questions on:
- Algorithms and Python
- Product case and intuition
- Statistics and A/B testing
- Machine Learning
Here's how these question types are translated into roles and responsibilities.
Example Responsibilities and Requirements
A data analyst must know how to use data to enable a company to make strategic and critical decisions. This is done not only through sound statistical analysis but also presentation and communication to external stakeholders.
Experienced data analysts are required to have a combination of technical and leadership skills. Technical skills include knowledge on statistics, analytics, experimental design, SQL, R, or Python; machine learning (ML), data modeling and warehousing, and dashboard automation and reporting; data visualization tools such as Qlik or Tableau; and spreadsheet tools such as Google Sheets or Microsoft Excel.
Leadership skills help data analysts deploy strategic and well-informed decision making and assist with problem solving in a team-based environment. For instance, program managers may depend on data analysts to track the most critical factors of their projects, identify problems, and find different ways to address and solve a problem.
The typical responsibilities of a data analyst include the following:
- Data cleansing and preparation (this accounts for approximately 80% of the overall work of data analysts).
- Designing and supporting databases and data systems, including fixing bugs and finding solutions to data-related issues.
- Mining data from data sources and structuring the data in a format easily understandable by either human or machine.
- Using statistical tools to understand data sets, focusing on the trends and patterns that could be more valuable for carrying decision and predictive analytics via data visualization tools.
- Working with engineers, developers, and company leaders to seek opportunities for process/system improvements, develop policies for data governance, and suggest system modifications.
- Preparing dashboards and reports for executive leadership that convey trends, patterns, and predictions using appropriate data.
- Creating accurate documentation that allows the senior management to understand the data analysis process and duplicate/replicate it if required.
- Data engineering subtasks such as defining data requirements, collecting, labeling, inspecting, cleaning, augmenting, and moving data.
- Business analysis subtasks such as building data visualizations, dashboards for business intelligence, presenting technical work to clients or colleagues, translating statistics into actionable business insights, running A/B tests, and analyzing datasets.
In business, there are many types of data, like product data, marketing data, and operations data, and similarly, there are several types of data analysts, like product analysts, marketing analysts, and operations analysts.
What differentiates these jobs from each other is the domain knowledge relevant to each category, but ultimately, they are synonymous with the term ‘data analyst’.
Data Analyst Interview Questions and Example Solutions
Let's go over each different type of interview question that gets asked in the data analyst interview.
SQL Interview Questions
SQL is a programming language used to access and manipulate databases. Think of databases as a collection of tables, with a table being a collection of rows (called "records") with the same columns (called "fields").
A query is a request for data from a database table or combination of tables. As a data analyst, you’ll be querying data on a frequent basis, either to retrieve information for other employees or to solve more complex problems.
Specifically data analysts are required to solve SQL interview questions related to analytics and reporting. To read more, check out our SQL interview guide below.
Let's look at three example SQL questions asked by companies looking for a data analyst:
'users' table +-----------------+----------+ | columns | type | +-----------------+----------+ | id | int | | name | varchar | | neighborhood_id | int | | created_at | datetime | +-----------------+----------+ 'neighborhoods' table +---------+---------+ | columns | type | +---------+---------+ | id | int | | name | varchar | | city_id | int | +---------+---------+
We're given two tables, a users table with demographic information and the neighborhood they live in and a neighborhoods table.
Write a query that returns all of the neighborhoods that have 0 users.
Here's a hint:
Our predicament is to find all the neighborhoods without users. In a sense we need all the neighborhoods that do not have a singular user living in them. This means we have to introduce a concept of existence of a column in one table, but not in the other.
Repeat Job Postings
'job_postings' table +-------------+----------+ | column | type | +-------------+----------+ | id | integer | | job_id | integer | | user_id | integer | | date_posted | datetime | +-------------+----------+
Given a table of job postings, write a query to breakdown the number of users that have posted their jobs once versus the number of users that have posted at least one job multiple times.
Here's a hint:
This question is kind of complicated so let's break it down into multiple steps. First let's visualize what the output would look like.
We want the value of two different metrics, the number of users that have posted their jobs once versus the number of users that have posted at least one job multiple times. What does that mean exactly?
Well if a user has 5 jobs but only posted them once, then they are part of the first statement. But if they have a 5 jobs and posted a total of 7 times, that means that they had to at least posted one job multiple times.
'users' table +------------+----------+ | columns | type | +------------+----------+ | id | int | | name | varchar | | created_at | datetime | +------------+----------+
Given a users table, write a query to get the cumulative number of new users added by day, with the total reset every month.
Here's a hint:
This question first seems like it could be solved by just running a COUNT(*) and grouping by date. Or maybe it's just a regular cumulative distribution function?
But we have to notice that we are actually grouping by a specific interval of month and date. And that when the next month comes around, we want to the reset the count of the number of users.
This question is a tough one. See if you can figure it out on our built-in SQL editor.
Python & Algorithms
Python is a versatile programming language that is popular among data analysts and data scientists. In Python, there are various libraries that one can use to perform data analyses like NumPy, Pandas, and Sci-kit Learn.
If you want to learn how to use Python and these libraries from scratch, there are several resources that you can leverage online, like DataCamp, Udemy, Coursera, and Udacity.
For the interview however, we would highly recommend solving problems on Interview Query that are specific to data analyst interviews.
For example, while data analysts aren't expected to know Python and R to the degree that software engineers are, they will still have to demonstrate proficiency in areas around utilizing Python and R for data analysis.
Here's an example Python interview question:
Write a function that can take a string and return a list of bigrams.
sentence = """ Have free hours and love children? Drive kids to school, soccer practice and other activities. """ output = [('have', 'free'), ('free', 'hours'), ('hours', 'and'), ('and', 'love'), ('love', 'children?'), ('children?', 'drive'), ('drive', 'kids'), ('kids', 'to'), ('to', 'school,'), ('school,', 'soccer'), ('soccer', 'practice'), ('practice', 'and'), ('and', 'other'), ('other', 'activities.')]
Here's a hint:
To separate the sentence into bigrams, the first thing we need to do is split the sentence into individual words.
Product Intuition and Business Case
Product intuition and business case questions come up often in data analyst interviews when the role shifts to product analytics. Many times data analyst will transition into a product analyst role where they're required to take on more scope around the product instead of analyzing data all day.
Here's an example product intuition interview question:
Let's say you work for a social media company that has just done a launch in a new city. Looking at weekly metrics, you see a slow decrease in the average number of comments per user from January to March in this city.
The company has been consistently growing new users in the city from January to March.
What are some reasons on why the average number of comments per user would be decreasing and what metrics would you look into?
Here's a hint:
Let's model out an example scenario to help us see the data.
Jan: 10000 users, 30000 comments, 3 comments/user
Feb: 20000 users, 50000 comments, 2.5 comments/user
Mar: 30000 users, 60000 comments, 2 comments/user
We're given information that total user count is increasing linearally which means that the decreasing comments/user is not an effect of a declining user base creating a loss of network effects on the platform.
What else can we hypothesize then?
Read more about the product specific interview here:
Statistics & A/B Testing
Statistics and A/B testing are common questions during the interview process, intended to test a data analyst's fundamentals around analyzing data.
For example, it's standard that each data analyst should know what a P-value is, what confidence intervals are, and how to read and analyze an A/B test when implemented by engineers or data scientists.
Here's an example statistics question:
2X - Y
Given uniform distributions X and Y and the mean 0 and standard deviation 1 for both, what’s the probability of 2X > Y?
Here's a hint:
Given that X and Y both have a mean of 0 and a standard deviation of 1, what does that indicate for the distributions of X and Y?
In case you don’t already know, Excel is a spreadsheet program that allows you to enter data into the rows and columns of a sheet. You should know how to use basic functions like SUMIF, IF statements, and COUNT to name a few. Additionally, you should know how to create graphs and pivot tables. In practice, because companies value collaboration, you may be required to use Google Sheets, which is essentially the same thing as Excel except shared more easily between different users.
How to Learn Excel
To be honest, Excel is something you can learn through experience. If you don’t have the opportunity to learn by doing, there are some amazing YouTube videos (that are also free) that you can use to learn Excel! See below.
Tableau is a data visualization tool that’s widely used across all industries. Typically, companies use Tableau to create dashboards, which are tools that display live and real-time metrics for businesses to look at. For example, a marketing team might create a dashboard that displays daily conversions, daily marketing spend, and daily website traffic.
How to Learn Tableau
- If you want to learn how to use Tableau, they have a series of training videos that you can go through on your own here. They also provide live training and eLearning experiences.
- Alternatively, I purchased a course years ago on Data Science Essentials, and in this course, it walks you through downloading and using Tableau. If this is something that you’d be interested in, you can check out the course here.
Other Data Analyst Interview Questions
● How will you compare the performance of two search engines?
● Describe any data cleansing techniques that you've applied.
● Name the best Google product that you've used. What would you change about it, and how would you improve it?
● Describe the different ways you measure customer check out rates.
● What is the use of regularization? Tell us the difference between L1 and L2 regularization.
● What is a random forest? What do you know about Naïve Bayes classifiers?
● What are the different ways of handling NULLs when querying a data set?
● How would you determine if a given coin is biased?
● What do ACF and PACF mean?
● What is the significance of dimensionality reduction?
● The number of likes and time spent on a platform increases, but the total number of users is decreasing. Tell us the root cause behind it.
● Tell us about the most challenging project you've been a part of and describe it from the beginning to the end.
Behavioral Interview Questions for Data Analysts
● Talk about a time where you had to make a decision in a lot of uncertainty.
● How do you set goals and achieve them? Give us an example.
● Describe a time when you solved a conflict at work.
● Give an example of a situation when you have shown effectiveness, empathy, humbleness, and adaptability.
● Give us an example of when you failed.
● Talk about an occasion when you used logic to solve a problem.
● What do you do if you disagree with your senior?
Data Analyst Take-Home Assignment
Many times in a data analyst interview, candidates will receive a take-home challenge or assignment to complete to advance onto the onsite round. This process is generally testing your overarching understanding of data analysis and communication in a written format.
Here's how the process usually goes.
Each analysis starts with a problem or a task. The level of difficulty of these tasks can differ greatly. An example of a simple task would be if you were asked to write a query to provide a statistic, like yesterday’s sales in dollars. An example of a more difficult task is when the answer isn’t clear and you’re asked to explore the data, e.g. if you were asked to figure out why last month’s sales performed much worse than other months.
Explore and Query
Once you receive a problem, you’ll usually write a query or a number of queries to explore and gather the information that you need to solve the problem. This means that you’ll probably need to know SQL or Python (or both) to gather the information you need.
Continuing with the previous example, if you were asked to figure out why last month’s sales performed much worse than other months, you might query the average customer review rating last month to see if there was a problem with the product, or you might query last month’s marketing spending compared to other months to see if there was a significant cut in marketing spending.
The next step is to gather your insights. Sometimes, gathering your insights means copying and pasting your insights into an Excel sheet. Other times, it means saving the queries that you used to find the information that you need for the next step.
Once you gather your insights, you may be required to visualize your findings. Sometimes, it’ll be as simple as making a bar graph in Excel. Other times, it means creating an extensive dashboard to be used by C-suite executives. The skills required in this step depends both on the company and the project. It includes but is not limited to: Powerpoint, Excel, Tableau, Matplotlib, etc.
Communicate Your Findings
Lastly, you’ll be required to communicate your results, whether it be through a slide deck with several static graphs or a dashboard with several KPI metrics. Similar to the STAR method for answering behavioral questions, you would walk through the problem, the task, the approach you took, and the end results(s).
Data Analyst Roles in Tech vs Non-Tech
You may find yourself with new sets of tasks when moving from a regular company to a FAANG company. But which company is the best for you? The answer lies in finding the major differences between these types of companies — including the required experience, the nature of work, and the technical skills — which add to a more detailed understanding of what the role entails.
FAANG is a blanket term that represents the largest and the most dominant companies in the information technology (IT) sector: Facebook, Amazon, Apple, Netflix, and Google. These tech giants differ from other companies in four significant aspects: efficiency, processes, responsibilities, and career trajectory.
Big Tech companies have thousands of employees, all of whom have their own set of unique tasks. Work output is measured accurately and employees are placed in a hierarchy. In FAANG companies, work-life is balanced — tasks are pre-defined, employee success is appreciated, and they report to one boss. Compared to a regular company's more fluid environment, this role is simpler to manage and understand.
In a FAANG company, a transition from a previous position to a data analyst's role will be seamless. Hackathons and bootcamps are some of the most common resources that outfit new employees with the skills required for their roles.
The responsibility of a data analyst will revolve around building dashboards and analytics. Whether it is analyzing a specific product's success or evaluating company performance, the data analyst job will be pretty straightforward.
As mentioned previously, it isn't easy to climb the career ladder at FAANG companies. Nevertheless, earning is much easier as an individual contractor (IC). The role generally involves optimizing and developing products. Obviously, the career trajectory is different than one at a regular company, and reaching the topmost position (e.g., Principal Data Analyst) can take years of commitment.
For example, a person at Amazon can transition from Business Analyst to Business Intelligence (BI) Engineer to Data Analyst to Research Scientist. The pay scale differs with each subsequent role. Each role has I, II, III, and IV levels, with levels III and IV denoting Senior and Principal roles. While a professional at a regular company may transition to senior roles in fewer stages, the same is not the case with FAANG companies.
Data Analyst Role at Startups
The data analyst role in regular companies differs in two aspects: work diversity and a low head count.
A data analyst role includes a little of everything. It requires the knowledge of analytics, data visualization, data engineering, machine learning, and work that may not be defined as 'data analytics.'
You might be expected to be a part of marketing meetings or collaborate with engineers to deploy business models and develop engineering pipelines. One major advantage of working at a regular company (e.g., a startup) is the knowledge and development of several skills, which is not always the case in big companies. As a data analyst, a professional may be expected to solve problems where they have to "figure it out," resulting in lots of self-learning, ownership, independence, and self-pacing.
Low Head Count
Since the number of employees is lesser compared to FAANG companies, it is much easier to have a transition to the next position as the company grows. Nevertheless, if the company is small with fewer employees, it requires less funding, and the average salaries are lower than larger companies. Hence, it is better to start at a FAANG company, gain experience and higher compensation, then transition to a mid-level company or a startup for career advancement.
The Bottom Line
Today's organizations are increasingly dealing with data, and they are in desperate need of data analysts and scientists capable of deriving value from data. As the Internet of Things (IoT) comes into its own, those requirements will only increase. A study by the World Economic Forum suggested that 85% of companies will have adopted big data and analytics technologies by the end of 2022, with 96% planning to recruit more professionals with data analytics skills.
In case you're confused about which career to pursue in tech, it's a good idea to learn big data and analytics technologies as a starting point. Big tech companies are already hiring professionals as Chief Data Officers (CDO), implying that they're taking data management seriously. If you're someone who wants to pursue a career in data analytics today, you could be in a very lucrative position in very little time.