Top 20 Data Science Take-home Challenges

Overview

Data science take-home challenges are a common type of assessment used in data science interviews, and they sometimes take the place of the technical screen.

Take-home challenges are essentially mini data science projects. You have a business case study problem and a dataset, and then you must perform analysis, build a model, write working code, and make product recommendations.

These challenges typically take 3 to 8 hours to complete, and they’re used to screen candidates for open data science and machine learning roles. One of the biggest mistakes that candidates make is failing to understand expectations and spending too much time on a solution that doesn’t meet the interviewer’s criteria.

To help you prepare, we’ve compiled the top data science take-home challenges in four categories: data analytics, machine learning, SQL, and product case studies.

Tips for Passing Data Science Take-Homes

A data science take-home isn’t just an opportunity to display strong technical skills. These challenges provide a chance to show how well you gather input, communicate your findings, and approach solving technical challenges on the job.

If you’re looking for help, you can follow these tips for passing data science take-homes:

Understand expectations

It’s tempting to jump into a take-home challenge without correctly understanding expectations. However, a short email to the recruiter can provide direction and ensure you head down the right track. In your email, include the following:

  • Your timeline for completing the assignment
  • A request for any general guidelines on grading
  • A request for feedback after you submit your assignment

State your assumptions and limitations

What if you only use a naive imputation model to fill in missing values instead of an advanced technique? State it. Write it in a comment. Make sure the reviewer understands that the limitation comes from the amount of time you could spend on the assignment.

Write up everything you think your grader needs to know. Hiring managers often forget how long it takes to write code and build models. They’re managers, and typically they don’t write code.

Do the modeling basics

Here’s a general checklist that will probably take you at least three hours:

  • Data cleaning
  • Minimal feature selection
  • Impute missing values
  • Create a classification pipeline
  • Try training with a couple of scikit-learn classifiers
  • Tune hyperparameters with grid search

This implementation will reach the minimum baseline of what they’re expecting. Depending on how long you work on feature selection, the total could be two to three hours more or less.
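The checklist above can be sketched with scikit-learn roughly as follows. This is a minimal sketch under stated assumptions, not a reference solution: the data is synthetic (generated with make_classification), the two classifiers and their small grids are arbitrary picks, and feature selection is left out for brevity.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the take-home dataset (assumption: numeric features only).
X, y = make_classification(n_samples=300, n_features=8, random_state=0)

# Blank out roughly 5% of the values so the imputation step has work to do.
rng = np.random.default_rng(0)
X[rng.random(X.shape) < 0.05] = np.nan

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Naive median imputation: a stated limitation, chosen to save time.
pipeline = Pipeline(
    steps=[
        ("impute", SimpleImputer(strategy="median")),
        ("scale", StandardScaler()),
        ("clf", LogisticRegression(max_iter=1000)),
    ]
)

# Each dict swaps in one classifier together with its own small grid.
param_grid = [
    {"clf": [LogisticRegression(max_iter=1000)], "clf__C": [0.1, 1.0, 10.0]},
    {
        "clf": [RandomForestClassifier(random_state=0)],
        "clf__n_estimators": [50, 100],
        "clf__max_depth": [3, None],
    },
]

search = GridSearchCV(pipeline, param_grid, cv=3, scoring="accuracy")
search.fit(X_train, y_train)

# Score once on the held-out split, after all tuning is finished.
test_accuracy = search.score(X_test, y_test)
print("Best parameters:", search.best_params_)
print(f"Held-out accuracy: {test_accuracy:.3f}")
```

Keeping imputation and scaling inside the pipeline means they are refit on each cross-validation fold, which avoids leaking information from the validation data into the preprocessing.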

Make your take-home challenge readable

Take-homes are an opportunity to show how you communicate technical concepts. Use a framework for organizing your ideas, like the Cookiecutter Data Science Framework. This framework will make your analysis easier to follow, let interviewers learn about your process and domain knowledge, and help them feel more confident in your conclusions.

Although formatting your work will take more time, it’ll show that you can communicate data analysis succinctly and make your work accessible.

Write tests and comments

Readability is as important as the efficiency of your code. Clear comment blocks on each function help communicate how your code should work and why you refactored it the way you did. Follow general Python conventions, such as PEP 8, to keep your code consistent.
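As an illustration, a documented function with a small test might look like the sketch below. The function name and the median-fill behavior are hypothetical choices made for this example. The pattern to notice is the docstring that states the assumption, plus a test that pins down the expected behavior.

```python
import statistics


def fill_missing_with_median(values):
    """Replace None entries with the median of the observed values.

    Assumption: a naive median fill is acceptable for this take-home;
    a model-based imputer would be the next step with more time.

    Args:
        values: list of numbers, where None marks a missing value.

    Returns:
        A new list with each None replaced by the median.

    Raises:
        ValueError: if every entry is missing.
    """
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute: no observed values")
    median = statistics.median(observed)
    return [median if v is None else v for v in values]


def test_fill_missing_with_median():
    # The median of [1, 3, 5] is 3, so the gap becomes 3.
    assert fill_missing_with_median([1, None, 3, 5]) == [1, 3, 3, 5]
    # Input without gaps comes back unchanged.
    assert fill_missing_with_median([2, 4]) == [2, 4]


test_fill_missing_with_median()
```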

Summarize your process in 500 words or less

The reviewer will only spend 10 minutes or less reviewing your challenge, so give them a reason to dig deeper. If you can succinctly describe your work and make it easy to understand, you’ll be much more likely to pass and move on.

Data Analytics Take-Home Challenges

Analytics take-home challenges are most common in data analyst roles. These challenges commonly provide a dataset and require you to perform exploratory data analysis. In some cases, guiding questions will be asked to help direct your analysis, and often, you’ll be required to make product or business recommendations based on your research.

1. Stripe Analytics Take-Home

  • Overview: Perform EDA on a product data set from Stripe and make product recommendations.
  • Time Required: 6 hours
  • Skills Tested: EDA, analytics, product sense
  • Deliverable: Short presentation

The Stripe take-home tests product and data sense, as well as your understanding of growth marketing. For the assignment, you’re provided data on core Stripe products and user segment data and are required to create a presentation on your findings.

Some guiding questions ask about product/user segment performance, making general forecasts for performance, and any potential product issues.

2. Instacart Data Analyst Challenge

  • Overview: Perform exploratory data analysis of Instacart orders.
  • Time Required: 3 hours
  • Skills Tested: Analytics, product sense, data storytelling.
  • Deliverable: Create a deck or document that conveys your analyses about Instacart products.

In the past, Instacart provided this take-home test to data analysts. This assignment is focused on exploratory data analysis and includes a dataset relating to Instacart orders, order location, customer ratings, and any issues reported for a set of demands.

This assignment is brief, requiring about 3 hours to complete, and the deliverable is a deck, slides, or a document that conveys your analyses of the business.

3. Masterclass Analytics Assignment

  • Overview: Perform analysis of traffic to the Gordon Ramsay Masterclass homepage.
  • Time Required: 3.5 hours
  • Skills Tested: Analytics, growth marketing, user behavior analysis.
  • Deliverable: Compile your analyses and code in a Jupyter Notebook.

Masterclass’s analytics take-home analyzes traffic to its Gordon Ramsay course marketing page. You’re provided with 30 days of user activity data, which includes relevant information like location, event, marketing channel, and traffic source.

Masterclass is looking for data analysts who are both “reactive and proactive” and who can pull insights about user behavior from the data. For example, you could investigate behavior by channel, comparing paid traffic to organic social traffic. Or you could determine the effectiveness of remarketing efforts.

4. Twitter Data Analytics Assignment

  • Overview: Evaluate an A/B test and complete a dice probability case study.
  • Time Required: 6 hours
  • Skills Tested: A/B testing, probability and statistics, EDA
  • Deliverable: A notebook describing your methodology and code.

Twitter’s data analyst take-home assignment comes in two parts and focuses primarily on statistics and A/B testing. The first question is a probability question, asking you to calculate the probability for different scenarios in the game of craps.

However, the question is a bit more challenging because, in the scenario, one of the dice is “unfair.” The second question focuses on A/B testing and tests your ability to pull analytics metrics. Specifically, you’re tasked to measure the success of a product A/B test using the data provided.
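For the probability question, one way to handle an unfair die is to build the distribution of the two-dice total by enumeration and then apply the standard pass-line rules of craps. The sketch below assumes a hypothetical loaded die on which a six is twice as likely as any other face; the actual bias in the assignment is not stated here.

```python
from fractions import Fraction
from itertools import product

FAIR = {face: Fraction(1, 6) for face in range(1, 7)}
# Hypothetical loaded die: a six is twice as likely as any other face.
LOADED = {face: Fraction(2 if face == 6 else 1, 7) for face in range(1, 7)}


def sum_distribution(die_a, die_b):
    """Return P(total) for the sum of two independent dice."""
    dist = {}
    for (a, pa), (b, pb) in product(die_a.items(), die_b.items()):
        dist[a + b] = dist.get(a + b, Fraction(0)) + pa * pb
    return dist


def craps_win_probability(die_a, die_b):
    """Probability that the shooter wins a pass-line game of craps."""
    p = sum_distribution(die_a, die_b)
    # Come-out roll: 7 or 11 wins outright (2, 3, and 12 lose outright).
    win = p[7] + p[11]
    for point in (4, 5, 6, 8, 9, 10):
        # Once a point is set, only the point (win) or a 7 (loss) ends the game.
        win += p[point] * p[point] / (p[point] + p[7])
    return win


fair_win = craps_win_probability(FAIR, FAIR)  # 244/495 for two fair dice
loaded_win = craps_win_probability(FAIR, LOADED)
print(f"Fair dice: {float(fair_win):.4f}")
print(f"One loaded die: {float(loaded_win):.4f}")
```

Using exact fractions instead of floats makes it easy to check the fair-dice case against the known answer before trusting the loaded-die result.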

5. Amazon Take-Home Assignment

  • Overview: Complete a probability and data analytics case study regarding inventory.
  • Time Required: 6 hours
  • Skills Tested: Statistics, Python, time-series analysis
  • Deliverable: Create a brief describing your methodology, Python code, and responses to provided questions.

This Amazon data science take-home focuses on probability and time-series analysis and tests your Python coding ability. This assignment is a case study question that provides time-series data about inventory shortages. Your goal with the work is to determine the volume of lost sales due to inventory shortages.
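A simple first pass at the lost-sales estimate is to treat average sales on fully stocked days as baseline demand and count the shortfall on shortage days. The sketch below uses hypothetical records and field layout, and it ignores trend and seasonality, which a fuller time-series approach would model.

```python
from statistics import mean

# Hypothetical daily records: (day, units_sold, in_stock_all_day)
daily = [
    ("2024-01-01", 100, True),
    ("2024-01-02", 110, True),
    ("2024-01-03", 40, False),
    ("2024-01-04", 90, True),
    ("2024-01-05", 0, False),
]


def estimate_lost_units(records):
    """Estimate units lost to shortages against a flat demand baseline.

    Assumption: demand on shortage days equals the mean of in-stock days.
    Raises statistics.StatisticsError if there are no in-stock days.
    """
    baseline = mean(units for _, units, in_stock in records if in_stock)
    return sum(
        max(baseline - units, 0)
        for _, units, in_stock in records
        if not in_stock
    )


lost_units = estimate_lost_units(daily)
print(f"Estimated lost units: {lost_units}")
```

With this sample data the baseline is 100 units per day, so the two shortage days contribute shortfalls of 60 and 100 units.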

Machine Learning Take-Home Challenges

Machine learning take-home assignments generally fall into two categories: 1) build a model based on provided data or 2) evaluate or improve a model. These take-home tasks typically ask you to provide a Jupyter Notebook with working code; however, you’ll also need to synthesize your methodology.

Some top machine learning take-home tasks include:

6. Opendoor Machine Learning Take-Home

  • Overview: Build a simple prediction model for housing prices and complete two Python coding problems.
  • Time Required: 3 hours
  • Skills Tested: Modeling, machine learning, Python coding
  • Deliverable: Submit a Jupyter Notebook with working code.

This assignment is a three-part take-home; however, it’s recommended that you spend only about 3 hours on it.

The first part asks you to take a small real estate transaction dataset and build a simple model to predict housing prices. The key here is explaining your choices. Describe the methodology you use, model performance, and next steps.

In Parts 2 and 3, two shorter problems test applied Python programming skills.

7. Capgemini Machine Learning Task

  • Overview: Build and evaluate models to predict national retail store sales for each store and department.
  • Time Required: 6 hours
  • Skills Tested: Modeling, statistics, Python
  • Deliverable: Prepare a 20- to 30-minute presentation for a general technical audience.

Capgemini’s machine learning challenge presents you with a dataset of retail sales that is highly seasonal and holiday-driven. Like many machine learning challenges, the presentation is more important than the model, and it should include the following:

  • Presentation of insights/conclusions
  • Relevant descriptive statistics and visualizations
  • The mathematical principles behind your model
  • Model diagnostics and interpretation

8. Airbnb Algorithms Take-Home

  • Overview: Train a recommender model that can predict which listings a specific user is likely to book.
  • Time Required: 72 hours
  • Skills Tested: Machine learning, recommendation engines, algorithms
  • Deliverable: Please submit one document and provide code and a write-up.

This assignment is an in-depth, three-day model-building take-home with minimal direction. For this recommendation engine problem, Airbnb suggests formulating it as a ranking problem or a top-K recommendation problem.

The key to this challenge is your model-building process. Where do you start (e.g., a baseline model)? And what are the steps you use to tune the model?
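One reasonable starting point is a popularity baseline scored with a top-K metric, so that every later model has a number to beat. The sketch below is a stand-in under stated assumptions: the booking tuples are made up, and hit rate at K is only one of several ranking metrics you could choose.

```python
from collections import Counter


def popularity_top_k(train_bookings, k):
    """Return the k most-booked listing IDs from (user_id, listing_id) pairs."""
    counts = Counter(listing for _, listing in train_bookings)
    return [listing for listing, _ in counts.most_common(k)]


def hit_rate_at_k(recommended, held_out):
    """Fraction of held-out bookings whose listing appears in the top-K list."""
    if not held_out:
        return 0.0
    hits = sum(1 for _, listing in held_out if listing in recommended)
    return hits / len(held_out)


# Hypothetical bookings: (user_id, listing_id)
train = [
    ("u1", "A"), ("u2", "A"), ("u3", "B"),
    ("u1", "C"), ("u4", "B"), ("u5", "A"),
]
held_out = [("u2", "B"), ("u3", "C"), ("u4", "A"), ("u5", "D")]

top_2 = popularity_top_k(train, k=2)
baseline_hit_rate = hit_rate_at_k(top_2, held_out)
print(top_2, baseline_hit_rate)
```

This baseline recommends the same listings to everyone, so any personalized model should be judged by how much it improves on this score.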

9. DoorDash Machine Learning Coding Challenge

  • Overview: Build a machine learning model for a prediction task and write an application to make predictions using that model.
  • Time Required: 5.5 hours
  • Skills Tested: Machine learning, Python, regression analysis
  • Deliverable: A short write-up explaining your model, code for the model, and code that outputs a .tsv file for the application.

This assessment is a two-part machine learning challenge. The first is a classic modeling case study where you build a model to predict total delivery duration in seconds.

DoorDash’s take-home is meant to test your model tuning and evaluation skills. In your write-up, explain why you chose the model, how you evaluated performance, and anything else of note about your approach.

It also helps to make recommendations, based on your model, for reducing delivery time. In the second part, you must create an app that uses the model to predict each delivery in the JSON file and writes the predictions to a new tab-separated file.
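The prediction app can be quite small. The sketch below assumes the input holds one JSON object per line and uses hypothetical field names (delivery_id, total_items); predict_duration_seconds is a placeholder rule standing in for the trained model, not a real one.

```python
import csv
import json


def predict_duration_seconds(delivery):
    """Placeholder for the trained model (hypothetical rule for illustration)."""
    return 1800 + 60 * delivery.get("total_items", 0)


def write_predictions(json_lines_path, tsv_path):
    """Read one delivery per line and write predictions as a TSV file.

    Returns the number of deliveries scored.
    """
    scored = 0
    with open(json_lines_path) as src, open(tsv_path, "w", newline="") as dst:
        writer = csv.writer(dst, delimiter="\t")
        writer.writerow(["delivery_id", "predicted_duration_seconds"])
        for line in src:
            if not line.strip():
                continue  # tolerate blank lines in the input
            delivery = json.loads(line)
            writer.writerow(
                [delivery["delivery_id"], predict_duration_seconds(delivery)]
            )
            scored += 1
    return scored
```

A call such as write_predictions("deliveries.json", "predictions.tsv") would then produce the deliverable; both file names are examples only.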

To learn how to solve a DoorDash Analytics Case Study, see our step-by-step guide.

10. NielsenIQ Machine Learning Take-Home

  • Overview: Build a classification model for text data.
  • Time Required: 6 hours
  • Skills Tested: Machine learning, Python, classification, NLP
  • Deliverable: Create a presentation as well as a Jupyter Notebook with code.

This take-home challenge tests your Natural Language Processing and classification skills. You have two types of text strings split into two files. Using this data, you’ll create a classification model that accurately labels the data. You’re free to use any machine learning techniques or metrics that you would like.

SQL Take-Home Challenges

SQL coding challenges typically include a set of SQL problems, and you must write queries for a given dataset. These challenges are common for data science and analytics roles and may also assess your product sense and analytics domain knowledge.

Here are some top data science SQL take-home tasks:

11. Uber SQL Take-Home Assignment

  • Overview: Given the below subset of Uber’s schema, write executable SQL queries to answer the questions below.
  • Time Required: 6 hours
  • Skills Tested: Analytics, SQL, Python, A/B testing
  • Deliverable: Write a single query for each question.

This assessment is a three-part SQL challenge that tests your applied SQL skills and ability to draw insights from data and evaluate A/B tests. The three-part challenge includes:

  • Part 1: Write SQL queries to answer sample questions like calculating the difference between actual and predicted ETA.
  • Part 2: A new driver app has been developed at Uber. You must define the primary metrics for the app, as well as design a test to evaluate the redesigned app’s performance.
  • Part 3: The last step asks you to determine which factors are best at predicting whether a newly signed-up driver will start to drive and offer suggestions to operationalize those insights to help Uber.
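A query of the kind Part 1 describes can be prototyped against an in-memory SQLite database before you submit it. The table and column names below are hypothetical, since the real schema comes with the assignment.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema and rows, for illustration only.
conn.executescript(
    """
    CREATE TABLE trips (
        trip_id INTEGER PRIMARY KEY,
        city TEXT,
        predicted_eta_min REAL,
        actual_eta_min REAL
    );
    INSERT INTO trips VALUES
        (1, 'SF', 5.0, 7.0),
        (2, 'SF', 6.0, 5.0),
        (3, 'NYC', 4.0, 8.0);
    """
)

# Average signed error per city: positive means pickups ran later than predicted.
query = """
    SELECT city,
           AVG(actual_eta_min - predicted_eta_min) AS avg_eta_error_min
    FROM trips
    GROUP BY city
    ORDER BY city;
"""
rows = conn.execute(query).fetchall()
print(rows)  # [('NYC', 4.0), ('SF', 0.5)]
```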

12. NextDoor SQL Coding Take-Home

  • Overview: Complete simple SQL queries of a given dataset, design tables for a KPI dashboard, and write queries for the dashboard.
  • Time Required: 6 hours
  • Skills Tested: SQL, database design, data engineering
  • Deliverable: Sample SQL queries and table designs.

This assessment is a classic SQL take-home in that you must develop whiteboard queries based on a provided table schema.

However, there’s an additional step, which includes database design and data engineering skills. The data engineering section asks you to design tables for a KPI dashboard and, ultimately, to write queries to populate those tables.

13. McKinsey SQL & EDA Take-Home

  • Overview: Write SQL queries for a bike-sharing program dataset and answer specific questions about the data.
  • Time Required: 6 hours
  • Skills Tested: Analytics and SQL
  • Deliverable: Create a brief presentation, including visualizations, for your analysis.

This challenge combines your SQL skills and exploratory data analysis. You’re provided a dataset of a bike-sharing program in Washington, DC.

The first part asks you questions that would require intermediate to advanced SQL queries to analyze popular routes. The second part asks you to identify imbalances in where bikes are picked up or dropped off.
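Both parts can be explored with plain SQL. The sketch below runs against an in-memory SQLite table with a hypothetical rides schema: the first query ranks routes by ride count, and the second computes net bikes per station (drop-offs minus pickups) as one possible imbalance measure.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema and rows, for illustration only.
conn.executescript(
    """
    CREATE TABLE rides (
        ride_id INTEGER PRIMARY KEY,
        start_station TEXT,
        end_station TEXT
    );
    INSERT INTO rides (start_station, end_station) VALUES
        ('A', 'B'), ('A', 'B'), ('A', 'C'), ('B', 'A'), ('C', 'B');
    """
)

popular_routes = conn.execute(
    """
    SELECT start_station, end_station, COUNT(*) AS ride_count
    FROM rides
    GROUP BY start_station, end_station
    ORDER BY ride_count DESC, start_station, end_station
    LIMIT 3;
    """
).fetchall()

# Negative net_bikes means the station loses bikes over time.
imbalance = conn.execute(
    """
    WITH pickups AS (
        SELECT start_station AS station, COUNT(*) AS n
        FROM rides GROUP BY start_station
    ),
    dropoffs AS (
        SELECT end_station AS station, COUNT(*) AS n
        FROM rides GROUP BY end_station
    ),
    stations AS (
        SELECT station FROM pickups
        UNION
        SELECT station FROM dropoffs
    )
    SELECT s.station,
           COALESCE(d.n, 0) - COALESCE(p.n, 0) AS net_bikes
    FROM stations AS s
    LEFT JOIN pickups AS p ON p.station = s.station
    LEFT JOIN dropoffs AS d ON d.station = s.station
    ORDER BY s.station;
    """
).fetchall()

print(popular_routes)
print(imbalance)
```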

In addition, a product metrics question requires you to propose top metrics to monitor the program’s health.

14. Qventus Whiteboard SQL and Storytelling Take-Home

  • Overview: Query a dataset and answer questions. Evaluate if a machine learning model has a real-world business impact.
  • Time Required: 3 hours
  • Skills Tested: SQL, analytics, business case
  • Deliverable: Working SQL code, a discussion of limitations, and an outline for an analytics story about the ML model.

This take-home from the AI-driven healthcare company Qventus has been given to data analysts and focuses on practical SQL coding skills and data visualization ability.

The first problem is a classic SQL case study; you’re provided with a dataset and required to answer questions like “What percentage of patient visits are still admitted or not discharged yet from the hospital?”
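A percentage question like the one quoted above usually comes down to conditional aggregation. The sketch below assumes a hypothetical visits table in which a NULL discharge timestamp means the patient is still admitted; the real schema and its conventions would come from the assignment.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema and rows, for illustration only.
conn.executescript(
    """
    CREATE TABLE visits (
        visit_id INTEGER PRIMARY KEY,
        admitted_at TEXT,
        discharged_at TEXT
    );
    INSERT INTO visits VALUES
        (1, '2024-01-01', '2024-01-03'),
        (2, '2024-01-02', NULL),
        (3, '2024-01-02', '2024-01-04'),
        (4, '2024-01-05', NULL);
    """
)

# Multiplying by 100.0 forces floating-point division in SQLite.
pct_still_admitted = conn.execute(
    """
    SELECT 100.0 * SUM(CASE WHEN discharged_at IS NULL THEN 1 ELSE 0 END)
           / COUNT(*)
    FROM visits
    WHERE admitted_at IS NOT NULL;
    """
).fetchone()[0]
print(pct_still_admitted)  # 50.0
```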

The second part asks you to evaluate a model developed to predict patient surges at hospitals and describe through a data story and visualizations if the model has the intended impact.

15. DraftKings Data Analyst Challenge

  • Overview: This three-part challenge covers data sense, applied SQL knowledge, and programming knowledge.
  • Time Required: 3 hours
  • Skills Tested: SQL, Python, analytics
  • Deliverable: A short document covering a simple data analysis case study, raw SQL queries, and Python code for automation tasks.

This assessment is a direct data analyst SQL challenge. The first part asks you to analyze a data visualization and describe what you see. This question is open to interpretation.

The following steps require you to query a sports analytics database to pull metrics about soccer athletes. Finally, the last problem is a scripting challenge, and you’re required to write Python code to automate sample data analytics tasks.

Product Case Study Take-Home Challenges

Product case study take-homes typically incorporate a few skills, including data analytics, product analysis, and SQL. Most of these challenges provide a business or product case and then ask you to make recommendations about the product to improve performance, reduce costs, or increase market share.

The most popular product case study take-homes for data scientists include:

16. Airbnb Growth Take-Home

  • Overview: Analyze the provided data and make product recommendations to help increase bookings in Rio de Janeiro.
  • Time Required: 6 hours
  • Skills Tested: Analytics, EDA, growth marketing, data visualization
  • Deliverable: Summarize your recommendations in response to the questions above in a Jupyter Notebook intended for the Head of Product and VP of Operations (who is not technical).

This take-home is a classic product case study. You have booking data for Rio de Janeiro, and you must define metrics for analyzing matching performance and make recommendations to help increase the number of bookings.

This take-home includes grading criteria, which can help direct your work. Assignments are judged on the following:

  • Analytical approach and clarity of visualizations
  • Your data sense and decision-making, as well as the reproducibility of the analysis
  • Strength of your recommendations
  • Your ability to communicate insights in your presentation
  • Your ability to follow directions

17. Affirm Merchant Analysis Take-Home

  • Overview: Analyze a provided dataset and recommend improvements to the product or business.
  • Time Required: 6 hours
  • Skills Tested: Product sense, EDA, analytics, SQL
  • Deliverable: Submit SQL code to answer questions and a document discussing recommendations.

Affirm’s product take-home assignment includes a dataset related to the company’s checkout process. You perform EDA on that dataset and answer specific queries like “Calculate conversion through the funnel by day.”
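To make the funnel question concrete, here is a minimal sketch of conversion by day, written in Python for illustration even though the assignment itself asks for SQL. The step names and events are hypothetical, and conversion is measured as unique users at each step divided by unique users at the first step on the same day.

```python
from collections import defaultdict

# Hypothetical funnel steps, in order.
FUNNEL_STEPS = ["checkout_loaded", "terms_viewed", "checkout_completed"]

# Hypothetical events: (day, user_id, step)
events = [
    ("2024-01-01", "u1", "checkout_loaded"),
    ("2024-01-01", "u1", "terms_viewed"),
    ("2024-01-01", "u1", "checkout_completed"),
    ("2024-01-01", "u2", "checkout_loaded"),
    ("2024-01-01", "u2", "terms_viewed"),
    ("2024-01-02", "u3", "checkout_loaded"),
    ("2024-01-02", "u4", "checkout_loaded"),
]


def funnel_conversion_by_day(event_rows):
    """Return {day: [share of first-step users reaching each step]}."""
    users = defaultdict(lambda: defaultdict(set))
    for day, user, step in event_rows:
        users[day][step].add(user)

    result = {}
    for day in sorted(users):
        counts = [len(users[day][step]) for step in FUNNEL_STEPS]
        top = counts[0]
        result[day] = [count / top if top else 0.0 for count in counts]
    return result


conversion = funnel_conversion_by_day(events)
print(conversion)
```

Counting unique users instead of raw events keeps page reloads from inflating a step.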

However, the real challenge comes in step two, where you must make product recommendations to improve performance and then choose one of them for experimentation. You then give specifics about how you would test it.
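When describing how you would test a recommendation, it helps to name the analysis up front. One common option for a conversion metric is a two-proportion z-test, sketched below with the standard library only. This test is an assumption for illustration, not something the assignment prescribes, and the counts are made up.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference in conversion rates.

    Args:
        conv_a, n_a: conversions and users in the control group.
        conv_b, n_b: conversions and users in the treatment group.

    Returns:
        (z, p_value), where positive z means treatment converts better.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    standard_error = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Made-up counts: 10% control conversion vs. 15% treatment conversion.
z_score, p_value = two_proportion_z_test(100, 1000, 150, 1000)
print(f"z = {z_score:.2f}, p = {p_value:.4f}")
```

In a write-up, you would also state the significance level and the sample size chosen before the experiment starts, so the result is not read after the fact.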

18. Lyft Driver Churn Case Study

  • Overview: Analyze a Lyft dataset for driver churn and make recommendations to reduce churn.
  • Time Required: 1 hour
  • Skills Tested: Business case, pandas, R
  • Deliverable: Submit answers to provided questions, along with the applicable code.

Lyft’s take-home is short and used to thin the candidate pool in place of a technical screen.

You write responses to questions like “How would you define driver churn?” and “How would you calculate churn based on your answer?” These questions are high-level and ask you to propose technical solutions. The key here is communicating your responses concisely and clearly.
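One way to make a churn definition concrete is to express it as code. The sketch below assumes a hypothetical definition, where a driver has churned if their last ride was more than 28 days before the reference date; both the window and the data are illustrative, and defending whichever window you choose is part of the answer.

```python
from datetime import date, timedelta


def churned_drivers(last_ride_by_driver, as_of, inactive_days=28):
    """Return IDs of drivers whose last ride is older than the inactivity window."""
    cutoff = as_of - timedelta(days=inactive_days)
    return {
        driver
        for driver, last_ride in last_ride_by_driver.items()
        if last_ride < cutoff
    }


def churn_rate(last_ride_by_driver, as_of, inactive_days=28):
    """Share of all known drivers that count as churned on the reference date."""
    if not last_ride_by_driver:
        return 0.0
    churned = churned_drivers(last_ride_by_driver, as_of, inactive_days)
    return len(churned) / len(last_ride_by_driver)


# Hypothetical last-ride dates per driver.
last_rides = {
    "d1": date(2024, 2, 25),
    "d2": date(2024, 1, 10),
    "d3": date(2024, 2, 2),  # exactly on the cutoff, so still active
    "d4": date(2023, 12, 1),
}
reference_date = date(2024, 3, 1)

print(sorted(churned_drivers(last_rides, reference_date)))
print(churn_rate(last_rides, reference_date))
```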

19. Grubhub Growth Marketing Take-Home

  • Overview: Analyze the data and recommend which states Grubhub should expand into.
  • Time Required: 3 hours
  • Skills Tested: Marketing analytics, business case, growth marketing
  • Deliverable: Present your recommendations and discuss any limitations or assumptions you made

This take-home challenge provides you with a bare-bones dataset, including orders, visits to Grubhub’s site, and revenue.

Because the dataset is so limited, you’ll be required to “make assumptions and list them in your response.” Ultimately, you’ll recommend which states to target for expansion.

20. City Year Client Case Study

  • Overview: Use data from the presented case to make recommendations on fundraising strategy for City Year.
  • Time Required: 6 hours
  • Skills Tested: Business case, probability
  • Deliverable: Present your recommendation in a format appropriate for a non-technical executive.

This business case take-home is a probability case study. It will require you to take a simulated dataset of response rates and average donations and, using that data, determine how City Year should prioritize its fundraising strategy, e.g., corporate donors vs. individuals.

The question rests on probability: you’ll calculate which approach is expected to generate the most fundraising impact for the organization.
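The core calculation is an expected value: contacts reached times response rate times average donation. The sketch below uses made-up numbers for two donor segments and ignores outreach costs and uncertainty in the rates, both of which a full answer would discuss.

```python
# Made-up inputs for illustration; the assignment supplies the real figures.
segments = {
    "corporate": {"contacts": 200, "response_rate": 0.10, "avg_donation": 5000.0},
    "individual": {"contacts": 10000, "response_rate": 0.02, "avg_donation": 100.0},
}


def expected_donations(segment):
    """Expected total raised = contacts * response rate * average donation."""
    return (
        segment["contacts"]
        * segment["response_rate"]
        * segment["avg_donation"]
    )


expected_by_segment = {
    name: expected_donations(segment) for name, segment in segments.items()
}
best_segment = max(expected_by_segment, key=expected_by_segment.get)

for name, value in expected_by_segment.items():
    print(f"{name}: expected ${value:,.0f}")
print("Prioritize:", best_segment)
```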

Video: Data Science Take-Home Tips

In this Interview Query video, Jay provides an overview of how to pass data science take-home challenges. Specifically, the video offers tips for approaching a take-home, what you should include in your submission, and questions you should ask before you get started. See his data science take-home advice here:

Data Science Take-Home Challenge