Twitter is known for news and debate and is often called the "SMS of the Internet." Created in 2006 by Jack Dorsey, Noah Glass, Evan Williams, and Biz Stone, it has grown to more than 321 million monthly active users and 1.6 billion search queries each day.
Aside from being one of the biggest tech companies, Twitter also has one of the world’s largest real-time datasets. To manage such large amounts of data, Twitter has dedicated data science and analytics teams that use advanced analytics and machine learning tools to improve its products and features and deliver more relevant content in users’ feeds.
Interested in data science at another social media company with large datasets? We recommend reading "The Pinterest Data Scientist Interview"!
What is the data science role?
The data scientist position at Twitter is split into data scientist and research scientist roles. Each role is tailored to the team it belongs to and to the features or services that team owns, so the work may range from analytics-focused roles to designing models and building heavy machine learning systems.
Like most big tech companies, Twitter prefers to hire candidates with at least 2 years of experience (5+ years for senior data scientists), including some experience with data infrastructure or backend systems. An engineering background or an understanding of data systems is therefore helpful unless the position is analytics-specific.
Other basic qualifications include:
- Bachelor’s, Master’s or PhD degree in Computer Science, Statistics, Math, Engineering, or other quantitative disciplines.
- Experience working with and analyzing large data sets and MapReduce architectures like Hadoop, as well as other open-source data mining and machine learning projects.
- Extensive experience using languages and frameworks like Python, SQL, R, Spark, or Scalding to write complex data flows.
- Strong understanding of one or more of these object-oriented programming languages: Scala, C++, or Java.
- Proficiency in the use of Tableau or Zeppelin for analysis, modelling, and data visualization.
- Experience applying advanced statistical techniques to model user behavior, identify causal impact and attribution, and build and benchmark metrics.
What are the types of data scientists at Twitter?
Twitter has a data science and analytics department with research scientists and data scientists working across a wide range of teams. Whether on the scaled enforcement heuristics team, the consumer product team, or the home and explore team, data scientists use advanced analytics tools and machine learning models to make business impact recommendations and improve products. Depending on the team, the role may include the following:
- Create sophisticated statistical models that learn and scale to streaming data.
- Create and interpret sophisticated SQL queries for standard and ad hoc data mining purposes.
- Interpret and influence crowdsourcing and human computation procedures for data labelling.
- Partner closely with product and engineering teams to create and assess data-driven product roadmaps.
The Interview Process
The Twitter data science interview is highly standardized. Generally, the process starts with a resume-based phone screen with a recruiter. This is followed by a short technical interview with a hiring manager and then a technical screen with a Twitter data scientist. The final stage is an on-site interview with 5 to 6 interviewers.
The initial phone screen should last anywhere from 10 to 30 minutes. You’ll be asked questions covering your technical skills, your past experience, and your knowledge of Twitter. The recruiter will also answer your questions and explain how the data science teams function at Twitter while assessing whether your current experience is a good fit for Twitter’s team.
After the initial phone interview, the next round is a technical screen with a data scientist. Questions in this interview can involve machine learning theory, product intuition with a focus on experimentation, and SQL or Python-based coding. Make sure to study how the Twitter product works and think about how to draw conclusions from experiment-based testing.
Examples of tech-screen questions:
- What features would you use to build a recommendation algorithm for Twitter users?
- Let’s say we want to roll out a new push notification system to see if we can retain more users. How would we go about doing this?
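For the push notification question, one common way to frame the answer is as an A/B test that compares retention between a control group and a group that receives the new notifications. The sketch below is only an illustration of that idea, not Twitter's actual method: it runs a two-sided two-proportion z-test, and the function name, group sizes, and retention counts are all made-up assumptions.

```python
import math


def two_proportion_z_test(retained_a, n_a, retained_b, n_b):
    """Two-sided z-test for a difference in retention rates.

    Group A is the control, group B receives the new notifications.
    All inputs are hypothetical counts chosen for illustration.
    """
    p_a = retained_a / n_a
    p_b = retained_b / n_b
    # Pooled retention rate under the null hypothesis of no difference.
    pooled = (retained_a + retained_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical experiment: 10,000 users per group.
z_score, p_value = two_proportion_z_test(4000, 10000, 4200, 10000)
print(f"z = {z_score:.2f}, p = {p_value:.4f}")
```

In an interview, the surrounding discussion usually matters as much as the test itself: how users are randomized, which retention window is measured, and how large the sample needs to be before the result is read.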
Twitter Onsite Interview
The onsite interview process involves one-on-one interviews with 5 to 6 people (usually data scientists and data engineers from Twitter) lasting 45 minutes each. It requires whiteboard coding as well as algorithm questions that may range from machine learning to statistics/probability and product-based questions. The rounds typically include:
- Statistics and probability interview (mainly case study)
- Machine learning and experimental modeling systems interview
- Product intuition (case study)
- Data structures and/or system design interview
- Interview with a data scientist over lunch
- Behavioral interview revolving mainly around previous experience and culture fit
The on-site interview covers a wide range of technical concepts. Study experiment and A/B test design, SQL, machine learning, and product questions.
Let's say we want to build a naive recommender. We're given two tables: one called `friends`, with `user_id` and `friend_id` columns representing each user's friends, and another called `page_likes`, with `user_id` and `page_id` columns representing the pages each user liked.
Write an SQL query to create a metric that recommends pages for each user based on the pages their friends liked.
Note: It shouldn't recommend pages that the user already likes.
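One possible approach to this question, sketched here rather than given as the official solution, is to score each candidate page by the number of a user's friends who like it, then exclude pages the user already likes with an anti-join. The example runs the query against an in-memory SQLite database; the sample rows and the `num_friend_likes` column name are assumptions made for illustration.

```python
import sqlite3

# Score = number of distinct friends who like the page.
# The LEFT JOIN ... IS NULL pattern filters out pages the user already likes.
QUERY = """
SELECT f.user_id,
       pl.page_id,
       COUNT(DISTINCT f.friend_id) AS num_friend_likes
FROM friends AS f
JOIN page_likes AS pl
  ON pl.user_id = f.friend_id
LEFT JOIN page_likes AS own
  ON own.user_id = f.user_id
 AND own.page_id = pl.page_id
WHERE own.page_id IS NULL
GROUP BY f.user_id, pl.page_id
ORDER BY f.user_id, num_friend_likes DESC, pl.page_id
"""


def recommend_pages(conn):
    """Return (user_id, page_id, num_friend_likes) rows."""
    return conn.execute(QUERY).fetchall()


# Hypothetical sample data, used only to exercise the query.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE friends (user_id INTEGER, friend_id INTEGER);
CREATE TABLE page_likes (user_id INTEGER, page_id INTEGER);
INSERT INTO friends VALUES (1, 2), (1, 3), (2, 1), (3, 1);
INSERT INTO page_likes VALUES (1, 10), (2, 10), (2, 20), (3, 20), (3, 30);
""")

rows = recommend_pages(conn)
print(rows)  # [(1, 20, 2), (1, 30, 1), (3, 10, 1)]
```

User 2 gets no recommendation here because their only friend likes a page they already like. A raw friend count is a deliberately naive metric; normalizing by the number of friends is one natural follow-up to discuss.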
Sample Twitter Data Science Interview Questions
- What would you change in the Twitter app? How would you test whether the proposed change is effective?
- Design a system to find the top ten Twitter hashtags in the most recent 1 minute, 10 minutes, and 1 hour.
- How would you measure user engagement given all of Twitter’s analytics and tracking data?
- Write a query in SQL to
- Given a two-column file with user codes and counts, retrieve the top-k users based on a score that is a function of the number of times they appear in the file and these counts.
- Given a list of all followers in the format 123, 345;234, 678;345, 123;… where the first column contains the ID of the follower and the second the ID of the account being followed, find all mutual follows (the pair 123, 345 in the example above). Then do the same for the case where the list does not fit into memory.
- If you got the job at Twitter and had access to all of its data, what kind of data analysis would you like to perform?
- How can you represent a tree-based structure with a SQL query?
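As an illustration of the mutual-follows question above, here is a minimal in-memory sketch. It assumes the input is a single string in the format quoted in the question, and the function name is hypothetical. Each edge is stored in a set, and a pair counts as mutual when the reversed edge is also present.

```python
def mutual_follows(raw):
    """Return sorted (smaller_id, larger_id) pairs that follow each other.

    `raw` is assumed to look like "123, 345;234, 678;345, 123".
    """
    edges = set()
    for chunk in raw.split(";"):
        chunk = chunk.strip()
        if not chunk:
            continue  # tolerate a trailing semicolon
        follower, followed = (int(part) for part in chunk.split(","))
        edges.add((follower, followed))

    # Normalize each mutual pair so it is reported only once.
    pairs = {
        (min(a, b), max(a, b))
        for a, b in edges
        if a != b and (b, a) in edges
    }
    return sorted(pairs)


print(mutual_follows("123, 345;234, 678;345, 123"))  # [(123, 345)]
```

For the follow-up where the list does not fit into memory, one option worth discussing is to write each edge to a partition file chosen by hashing the normalized pair, so that both directions of a pair land in the same partition, and then run this same in-memory check on one partition at a time.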
Interested in more Twitter interview questions with the solutions? Check us out at Interview Query.