The Facebook data science interview consists of multiple technical and business case questions, with a heavy focus on applying technical knowledge to business scenarios. Facebook data scientists are expected to work cross-functionally and to explore, analyze, and aggregate large data sets to provide actionable information.
Facebook Interview Questions
The Facebook data science interview questions fall into four main areas: product and business sense, technical data analysis (SQL, pandas), statistics and probability, and modeling knowledge together with an understanding of how to apply it to data.
The technical screen always consists of one product question and one data analysis question. Be sure to prepare for both in order to advance to the onsite.
The onsite interview will include questions like these:
Product and Business Sense
- How would you create a process to identify fake news postings on Facebook? Define a metric.
- Facebook sees that likes are up 10% year over year. Why could this be?
- How can Facebook figure out when users falsify their attended schools?
- If 70% of Facebook users on iOS use Instagram, but only 35% of Facebook users on Android use Instagram, how would you investigate the discrepancy?
Technical Data Analysis
- Write a query to map nicknames (Pete, Andy, Nick, Rob, etc.) to real names.
- Write a query to produce a histogram of user comments.
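As a rough illustration of the histogram question above, here is a minimal sketch that runs one possible query against an in-memory SQLite database. The `users` and `comments` tables, their columns, and the sample rows are all assumptions made up for this example; an interviewer would supply the real schema. The idea is to count comments per user first, then count how many users fall into each comment-count bucket.

```python
import sqlite3

# Hypothetical schema and sample rows: these names are assumptions for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY);
    CREATE TABLE comments (id INTEGER PRIMARY KEY, user_id INTEGER);
    INSERT INTO users (id) VALUES (1), (2), (3), (4);
    INSERT INTO comments (user_id) VALUES (1), (1), (2), (2), (3);
""")

HISTOGRAM_QUERY = """
    WITH per_user AS (
        -- LEFT JOIN keeps users who never commented (comment_count = 0)
        SELECT u.id AS user_id, COUNT(c.id) AS comment_count
        FROM users AS u
        LEFT JOIN comments AS c ON c.user_id = u.id
        GROUP BY u.id
    )
    SELECT comment_count, COUNT(*) AS num_users
    FROM per_user
    GROUP BY comment_count
    ORDER BY comment_count
"""

# Each row is (number of comments, number of users with that many comments).
histogram = conn.execute(HISTOGRAM_QUERY).fetchall()
for comment_count, num_users in histogram:
    print(comment_count, "#" * num_users)
```

The `LEFT JOIN` is a deliberate choice: an inner join would silently drop users with zero comments, which is often the largest bucket.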
Statistics and Probability
- Let's say you're playing a dice game. You have two dice. What's the probability of rolling at least one 3?
- What do you think the distribution of time spent per day on Facebook looks like?
- How would you design a classifier to send email notifications on photo posts?
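For the dice question above, one way to check an answer is to compute it two ways. The sketch below assumes two fair six-sided dice (the question does not say so explicitly). It enumerates all outcomes and also applies the complement rule, 1 minus the probability that neither die shows a 3.

```python
from fractions import Fraction
from itertools import product

# Assumption: two fair six-sided dice, so all 36 ordered outcomes are equally likely.
outcomes = list(product(range(1, 7), repeat=2))

# Approach 1: count the outcomes that contain at least one 3.
favorable = [roll for roll in outcomes if 3 in roll]
p_enumerated = Fraction(len(favorable), len(outcomes))

# Approach 2: complement rule, 1 - P(no 3 on either die).
p_complement = 1 - Fraction(5, 6) ** 2

print(p_enumerated, p_complement)  # both are 11/36
```

The complement form is usually the quicker one to state in an interview, and it extends directly to more dice.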
Example Facebook Product Interview Question and Solution
Let's say you're working on Facebook Groups.
A product manager decides to add threading to comments on group posts. We see comments per user increase by 10% but posts go down 2%. Why would that be?
Additionally, what metrics would prove your hypotheses?
When threading is added to comments, we can theorize that the following changes happen within group posts:
- Ideas become bucketized and structured, so it becomes easier to search, navigate, and respond to different users with threads.
- Responders are forced to stay on the thread as opposed to posting a new comment or post. This can create new dynamics within user workflows and notifications with only users within the specific thread being notified as opposed to the entire comment group and original poster.
- Looking more closely at the notifications side of the change, we can theorize that threading pushes fewer notifications to the original poster (and more to the top-level commenter). This would mean fewer push notifications bringing people back to their devices, and therefore less posting in general. More discussion would happen within the threads, so the notifications that are sent would tend to drive more comments.
- The order in which your comment appears within the post becomes irrelevant, so there is no pressure to reply within a short window of time. This encourages deeper discussions as the time barrier disappears.
- Late readers may find that their question has already been asked and answered, which prevents duplicate posts.
- Responding to specific comments becomes possible as opposed to regular comments which are first in first out.
You can see that all of these workflows drive an increase in comments within existing posts but discourage the creation of new posts.
We can measure this with a before-and-after analysis or an island test (where one encapsulated community gets the threaded comments and another does not).
If we look at metrics like the number of new posts per group member and the number of comments per post in both the before-and-after analysis and the island test, we can see whether the effects are large enough to account for the observed change.
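One way to judge whether a difference in these metrics is more than noise is a permutation test. The sketch below is a simplified illustration, not a description of how Facebook runs its experiments. It assumes the island test yields one metric value per group, and the numbers, the function name `permutation_p_value`, and its parameters are all hypothetical.

```python
import random
from statistics import mean

def permutation_p_value(control, treatment, n_permutations=2000, seed=0):
    """Two-sided permutation test for a difference in means.

    Returns (observed difference, estimated p-value).
    """
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    observed = mean(treatment) - mean(control)
    pooled = list(control) + list(treatment)
    n_control = len(control)
    extreme = 0
    for _ in range(n_permutations):
        # Randomly reassign group labels and recompute the difference.
        rng.shuffle(pooled)
        diff = mean(pooled[n_control:]) - mean(pooled[:n_control])
        if abs(diff) >= abs(observed):
            extreme += 1
    return observed, extreme / n_permutations

# Made-up comments-per-post values, one per group, for illustration only.
comments_per_post_control = [4.1, 3.8, 4.4, 4.0, 3.9, 4.2]
comments_per_post_threaded = [4.6, 4.3, 4.9, 4.4, 4.5, 4.7]

diff, p_value = permutation_p_value(
    comments_per_post_control, comments_per_post_threaded
)
print(f"difference in means: {diff:.2f}, p-value: {p_value:.4f}")
```

The same function could be applied to new posts per group member. A permutation test is used here only because it needs no distributional assumptions and no third-party libraries; a t-test or a regression-based comparison would be equally reasonable choices.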