Expert Institute Data Engineer Interview Guide

1. Introduction

Getting ready for a Data Engineer interview at Expert Institute? The Expert Institute Data Engineer interview process typically covers 4–6 question topics and evaluates skills in areas like data pipeline design, ETL processes, data cleaning and quality, and communicating technical insights to diverse audiences. Interview preparation is essential for this role at Expert Institute, as Data Engineers are expected to architect robust, scalable solutions for complex data challenges, ensure high data quality, and collaborate closely with both technical and non-technical stakeholders to drive impactful business decisions.

In preparing for the interview, you should:

  • Understand the core skills necessary for Data Engineer positions at Expert Institute.
  • Gain insights into Expert Institute’s Data Engineer interview structure and process.
  • Practice real Expert Institute Data Engineer interview questions to sharpen your performance.

At Interview Query, we regularly analyze interview experience data shared by candidates. This guide uses that data to provide an overview of the Expert Institute Data Engineer interview process, along with sample questions and preparation tips tailored to help you succeed.

1.2. What Expert Institute Does

Expert Institute is a leading provider of expert consulting and legal technology solutions, connecting attorneys with subject matter experts across a wide range of industries to support litigation and casework. The company leverages advanced technology and data-driven processes to streamline expert search, case analysis, and research for legal professionals. As a Data Engineer, you will play a pivotal role in building and optimizing data infrastructure, enabling the company to deliver actionable insights and maintain its reputation for quality and efficiency in the legal services industry.

1.3. What does an Expert Institute Data Engineer do?

As a Data Engineer at Expert Institute, you will design, build, and maintain scalable data pipelines that support the company's data-driven products and analytics initiatives. You will work closely with data scientists, analysts, and product teams to ensure reliable data collection, storage, and accessibility across various platforms. Key responsibilities include developing ETL processes, optimizing database performance, and ensuring data quality and security. This role is critical in enabling Expert Institute to leverage data effectively, supporting its mission to connect legal professionals with expert witnesses and insights. Candidates can expect to contribute to the foundation of data infrastructure that drives strategic decision-making and product innovation.

2. Overview of the Expert Institute Interview Process

2.1 Stage 1: Application & Resume Review

The process begins with a thorough screening of your application and resume, where the talent acquisition team assesses your experience in designing, building, and optimizing data pipelines, as well as your proficiency with ETL processes, data warehousing, and cloud platforms. Emphasis is placed on your technical background in SQL, Python, and scalable data architecture, along with your ability to ensure data quality and reliability across complex systems. To prepare, ensure your resume clearly highlights relevant data engineering projects, quantifiable achievements, and familiarity with modern data tools.

2.2 Stage 2: Recruiter Screen

Next, you’ll have a phone or video call with a recruiter focused on your motivation for joining Expert Institute, your understanding of the company’s mission, and a high-level review of your technical expertise. Expect questions about your communication skills, your approach to making data accessible to non-technical stakeholders, and your ability to work collaboratively within cross-functional teams. Preparation should include articulating your career trajectory, readiness for the role, and interest in the company’s data-driven culture.

2.3 Stage 3: Technical/Case/Skills Round

The technical interview is typically conducted by a senior data engineer or engineering manager and centers on your ability to design and implement robust data pipelines, troubleshoot transformation failures, and optimize data workflows for performance and scalability. You may be asked to discuss real-world scenarios such as building ETL pipelines, managing messy datasets, integrating multiple data sources, and system design challenges for digital platforms. Preparation should focus on demonstrating hands-on expertise with SQL, Python, data modeling, cloud storage solutions, and presenting clear strategies for data cleaning, aggregation, and pipeline reliability.

2.4 Stage 4: Behavioral Interview

This round, often led by a hiring manager or team lead, evaluates your interpersonal skills, adaptability, and alignment with Expert Institute’s values. You’ll discuss your approach to overcoming hurdles in data projects, collaborating across departments, and communicating technical concepts to diverse audiences. Be ready to share examples of how you’ve presented complex insights, made data actionable for business partners, and handled feedback or setbacks in previous roles. Preparation involves reflecting on your strengths, weaknesses, and how you contribute to a positive team culture.

2.5 Stage 5: Final/Onsite Round

The final interview round may combine technical and behavioral assessments with a panel of data team leaders, product managers, and possibly executives. Expect deeper dives into your experience designing scalable data warehouses, building reporting pipelines under budget constraints, and ensuring data quality within large-scale systems. You may be asked to whiteboard solutions, walk through end-to-end pipeline architecture, and discuss how you would approach specific business challenges relevant to Expert Institute. Preparation should include reviewing your portfolio, practicing clear explanations of your design decisions, and preparing to discuss the impact of your work on business outcomes.

2.6 Stage 6: Offer & Negotiation

If successful, you’ll receive an offer from the recruiter, followed by discussions regarding compensation, benefits, start date, and team placement. This stage is typically straightforward, but you should be prepared to negotiate based on your experience and the scope of responsibilities.

2.7 Average Timeline

The typical Expert Institute Data Engineer interview process spans 3–4 weeks from initial application to final offer, with each stage usually taking about a week to complete. Fast-track candidates with highly relevant experience or internal referrals may move through the process in as little as 2 weeks, while standard pacing allows ample time for technical assessments and team interviews. Scheduling flexibility and prompt communication can help expedite the process.

Now, let’s explore the types of interview questions you can expect throughout these stages.

3. Expert Institute Data Engineer Sample Interview Questions

3.1. Data Pipeline Design & System Architecture

Expect questions that assess your ability to design, build, and optimize robust data pipelines and scalable systems. Focus on demonstrating your understanding of ETL processes, data warehousing, and system trade-offs. Be ready to discuss design choices, fault tolerance, and how you ensure reliability and performance.

3.1.1 Design an end-to-end data pipeline to process and serve data for predicting bicycle rental volumes
Describe how you would architect a pipeline from ingestion to prediction serving, including data sources, transformation steps, storage, and monitoring. Highlight scalability and error handling.

Example answer: I’d ingest raw rental data via batch or streaming, clean and transform with Spark, store in a cloud warehouse, and serve predictions via an API. I’d monitor latency and set up alerts for data drift.
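
To make the batch flavor of this answer concrete, here is a minimal Python/pandas sketch of the ingest, clean, and aggregate steps. The file paths and column names (rental_ts, station_id, rentals) are hypothetical, and writing Parquet assumes a Parquet engine such as pyarrow is available.

```python
# Minimal batch sketch of the ingest -> clean -> aggregate steps described above.
# Column names (rental_ts, station_id, rentals) are hypothetical placeholders.
import pandas as pd

def build_hourly_features(raw_path: str, out_path: str) -> pd.DataFrame:
    df = pd.read_csv(raw_path, parse_dates=["rental_ts"])

    # Basic cleaning: drop rows missing a timestamp, remove impossible counts.
    df = df.dropna(subset=["rental_ts"])
    df = df[df["rentals"] >= 0]

    # Aggregate to hourly volumes per station -- the granularity a demand model trains on.
    hourly = (
        df.set_index("rental_ts")
          .groupby("station_id")["rentals"]
          .resample("1h")
          .sum()
          .reset_index()
    )

    # Persist as Parquet so a downstream training/serving job can pick it up.
    hourly.to_parquet(out_path, index=False)
    return hourly

if __name__ == "__main__":
    build_hourly_features("raw_rentals.csv", "hourly_rentals.parquet")
```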

3.1.2 Design a scalable ETL pipeline for ingesting heterogeneous data from Skyscanner's partners
Outline the steps for handling diverse schemas, data validation, and transformation logic. Discuss how you would ensure data consistency and manage schema evolution.

Example answer: I’d use schema registry tools and modular ETL jobs, validating partner data on arrival, logging discrepancies, and transforming to a unified internal format before warehousing.
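
A minimal sketch of the validate-and-normalize step, assuming simple per-partner field mappings; the partner names, field names, and required-field list are all hypothetical.

```python
# Minimal sketch of validating heterogeneous partner records and mapping them
# to one internal format. Field names and partner mappings are hypothetical.
from typing import Any

REQUIRED_FIELDS = {"origin", "destination", "price"}

# Per-partner column-name mappings (hypothetical examples).
PARTNER_MAPPINGS = {
    "partner_a": {"from": "origin", "to": "destination", "fare": "price"},
    "partner_b": {"src": "origin", "dst": "destination", "amount": "price"},
}

def normalize(partner: str, record: dict[str, Any]) -> dict[str, Any]:
    mapping = PARTNER_MAPPINGS[partner]
    unified = {mapping[k]: v for k, v in record.items() if k in mapping}

    missing = REQUIRED_FIELDS - unified.keys()
    if missing:
        # In a real pipeline this would go to a dead-letter queue / discrepancy log.
        raise ValueError(f"{partner} record missing fields: {missing}")
    return unified

print(normalize("partner_b", {"src": "LHR", "dst": "JFK", "amount": 420.0}))
```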

3.1.3 Design a data warehouse for a new online retailer
Explain your approach to modeling transactional, product, and customer data. Discuss partitioning, indexing, and how you’d support analytics and reporting.

Example answer: I’d use a star schema with fact tables for transactions and dimension tables for products and customers, partition by date, and optimize for frequent queries.
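
A minimal sketch of that star schema as DDL. sqlite3 is used here only so the snippet runs as-is; a real warehouse would add its own partitioning, distribution, and indexing options, and the table and column names are illustrative.

```python
# Minimal star-schema sketch: one fact table plus product/customer/date dimensions.
# sqlite3 keeps the DDL runnable; table and column names are illustrative.
import sqlite3

DDL = """
CREATE TABLE dim_customer (customer_id INTEGER PRIMARY KEY, name TEXT, region TEXT);
CREATE TABLE dim_product  (product_id  INTEGER PRIMARY KEY, name TEXT, category TEXT);
CREATE TABLE dim_date     (date_id     INTEGER PRIMARY KEY, full_date TEXT, month TEXT);
CREATE TABLE fact_sales (
    sale_id     INTEGER PRIMARY KEY,
    customer_id INTEGER REFERENCES dim_customer(customer_id),
    product_id  INTEGER REFERENCES dim_product(product_id),
    date_id     INTEGER REFERENCES dim_date(date_id),
    quantity    INTEGER,
    revenue     REAL
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(DDL)
print("star schema created")
```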

3.1.4 Design a reporting pipeline for a major tech company using only open-source tools under strict budget constraints
Discuss tool selection, integration, and how you would maintain reliability and scalability with limited resources.

Example answer: I’d leverage Airflow for orchestration, PostgreSQL for storage, and Metabase for reporting. I’d containerize components for portability and automate error notifications.
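
A skeleton of that orchestration layer, assuming Airflow 2.4+; the DAG id, task callables, and alerting hook are hypothetical stubs.

```python
# Skeleton of the open-source reporting pipeline described above, assuming Airflow 2.4+.
# Task callables are stubs; connection details and table names are hypothetical.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def extract():
    ...  # pull source data into PostgreSQL staging tables

def transform():
    ...  # build the reporting tables that Metabase dashboards read

def notify_on_failure(context):
    ...  # send an alert (email/Slack) with the failed run's details

with DAG(
    dag_id="nightly_reporting",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
    on_failure_callback=notify_on_failure,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    transform_task = PythonOperator(task_id="transform", python_callable=transform)
    extract_task >> transform_task
```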

3.1.5 Design a robust, scalable pipeline for uploading, parsing, storing, and reporting on customer CSV data
Walk through ingestion, parsing, error handling, and reporting. Emphasize how you’d handle malformed files and ensure data integrity.

Example answer: I’d set up a queue for uploads, parse CSVs with validation, store in a normalized warehouse, and trigger reporting jobs with error logs for failed parses.
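
A minimal sketch of the parse-and-validate step for uploaded CSVs; the required columns are hypothetical, and malformed rows are logged rather than silently dropped.

```python
# Minimal sketch of parsing uploaded customer CSVs with validation and error logging.
# Required columns are hypothetical; bad rows are captured for the failure log.
import csv
import logging

logging.basicConfig(level=logging.INFO)
REQUIRED = ["customer_id", "email", "signup_date"]

def parse_upload(path: str) -> tuple[list[dict], list[dict]]:
    good, bad = [], []
    with open(path, newline="") as f:
        reader = csv.DictReader(f)
        for i, row in enumerate(reader, start=2):  # line 1 is the header
            if any(not row.get(col) for col in REQUIRED):
                bad.append({"line": i, "row": row})
                logging.warning("Malformed row at line %d: %s", i, row)
            else:
                good.append(row)
    return good, bad
```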

3.2. Data Quality, Cleaning, and Transformation

You’ll be asked about strategies for cleaning, validating, and transforming large, messy datasets. Focus on practical approaches, automation, and how you communicate quality issues and trade-offs to stakeholders.

3.2.1 Describing a real-world data cleaning and organization project
Share your process for profiling, cleaning, and documenting a challenging dataset. Discuss tooling and reproducibility.

Example answer: I profiled missing values, applied imputation where possible, documented each cleaning step in scripts, and flagged unreliable data in final reports.
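
A minimal, script-based sketch of that profiling and imputation workflow; the column names and imputation choices are hypothetical.

```python
# Reproducible sketch of the profiling + imputation steps described above.
# Column names are hypothetical; keeping every step in code makes the cleaning re-runnable.
import pandas as pd

def clean(df: pd.DataFrame) -> pd.DataFrame:
    # Profile: report missing-value counts per column before touching anything.
    print(df.isna().sum())

    out = df.copy()
    # Numeric gaps: median imputation; categorical gaps: explicit "unknown" flag.
    out["order_value"] = out["order_value"].fillna(out["order_value"].median())
    out["segment"] = out["segment"].fillna("unknown")

    # Keep a marker column so downstream reports can flag imputed rows.
    out["was_imputed"] = df[["order_value", "segment"]].isna().any(axis=1)
    return out
```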

3.2.2 How would you approach improving the quality of airline data?
Describe how you’d identify and fix common data quality issues, including missing values, duplicates, and outliers.

Example answer: I’d start with exploratory analysis, set up automated checks for common errors, and implement regular audits and feedback loops with upstream data providers.
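
One way to express those automated checks as a small, repeatable report; the column names and thresholds are hypothetical.

```python
# Sketch of automated checks for the issues mentioned above: missing values,
# duplicates, and implausible outliers. Column names and thresholds are hypothetical.
import pandas as pd

def quality_report(flights: pd.DataFrame) -> dict:
    return {
        "missing_departure_time": int(flights["departure_time"].isna().sum()),
        "duplicate_flights": int(
            flights.duplicated(subset=["flight_no", "departure_date"]).sum()
        ),
        "implausible_delays": int((flights["delay_minutes"] < -60).sum()),
    }
```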

3.2.3 Ensuring data quality within a complex ETL setup
Explain how you monitor and enforce quality across multiple ETL stages and data sources.

Example answer: I’d use validation checkpoints, alerting for anomalies, and periodic reconciliation reports to ensure consistency across the pipeline.

3.2.4 Challenges of specific student test score layouts, recommended formatting changes for enhanced analysis, and common issues found in "messy" datasets
Discuss how you’d reformat and clean non-standard data for analysis.

Example answer: I’d standardize field layouts, remove inconsistencies, and automate formatting conversions to streamline downstream analytics.
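
A minimal sketch of one such formatting change: reshaping a wide, one-column-per-subject score layout into a tidy long format; the column names are hypothetical.

```python
# Sketch of reshaping a "wide" test-score layout (one column per subject) into a tidy
# long format that is easier to aggregate. Column names are hypothetical.
import pandas as pd

wide = pd.DataFrame({
    "student_id": [1, 2],
    "math_score": [88, 75],
    "reading_score": [92, 81],
})

long = wide.melt(id_vars="student_id", var_name="subject", value_name="score")
long["subject"] = long["subject"].str.replace("_score", "", regex=False)
print(long)
```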

3.2.5 How would you systematically diagnose and resolve repeated failures in a nightly data transformation pipeline?
Describe your approach to root cause analysis, logging, and recovery strategies.

Example answer: I’d review logs for error patterns, isolate failing components, and implement retry logic and alerting for critical failures.
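
A minimal sketch of the retry-and-alert idea; the alerting hook is a stub, and the backoff settings are hypothetical.

```python
# Retry transient failures with exponential backoff, log every attempt, and escalate
# once retries are exhausted. The alert hook is a stub; settings are hypothetical.
import logging
import time

logging.basicConfig(level=logging.INFO)

def send_alert(message: str) -> None:
    logging.error("ALERT: %s", message)  # stand-in for email/Slack/pager integration

def run_with_retries(step, max_attempts: int = 3, base_delay: float = 5.0):
    for attempt in range(1, max_attempts + 1):
        try:
            return step()
        except Exception:
            logging.exception("Step failed (attempt %d/%d)", attempt, max_attempts)
            if attempt == max_attempts:
                send_alert(f"{step.__name__} failed after {max_attempts} attempts")
                raise
            time.sleep(base_delay * 2 ** (attempt - 1))  # exponential backoff
```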

3.3. Data Modeling & Analytics

These questions probe your ability to model data, combine diverse sources, and extract actionable insights. Emphasize your analytical thinking and how you select appropriate models and metrics.

3.3.1 You’re tasked with analyzing data from multiple sources, such as payment transactions, user behavior, and fraud detection logs. How would you approach solving a data analytics problem involving these diverse datasets? What steps would you take to clean, combine, and extract meaningful insights that could improve the system's performance?
Explain your approach to joining disparate datasets, resolving schema conflicts, and generating insights.

Example answer: I’d normalize schemas, join on common keys, handle missing values, and use aggregation techniques to surface performance drivers.
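
A minimal pandas sketch of the normalize, join, and aggregate flow; the DataFrame shapes, join key, and segment column are hypothetical.

```python
# Sketch of normalizing keys and joining the three sources mentioned above.
# DataFrame and column names are hypothetical.
import pandas as pd

def combine(payments: pd.DataFrame, behavior: pd.DataFrame, fraud: pd.DataFrame) -> pd.DataFrame:
    # Normalize the join key so the schemas line up.
    for df in (payments, behavior, fraud):
        df["user_id"] = df["user_id"].astype(str).str.strip()

    merged = (
        payments
        .merge(behavior, on="user_id", how="left")
        .merge(fraud, on="user_id", how="left")
    )
    merged["is_flagged"] = merged["is_flagged"].fillna(False)

    # Surface a simple performance driver: fraud-flag rate by user segment.
    return merged.groupby("segment")["is_flagged"].mean().reset_index()
```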

3.3.2 Write a function to return the names and ids for ids that we haven't scraped yet.
Describe how you’d efficiently identify and retrieve new records from a large dataset.

Example answer: I’d compare incoming IDs against existing ones using set operations, then return the difference for further processing.
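
One possible implementation, assuming the inputs are a list of records and a set of already-scraped ids (the exact shapes are left open by the prompt).

```python
# Return (id, name) pairs whose ids are not in the already-scraped set.
# Input shapes are assumptions, since the prompt does not specify them.
def find_unscraped(all_items: list[dict], scraped_ids: set[int]) -> list[dict]:
    return [
        {"id": item["id"], "name": item["name"]}
        for item in all_items
        if item["id"] not in scraped_ids
    ]

items = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}, {"id": 3, "name": "c"}]
print(find_unscraped(items, scraped_ids={1, 3}))  # [{'id': 2, 'name': 'b'}]
```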

3.3.3 User Experience Percentage
Discuss calculating user engagement metrics and presenting actionable insights.

Example answer: I’d define engagement criteria, calculate percentages using SQL or Python, and visualize results for stakeholders.
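
A minimal sketch of one such engagement metric; the definition of an "engaged" user and the column names are hypothetical.

```python
# Engagement percentage: share of users with at least one qualifying event in the period.
# The event type and column names are hypothetical.
import pandas as pd

def engagement_pct(events: pd.DataFrame, all_user_ids: pd.Series) -> float:
    engaged = events.loc[events["event_type"] == "session_start", "user_id"].nunique()
    return round(100.0 * engaged / all_user_ids.nunique(), 2)
```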

3.3.4 What kind of analysis would you conduct to recommend changes to the UI?
Explain how you’d analyze user behavior data to identify UI improvement opportunities.

Example answer: I’d track user flows, identify drop-off points, and correlate changes with key engagement metrics to recommend UI updates.

3.3.5 You're analyzing political survey data to understand how to help a particular candidate whose campaign team you are on. What kind of insights could you draw from this dataset?
Describe how you’d extract actionable insights from survey data, including segmentation and trend analysis.

Example answer: I’d segment responses by demographics, identify key issues, and present recommendations to optimize campaign messaging.

3.4. Scalability, Performance, and Tooling

Expect questions about handling large-scale data, choosing appropriate technologies, and optimizing for speed and reliability. Be ready to discuss trade-offs and justify your technical choices.

3.4.1 Modifying a billion rows
Describe efficient strategies for updating massive datasets without downtime or data loss.

Example answer: I’d use batch updates, partitioned processing, and leverage database features like bulk loading and parallelization.
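
A minimal sketch of a chunked, key-ranged backfill; sqlite3 is used only so the snippet runs as-is, and a production warehouse would lean on its own bulk-load and partitioning features. The table and column names are hypothetical.

```python
# Chunked backfill keyed on the primary key, so each transaction stays small and the
# job is resumable. sqlite3 keeps it runnable; names and batch size are hypothetical.
import sqlite3

def backfill_in_batches(conn: sqlite3.Connection, batch_size: int = 50_000) -> None:
    (max_id,) = conn.execute("SELECT COALESCE(MAX(id), 0) FROM events").fetchone()
    last_id = 0
    while last_id < max_id:
        with conn:  # one short transaction per batch keeps locks brief
            conn.execute(
                "UPDATE events SET country = UPPER(country) WHERE id > ? AND id <= ?",
                (last_id, last_id + batch_size),
            )
        last_id += batch_size
```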

3.4.2 Choosing Between Python and SQL
Discuss how you decide between scripting and querying languages for different data engineering tasks.

Example answer: For simple aggregations, I use SQL; for complex transformations, I prefer Python. I weigh maintainability and performance for each use case.

3.4.3 System design for a digital classroom service
Explain how you’d architect a scalable system to support diverse user needs and high data volume.

Example answer: I’d design modular services for user management, content delivery, and analytics, ensuring horizontal scalability and robust monitoring.

3.4.4 Design a data pipeline for hourly user analytics
Outline how you’d process and aggregate user data in near real-time.

Example answer: I’d use streaming data ingestion, windowed aggregations, and schedule hourly batch jobs for summary reporting.
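
A minimal, dependency-free sketch of a tumbling one-hour window over an event stream, as a stand-in for the windowed aggregation described above; the event shape is hypothetical.

```python
# Tumbling one-hour window: truncate each event timestamp to the hour and count
# events per (hour, user). The event shape is hypothetical.
from collections import Counter
from datetime import datetime

def hourly_counts(events: list[dict]) -> Counter:
    counts: Counter = Counter()
    for e in events:
        ts = datetime.fromisoformat(e["ts"])
        window = ts.replace(minute=0, second=0, microsecond=0)  # truncate to the hour
        counts[(window, e["user_id"])] += 1
    return counts

events = [
    {"ts": "2024-05-01T10:15:00", "user_id": "u1"},
    {"ts": "2024-05-01T10:45:00", "user_id": "u1"},
    {"ts": "2024-05-01T11:05:00", "user_id": "u2"},
]
print(hourly_counts(events))
```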

3.4.5 Designing a pipeline for ingesting media into LinkedIn's built-in search
Describe how you’d build a scalable ingestion and indexing pipeline for search functionality.

Example answer: I’d set up a distributed ingestion service, preprocess media, and index metadata for fast search retrieval.

3.5 Behavioral Questions

3.5.1 Tell me about a time you used data to make a decision.
Describe a situation where your analysis directly influenced a business outcome. Focus on the impact and your communication with stakeholders.

3.5.2 Describe a challenging data project and how you handled it.
Share a project with significant hurdles, how you overcame them, and what you learned.

3.5.3 How do you handle unclear requirements or ambiguity?
Explain your approach to clarifying goals, gathering information, and iterating with stakeholders.

3.5.4 Talk about a time when you had trouble communicating with stakeholders. How were you able to overcome it?
Discuss the strategies you used to bridge communication gaps and ensure alignment.

3.5.5 Tell me about a situation where you had to influence stakeholders without formal authority to adopt a data-driven recommendation.
Share how you built trust and presented evidence to persuade others.

3.5.6 Describe a time you had to negotiate scope creep when two departments kept adding “just one more” request. How did you keep the project on track?
Explain how you managed expectations, prioritized tasks, and communicated trade-offs.

3.5.7 You’re given a dataset that’s full of duplicates, null values, and inconsistent formatting. The deadline is soon, but leadership wants insights from this data for tomorrow’s decision-making meeting. What do you do?
Describe your triage process, focusing on must-fix issues and transparency about data quality.

3.5.8 Tell me about a time you delivered critical insights even though 30% of the dataset had nulls. What analytical trade-offs did you make?
Discuss your approach to missing data, the methods used, and how you communicated uncertainty.

3.5.9 Give an example of automating recurrent data-quality checks so the same dirty-data crisis doesn’t happen again.
Share your process for building automation and the impact on team efficiency.

3.5.10 Describe how you prioritized backlog items when multiple executives marked their requests as “high priority.”
Explain your prioritization framework and how you managed stakeholder relationships.

4. Preparation Tips for Expert Institute Data Engineer Interviews

4.1 Company-specific tips:

Immerse yourself in Expert Institute’s mission and business model, focusing on how data engineering supports their core function of connecting legal professionals with subject matter experts. Understand the unique challenges of the legal technology sector, such as managing sensitive data, enabling fast expert search, and supporting analytics for casework. Highlight your awareness of the importance of data integrity, security, and compliance in a legal context during your conversations.

Familiarize yourself with the company’s products and recent innovations in expert search and legal analytics. Be prepared to discuss how robust data pipelines and high-quality data infrastructure can drive efficiency, accuracy, and actionable insights for attorneys and case managers. Articulate how your technical contributions as a Data Engineer would support Expert Institute’s reputation for reliability and client service.

Demonstrate your ability to communicate complex technical concepts to non-technical stakeholders, a crucial skill at Expert Institute. Practice explaining how your data solutions can empower legal professionals and decision-makers, making technical outcomes directly relevant to business goals. Prepare examples of past collaborations with cross-functional teams, especially where you bridged the gap between engineering and business or legal stakeholders.

4.2 Role-specific tips:

Showcase your expertise in designing and building scalable, reliable data pipelines. Prepare to discuss end-to-end pipeline architecture, including data ingestion, transformation, storage, and serving layers. Emphasize your experience with ETL processes, data modeling, and optimizing workflows for both batch and streaming data. Be ready to walk through real-world scenarios where you handled diverse data sources, schema evolution, or large-scale integrations.

Highlight your strategies for ensuring data quality and reliability across complex systems. Discuss your approach to data cleaning, validation, and monitoring, especially in environments where data can be messy or inconsistent. Provide examples of how you’ve implemented automated checks, handled data anomalies, and communicated quality issues to stakeholders under tight deadlines.

Demonstrate your proficiency with core data engineering technologies, particularly SQL and Python, as well as your experience with cloud platforms and open-source tools. Be prepared to justify your technical choices for storage, orchestration, and reporting, especially when working within budget or resource constraints. Share how you’ve balanced performance, scalability, and maintainability in previous projects.

Prepare to discuss your experience with data modeling and analytics, especially your ability to join and analyze datasets from multiple sources to drive actionable insights. Practice articulating your thought process when selecting models, metrics, and trade-offs for analytics tasks. Be ready to present examples where your work directly improved business or product outcomes.

Reflect on your ability to troubleshoot and optimize data workflows. Be ready to share how you’ve diagnosed and resolved pipeline failures, implemented robust logging and alerting, and ensured recoverability in production environments. Discuss your approach to root cause analysis and continuous improvement of data systems.

Finally, demonstrate strong behavioral skills by preparing stories that show your adaptability, collaboration, and communication. Anticipate questions about handling ambiguity, managing competing priorities, and influencing stakeholders without formal authority. Be ready to explain how you’ve delivered results under pressure, negotiated scope, and built trust across teams—qualities highly valued at Expert Institute.

5. FAQs

5.1 How hard is the Expert Institute Data Engineer interview?
The Expert Institute Data Engineer interview is challenging but fair, focusing on your ability to design robust data pipelines, solve real-world ETL problems, and communicate technical concepts to both technical and non-technical stakeholders. If you have strong experience in data architecture, pipeline reliability, and cross-functional collaboration, you'll find the questions demanding but achievable. Preparation and confidence in your core skills are key to success.

5.2 How many interview rounds does Expert Institute have for Data Engineer?
Typically, there are 4–5 interview rounds: an initial resume/application screen, a recruiter phone interview, one or two technical/case interviews, a behavioral interview, and a final onsite or panel round. Each stage is designed to assess both your technical expertise and your ability to work effectively within Expert Institute’s collaborative environment.

5.3 Does Expert Institute ask for take-home assignments for Data Engineer?
Take-home assignments are occasionally used, especially for candidates who need to demonstrate hands-on skills in data pipeline design, ETL development, or data cleaning. These assignments often involve building a small pipeline, solving a practical data transformation problem, or analyzing a messy dataset and presenting actionable insights.

5.4 What skills are required for the Expert Institute Data Engineer?
Key skills include designing and building scalable data pipelines, expertise in ETL processes, advanced SQL and Python programming, experience with cloud data platforms, and a strong focus on data quality and security. Communication skills and the ability to collaborate with non-technical stakeholders are also highly valued.

5.5 How long does the Expert Institute Data Engineer hiring process take?
The typical timeline is 3–4 weeks from initial application to final offer, though fast-track candidates may complete the process in as little as 2 weeks. Each interview round generally takes about a week to schedule and complete, depending on candidate and team availability.

5.6 What types of questions are asked in the Expert Institute Data Engineer interview?
Expect a mix of technical questions on pipeline design, ETL challenges, data modeling, and system architecture, as well as scenario-based questions about data cleaning, quality assurance, and troubleshooting pipeline failures. Behavioral interviews focus on your collaboration skills, adaptability, and ability to communicate technical insights to diverse audiences.

5.7 Does Expert Institute give feedback after the Data Engineer interview?
Expert Institute typically provides feedback through recruiters, with high-level insights into your performance and fit for the role. While detailed technical feedback may be limited, you can expect constructive comments on your strengths and areas for improvement.

5.8 What is the acceptance rate for Expert Institute Data Engineer applicants?
The Data Engineer role at Expert Institute is competitive, with an estimated acceptance rate of 3–7% for qualified applicants. Candidates who demonstrate strong technical skills and clear alignment with company values have the best chance of success.

5.9 Does Expert Institute hire remote Data Engineer positions?
Yes, Expert Institute offers remote opportunities for Data Engineers, though some roles may require occasional onsite collaboration or attendance at team meetings. Flexibility depends on team needs and candidate location, but remote work is supported for qualified candidates.

6. Ready to Ace Your Expert Institute Data Engineer Interview?

Ready to ace your Expert Institute Data Engineer interview? It’s not just about knowing the technical skills—you need to think like an Expert Institute Data Engineer, solve problems under pressure, and connect your expertise to real business impact. That’s where Interview Query comes in with company-specific learning paths, mock interviews, and curated question banks tailored toward roles at Expert Institute and similar companies.

With resources like the Expert Institute Data Engineer Interview Guide and our latest case study practice sets, you’ll get access to real interview questions, detailed walkthroughs, and coaching support designed to boost both your technical skills and domain intuition.

Take the next step—explore more case study questions, try mock interviews, and browse targeted prep materials on Interview Query. Bookmark this guide or share it with peers prepping for similar roles. It could be the difference between just applying and landing the offer. You’ve got this!