Prepare for and practice interview questions from Genzeon Corporation.

Genzeon Corporation Interview Questions

Genzeon Corporation Interview Guides

Data Structures & Algorithms

Given an integer N, write a function that returns all of the prime numbers up to N

Prime to N

String Mapping

Swap Variables

Implement a dynamic recursive fibonacci function.

Impossibly Iterative Fibonacci

This question requires the implementation of the Fibonacci sequence using three different methods: recursively, iteratively, and using memoization.

Implementing the Fibonacci Sequence in Three Different Methods

Business Case

Declining Applicants

How to boost presence in high-demand city areas

Incentive Scheme

Decreasing Ride Costs

Statistics

P-value to a Layman

What is the difference between the Z and t tests?

Z and t-Tests

What is the difference between Logistic and Linear Regression?

Linear vs Logistic Regression

Select the 2nd highest salary in the engineering department

2nd Highest Salary

Write a query to get the number of friends of a user that like a specific page

Liked Pages

Size of Joins

Machine Learning

Justify a Neural Network

Bias vs. Variance Tradeoff

When should you consider using a Support Vector Machine rather than deep learning models?

Support Vector Machines vs Deep Learning Models

When an interviewer asks a question along the lines of:

<ul>
<li>What would your current manager say about you? What constructive criticisms might he give?</li>
<li>What are your three biggest strengths and weaknesses you have identified in yourself?</li>
</ul>

How would you respond?

When asked about your strengths in an interview, what is an effective way to respond?

Your Strengths and Weaknesses I

Which of the following is an acceptable strategy when discussing weaknesses in an interview?

Your Strengths and Weaknesses II

What do you tell an interviewer when they ask you what your strengths and weaknesses are?

Your Strengths and Weaknesses

Brainteasers

How would you explain what a p-value is to someone who is not technical?

What does a p-value in a statistical test represent?

P-value to a Layman I

In a statistical test, how does a low p-value (less than 0.05) influence our decision about the null hypothesis?

P-value to a Layman II

Given a string, write a function to determine whether it is a palindrome.

Note: A palindrome is a word/string that is read the same way forward as it is backward, e.g. <code>&#39;reviver&#39;</code>, <code>&#39;madam&#39;</code>, <code>&#39;deified&#39;</code> and <code>&#39;civic&#39;</code> are all palindromes, while <code>&#39;tree&#39;</code>, <code>&#39;music&#39;</code> and <code>&#39;person&#39;</code> are not palindromes.

Example:

Input:

<pre tabindex="0" class="chroma"><code>word1 = &#34;tree&#34;
word2 = &#34;radar&#34;
</code></pre>

Output:

<pre tabindex="0" class="chroma"><code>def is_palindrome(word1) -&gt; False
def is_palindrome(word2) -&gt; True
</code></pre>
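One possible approach, shown here as a minimal sketch, compares characters from both ends of the string and moves inward. It assumes the comparison is case-sensitive and that no characters are stripped, since the prompt does not say otherwise.

```python
def is_palindrome(word: str) -> bool:
    """Return True if `word` reads the same forward and backward."""
    # Two pointers: one at each end, moving toward the middle.
    left, right = 0, len(word) - 1
    while left < right:
        if word[left] != word[right]:
            return False  # mismatch found, not a palindrome
        left += 1
        right -= 1
    return True  # all mirrored pairs matched (also true for empty strings)
```

The slice comparison `word == word[::-1]` gives the same answer in one line, at the cost of building a reversed copy of the string.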

String Palindromes

Write a SQL query to select the 2nd highest salary in the engineering department.

Note: If more than one person shares the highest salary, the query should select the next highest salary.

Example:

Input:

<code>employees</code> table

<table>
<thead>
<tr>
<th>Column</th>
<th>Type</th>
</tr>
</thead>

<tbody>
<tr>
<td><code>id</code></td>
<td>INTEGER</td>
</tr>

<tr>
<td><code>first_name</code></td>
<td>VARCHAR</td>
</tr>

<tr>
<td><code>last_name</code></td>
<td>VARCHAR</td>
</tr>

<tr>
<td><code>salary</code></td>
<td>INTEGER</td>
</tr>

<tr>
<td><code>department_id</code></td>
<td>INTEGER</td>
</tr>
</tbody>
</table>
<code>departments</code> table

<table>
<thead>
<tr>
<th>Column</th>
<th>Type</th>
</tr>
</thead>

<tbody>
<tr>
<td><code>id</code></td>
<td>INTEGER</td>
</tr>

<tr>
<td><code>name</code></td>
<td>VARCHAR</td>
</tr>
</tbody>
</table>
Output:

<table>
<thead>
<tr>
<th>Column</th>
<th>Type</th>
</tr>
</thead>

<tbody>
<tr>
<td><code>salary</code></td>
<td>INTEGER</td>
</tr>
</tbody>
</table>
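A sketch of one way to answer this, run through Python's `sqlite3` module so it can be tried locally. The department name `'engineering'`, the sample rows, and the helper names `build_sample_db` and `second_highest_engineering_salary` are assumptions made for illustration. The query takes the largest salary that is strictly below the department maximum, so ties at the top are skipped.

```python
import sqlite3

# Largest engineering salary strictly below the engineering maximum.
QUERY = """
SELECT MAX(e.salary) AS salary
FROM employees AS e
JOIN departments AS d ON d.id = e.department_id
WHERE d.name = 'engineering'          -- assumed department name
  AND e.salary < (
      SELECT MAX(e2.salary)
      FROM employees AS e2
      JOIN departments AS d2 ON d2.id = e2.department_id
      WHERE d2.name = 'engineering'
  );
"""


def build_sample_db() -> sqlite3.Connection:
    """Create an in-memory database with made-up rows for the demo."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE departments (id INTEGER, name VARCHAR);
        CREATE TABLE employees (
            id INTEGER, first_name VARCHAR, last_name VARCHAR,
            salary INTEGER, department_id INTEGER
        );
        INSERT INTO departments VALUES (1, 'engineering'), (2, 'sales');
        INSERT INTO employees VALUES
            (1, 'A', 'One',   150000, 1),
            (2, 'B', 'Two',   150000, 1),   -- ties for the top salary
            (3, 'C', 'Three', 120000, 1),
            (4, 'D', 'Four',  200000, 2);   -- other department, ignored
    """)
    return conn


def second_highest_engineering_salary(conn: sqlite3.Connection):
    """Return the second-highest distinct salary, or None if there is none."""
    return conn.execute(QUERY).fetchone()[0]
```

If the department has only one distinct salary, `MAX` over the empty set yields `NULL`, which `sqlite3` returns as `None`.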

Imagine you are asked to build a machine learning model to decide new loan approvals for a
financial firm. You ask the data department in the company for a subset of data to get started
working on the problem. The data includes different features about applicants such as age,
occupation, zip code, height, number of children, favorite color, etc. You decide to build
multiple machine learning models to test out different ideas before settling on the best one.

How would you explain the bias-variance tradeoff with regard to
building and choosing a model to use?

What is the difference between Logistic and Linear Regression?

When would you use one instead of the other in practice?

Let’s say you work at Allstate. Allstate is running <code>N</code> online ads right now. The table <code>ads</code> contains all those ads, ranked by popularity via the <code>id</code> column (e.g., the entry with <code>id = 1</code> is the most popular, etc.).

Create a subquery or common table expression named <code>top_ads</code> containing the top 3 ads (by popularity) and return the number of rows that would result from the following operations:

<ol>
<li><code>ads INNER JOIN top_ads</code></li>
<li><code>ads LEFT JOIN top_ads</code></li>
<li><code>ads RIGHT JOIN top_ads</code></li>
<li><code>ads CROSS JOIN top_ads</code></li>
</ol>

Note: Please make the <code>join_type</code> column in your output have the values <code>inner_join</code>, <code>left_join</code>, etc. for each of their respective join types

Note: Please return only one query with each number in a different row

Example:

Input:

<table>
<thead>
<tr>
<th>Column</th>
<th>Type</th>
</tr>
</thead>

<tbody>
<tr>
<td><code>id</code></td>
<td>INTEGER</td>
</tr>

<tr>
<td><code>name</code></td>
<td>VARCHAR</td>
</tr>
</tbody>
</table>
Output:

<table>
<thead>
<tr>
<th>Column</th>
<th>Type</th>
</tr>
</thead>

<tbody>
<tr>
<td><code>join_type</code></td>
<td>VARCHAR</td>
</tr>

<tr>
<td><code>number_of_rows</code></td>
<td>INTEGER</td>
</tr>
</tbody>
</table>
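The following sketch runs one candidate query through Python's `sqlite3` module. The prompt does not state a join condition, so joining on `id` is an assumption. The right join is written as a left join with the tables swapped, which returns the same row count and avoids relying on `RIGHT JOIN` support in older SQLite versions. The helper name `join_sizes` and the generated ad names are illustrative only.

```python
import sqlite3

QUERY = """
WITH top_ads AS (
    SELECT id, name FROM ads ORDER BY id LIMIT 3   -- lowest ids = most popular
)
SELECT 'inner_join' AS join_type, COUNT(*) AS number_of_rows
FROM ads INNER JOIN top_ads ON ads.id = top_ads.id
UNION ALL
SELECT 'left_join', COUNT(*)
FROM ads LEFT JOIN top_ads ON ads.id = top_ads.id
UNION ALL
-- ads RIGHT JOIN top_ads, expressed as top_ads LEFT JOIN ads
SELECT 'right_join', COUNT(*)
FROM top_ads LEFT JOIN ads ON ads.id = top_ads.id
UNION ALL
SELECT 'cross_join', COUNT(*)
FROM ads CROSS JOIN top_ads;
"""


def join_sizes(n_ads: int) -> dict:
    """Build an `ads` table with `n_ads` rows and return {join_type: row count}."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE ads (id INTEGER, name VARCHAR)")
    conn.executemany(
        "INSERT INTO ads VALUES (?, ?)",
        [(i, f"ad_{i}") for i in range(1, n_ads + 1)],
    )
    return dict(conn.execute(QUERY).fetchall())
```

Under these assumptions, and with unique ids and at least three ads, the inner and right joins return 3 rows, the left join returns `N`, and the cross join returns `3 * N`.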

Given an integer <code>N</code>, write a function that returns a list of all of the prime numbers up to <code>N</code>.

Note: Return an empty list if there are no prime numbers less than or equal to <code>N</code>.

Example:

Input:

<pre tabindex="0" class="chroma"><code>N = 3
</code></pre>

Output:

<pre tabindex="0" class="chroma"><code>def prime_numbers(N) -&gt; [2,3]
</code></pre>
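One common way to do this is the Sieve of Eratosthenes. The sketch below is one possible solution rather than the expected one; trial division would also work for small `N`.

```python
def prime_numbers(N: int) -> list:
    """Return all primes less than or equal to N, in increasing order."""
    if N < 2:
        return []  # there are no primes below 2
    # is_prime[k] stays True until k is found to be a multiple of a smaller prime.
    is_prime = [True] * (N + 1)
    is_prime[0] = is_prime[1] = False
    p = 2
    while p * p <= N:
        if is_prime[p]:
            # Smaller multiples of p were already crossed out by smaller primes.
            for multiple in range(p * p, N + 1, p):
                is_prime[multiple] = False
        p += 1
    return [n for n, flag in enumerate(is_prime) if flag]
```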

The Fibonacci sequence is a series of numbers in which each number is the sum of the two preceding ones, usually starting with 0 and 1. It is often used in algorithm examples, and is defined by the following formula: F(n) = F(n-1) + F(n-2), with F(0) = 0 and F(1) = 1.

Your task is to implement the Fibonacci algorithm in three different methods:
1. Recursively
2. Iteratively
3. Using Memoization

Example 1:

Input:

<pre tabindex="0" class="chroma"><code>n = 5
</code></pre>

Output:

<pre tabindex="0" class="chroma"><code>fibonacci(n) -&gt; 5
</code></pre>

Example 2:

Input:

<pre tabindex="0" class="chroma"><code>n = 10
</code></pre>

Output:

<pre tabindex="0" class="chroma"><code>fibonacci(n) -&gt; 55
</code></pre>

The Fibonacci sequence starts as follows: 0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55…
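The three methods can be sketched as follows, using the definition above with F(0) = 0 and F(1) = 1. The function names are illustrative, and the code assumes `n` is a non-negative integer.

```python
def fib_recursive(n: int) -> int:
    """Plain recursion: simple, but exponential time due to repeated subproblems."""
    if n < 2:
        return n
    return fib_recursive(n - 1) + fib_recursive(n - 2)


def fib_iterative(n: int) -> int:
    """Iteration: keep only the last two values; linear time, constant space."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def fib_memo(n: int, cache=None) -> int:
    """Recursion with memoization: each F(k) is computed once and stored."""
    if cache is None:
        cache = {0: 0, 1: 1}
    if n not in cache:
        cache[n] = fib_memo(n - 1, cache) + fib_memo(n - 2, cache)
    return cache[n]
```

All three return 5 for `n = 5` and 55 for `n = 10`. The recursive variants are bounded by Python's recursion limit for large `n`.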

Let’s say you’re a data engineer at Fidelity Investments, and you’re running a SQL query on a cloud-based data warehouse. All cluster resources and network health metrics look normal, but the query is still taking over 10 minutes to complete.

How would you go about diagnosing and improving the performance of this query?

How would you diagnose and speed up a slow SQL query when system metrics look healthy?

Slow SQL Query

Query Optimization

You are given a dictionary with two keys <code>a</code> and <code>b</code> that hold integers as their values.

Without declaring any other variable, swap the value of <code>a</code> with the value of <code>b</code> and vice versa.

Note: Return the dictionary after editing it.

Example:

Input:

<pre tabindex="0" class="chroma"><code>numbers = {
 &#39;a&#39;:3,
 &#39;b&#39;:4
}
</code></pre>

Output:

<pre tabindex="0" class="chroma"><code>def swap_values(numbers) -&gt; {&#39;a&#39;:4,&#39;b&#39;:3}
</code></pre>
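In Python, tuple assignment evaluates the right-hand side before assigning, so the swap needs no extra named variable. A minimal sketch:

```python
def swap_values(numbers: dict) -> dict:
    """Swap the values stored under 'a' and 'b' in place and return the dict."""
    # The right-hand side is evaluated first, then unpacked into both keys.
    numbers['a'], numbers['b'] = numbers['b'], numbers['a']
    return numbers
```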

Write a function <code>fib</code> which takes an integer <code>n</code> and returns the nth Fibonacci number. For this question, <code>fib</code> MUST be recursively defined.

Note: The <code>for</code> and <code>while</code> keywords are disabled for this question, as is <code>functools</code>. Any workaround for this restriction is prohibited.

Example:

Input:

<pre tabindex="0" class="chroma"><code>n = 4
</code></pre>

Output:

<pre tabindex="0" class="chroma"><code>def fib(n) -&gt; 3
# the sequence is [1, 1, 2, 3, 5, 8...]
</code></pre>
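One way to stay recursive while avoiding exponential blow-up is to carry the two most recent values as arguments. This is a sketch under the assumption that extra default parameters are allowed; the parameters `a` and `b` are not part of the prompt.

```python
def fib(n: int, a: int = 0, b: int = 1) -> int:
    """Return the nth Fibonacci number using accumulator-style recursion.

    `a` holds F(k) and `b` holds F(k + 1) for the current step k;
    each call advances k by one until n steps have been taken.
    """
    if n == 0:
        return a
    return fib(n - 1, b, a + b)
```

With this indexing, `fib(4)` is 3, matching the example. Each call does constant work, so the running time is linear in `n`, though the depth is still limited by Python's recursion limit.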

Let’s say you’re a data scientist at LinkedIn where you’re working on a product that sends qualified job candidates to companies. The team has launched a new feature that allows candidates to message hiring managers at companies directly during the interview process to get updates on their status.

Due to engineering constraints, the company can’t AB test the feature before launching it.

How would you analyze how the feature is performing?

Recruiting Leads

Analytics

We want to determine if Uber Eats has a net positive value for the company Uber.

How would you measure the success of Uber Eats?

If Uber wants to evaluate the customer loyalty and repeat business on Uber Eats, which of the following metrics would be the most relevant?

Uber Eats Success I

Which among the following metrics would be most indicative of the operational efficiency of Uber Eats?

Uber Eats Success II

Which among the following metrics would be most critical in evaluating the overall profitability of Uber Eats for Uber?

Uber Eats Success III

Uber Eats Success

Product Sense & Metrics

Let’s say your manager asks you to build a model with a neural network to solve a business problem.

How would you justify the complexity of building such a model and explain the predictions to non-technical stakeholders?

Which of the following domains is generally NOT a good use case for a deep neural net?

Neural Net Use Cases

What are the Z and t-tests? What are they used for? What is the difference between them? When should you use one over the other?

Which statistical test, the Z or t test, is more appropriate for tests with small sample sizes?

Z and t Tests I

What happens to the t-distribution as the degrees of freedom (ν) get large?

Z and t Tests II

Let’s say that you’re looking at the metrics of a job board. You see that the number of job postings per day has more or less remained the same the last few months but the number of applicants that have applied to jobs has steadily been decreasing.

Why would this be happening?

To improve customer experience on Uber Eats, what key parameters would you focus on improving?

Delivering an exceptional customer experience by focusing on key customer-centric parameters

Uber Eats Customer Experience

<ol>
<li>Design a database for a stand-alone fast food restaurant. </li>

<li>Based on the above database schema, write a SQL query to find the top three highest revenue items sold yesterday. </li>

<li>Write a SQL query using the database schema to find the percentage of customers that order drinks with their meal.</li>
</ol>

Fast Food Database

Data Modeling

Deep learning models are popular but have some drawbacks in that they’re expensive to train and maintain.

In some contexts, simpler models may make more sense. One possible alternative for classification problems is the support vector machine (SVM).

<ul>
<li>When are SVMs preferable to deep learning models?</li>

<li>What are the pros and cons of using an SVM compared to deep or non-deep learning classification models (such as logistic regression)?</li>
</ul>

Given two strings, <code>string1</code> and <code>string2</code>, write a function <code>str_map</code> to determine if there exists a one-to-one correspondence (bijection) between the characters of <code>string1</code> and <code>string2</code>.

For the two strings, our correspondence must be between characters in the same position/index.

Example 1:

Input:

<pre tabindex="0" class="chroma"><code>string1 = &#39;qwe&#39;
string2 = &#39;asd&#39;

string_map(string1, string2) == True

# q = a, w = s, and e = d
</code></pre>

Example 2:

Input:

<pre tabindex="0" class="chroma"><code>string1 = &#39;donut&#39;
string2 = &#39;fatty&#39;

string_map(string1, string2) == False
# cannot map two distinct characters to two equal characters
</code></pre>

Example 3:

Input:

<pre tabindex="0" class="chroma"><code>string1 = &#39;enemy&#39;
string2 = &#39;enemy&#39;

string_map(string1, string2) == True
# there exists a one-to-one correspondence between equivalent strings
</code></pre>

Example 4:

Input:

<pre tabindex="0" class="chroma"><code>string1 = &#39;enemy&#39;
string2 = &#39;ymene&#39;

string_map(string1, string2) == False
# since our correspondence must be between characters of the same index, this case returns &#39;False&#39; as we must map e = y AND e = e
</code></pre>
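The check can be sketched with two dictionaries, one per direction, so that no character on either side is paired with two different partners. The prompt names the function `str_map` while the examples call `string_map`, so the sketch defines the first and aliases the second. Returning `False` for strings of different lengths is an assumption.

```python
def str_map(string1: str, string2: str) -> bool:
    """Return True if position-wise pairing of characters is a bijection."""
    if len(string1) != len(string2):
        return False  # assumed: unequal lengths cannot correspond
    forward, backward = {}, {}
    for c1, c2 in zip(string1, string2):
        # setdefault records the first pairing and returns it on later visits.
        if forward.setdefault(c1, c2) != c2:
            return False  # c1 would map to two different characters
        if backward.setdefault(c2, c1) != c1:
            return False  # two different characters would map to c2
    return True


# Alias matching the name used in the examples.
string_map = str_map
```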

Let’s say that you’re working on the Uber app. 

How would you design an incentive scheme for drivers such that they would more likely go into city areas where demand is high?

We have two concentric circles a and b with radii <code>r_a</code> and <code>r_b</code> respectively, where <code>r_b</code> &gt; <code>r_a</code>.

A third circle c has radius <code>r_c</code> and center point <code>center_c</code>.

Write a function <code>is_contained(r_a,r_b, r_c, center_c)</code> which returns <code>True</code> if the circle c occupies the space between circle a and b. Otherwise, return <code>False</code>.

Note: the center point of <code>a</code> and <code>b</code> is <code>(0,0)</code>

As an example, only <code>c_2</code> meets the requirement in this image as it lies between <code>a</code> and <code>b</code>.

<img src="https://interviewquery-cms-images.s3-us-west-1.amazonaws.com/1e3f82b2-7025-4de6-9309-48fa12d302e8.png" alt="image"/>

Example 1:

Input:

<pre tabindex="0" class="chroma"><code>r_a = 3
r_b = 6
r_c = 1
center_c = (4,0)
</code></pre>

Output:

<pre tabindex="0" class="chroma"><code>is_contained(r_a,r_b,r_c,center_c) -&gt; True
</code></pre>

<img src="https://interviewquery-cms-images.s3-us-west-1.amazonaws.com/e596781f-edca-43f0-9c5a-abcf17096311.png" alt="image"/>

Example 2:

Input:

<pre tabindex="0" class="chroma"><code>r_a = 3
r_b = 6
r_c = 1
center_c = (6,0)
</code></pre>

Output:

<pre tabindex="0" class="chroma"><code>is_contained(r_a,r_b,r_c,center_c) -&gt; False
</code></pre>
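A geometric sketch of the check: circle c fits in the ring when its nearest point to the origin is at least `r_a` away and its farthest point is at most `r_b` away. Treating a circle that touches a boundary as contained is an assumption, chosen because Example 1 returns `True` with c tangent to a.

```python
import math


def is_contained(r_a: float, r_b: float, r_c: float, center_c: tuple) -> bool:
    """Return True if circle c lies entirely between circles a and b."""
    # Distance from the shared center (0, 0) of a and b to the center of c.
    distance = math.hypot(center_c[0], center_c[1])
    outside_a = distance - r_c >= r_a  # nearest point of c clears circle a
    inside_b = distance + r_c <= r_b   # farthest point of c stays within b
    return outside_a and inside_b
```

For non-integer inputs, exact comparisons can be sensitive to floating-point rounding at the tangent case; a small tolerance could be added if that matters.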

<img src="https://interviewquery-cms-images.s3-us-west-1.amazonaws.com/127cc272-72d1-4af8-baf0-921f189afdef.png" alt="image"/>

Write a function to determine if a circle is contained between two other circles.

Concentric Circles

Let’s say we want to build a naive recommender. We’re given two tables, one table called <code>friends</code> with <code>user_id</code> and <code>friend_id</code> columns representing each user’s friends, and another table called <code>page_likes</code> with <code>user_id</code> and <code>page_id</code> columns representing the pages each user liked.

Write an SQL query to create a metric to recommend pages for each user based on their friends’ liked pages.

Note: It shouldn’t recommend pages that the user already likes.

Example:

Input:

<code>friends</code> table

<table>
<thead>
<tr>
<th>Column</th>
<th>Type</th>
</tr>
</thead>

<tbody>
<tr>
<td><code>user_id</code></td>
<td>INTEGER</td>
</tr>

<tr>
<td><code>friend_id</code></td>
<td>INTEGER</td>
</tr>
</tbody>
</table>
<code>page_likes</code> table

<table>
<thead>
<tr>
<th>Column</th>
<th>Type</th>
</tr>
</thead>

<tbody>
<tr>
<td><code>user_id</code></td>
<td>INTEGER</td>
</tr>

<tr>
<td><code>page_id</code></td>
<td>INTEGER</td>
</tr>
</tbody>
</table>
Output:

<table>
<thead>
<tr>
<th>Column</th>
<th>Type</th>
</tr>
</thead>

<tbody>
<tr>
<td><code>user_id</code></td>
<td>INTEGER</td>
</tr>

<tr>
<td><code>page_id</code></td>
<td>INTEGER</td>
</tr>

<tr>
<td><code>num_friend_likes</code></td>
<td>INTEGER</td>
</tr>
</tbody>
</table>
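One possible query, run here through Python's `sqlite3` module, counts how many of a user's friends like each page and filters out pages the user already likes with an anti-join. The helper name `recommend_pages` is illustrative, and the sketch assumes each friendship appears as a row in the direction `user_id` to `friend_id`.

```python
import sqlite3

QUERY = """
SELECT f.user_id,
       pl.page_id,
       COUNT(DISTINCT f.friend_id) AS num_friend_likes
FROM friends AS f
JOIN page_likes AS pl
  ON pl.user_id = f.friend_id              -- pages liked by each friend
LEFT JOIN page_likes AS own
  ON own.user_id = f.user_id
 AND own.page_id = pl.page_id              -- does the user already like it?
WHERE own.page_id IS NULL                  -- keep only pages not yet liked
GROUP BY f.user_id, pl.page_id
ORDER BY f.user_id, num_friend_likes DESC, pl.page_id;
"""


def recommend_pages(friend_rows, like_rows):
    """Load the two tables into an in-memory database and run the query."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE friends (user_id INTEGER, friend_id INTEGER)")
    conn.execute("CREATE TABLE page_likes (user_id INTEGER, page_id INTEGER)")
    conn.executemany("INSERT INTO friends VALUES (?, ?)", friend_rows)
    conn.executemany("INSERT INTO page_likes VALUES (?, ?)", like_rows)
    return conn.execute(QUERY).fetchall()
```

For example, if user 1 is friends with users 2 and 3, both of whom like page 20, and user 1 does not, the result includes the row `(1, 20, 2)`.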

Let’s say you work at Lyft or Uber.

How would you evaluate the potential impact of decreasing fees for riders and drivers across the app for growth prospects?

An online marketplace company has introduced a new feature that allows potential buyers and sellers to conduct audio chats with each other prior to transacting.

Let’s say we have two tables that represent this data.

Example:

Input:

<code>chats</code> table

<table>
<thead>
<tr>
<th>Column</th>
<th>Type</th>
</tr>
</thead>

<tbody>
<tr>
<td><code>id</code></td>
<td>INTEGER</td>
</tr>

<tr>
<td><code>buyer_user_id</code></td>
<td>INTEGER</td>
</tr>

<tr>
<td><code>seller_user_id</code></td>
<td>INTEGER</td>
</tr>

<tr>
<td><code>call_length</code></td>
<td>INTEGER</td>
</tr>

<tr>
<td><code>call_connected</code></td>
<td>INTEGER</td>
</tr>
</tbody>
</table>
<code>marketplace_purchases</code> table

<table>
<thead>
<tr>
<th>Column</th>
<th>Type</th>
</tr>
</thead>

<tbody>
<tr>
<td><code>id</code></td>
<td>INTEGER</td>
</tr>

<tr>
<td><code>buyer_user_id</code></td>
<td>INTEGER</td>
</tr>

<tr>
<td><code>seller_user_id</code></td>
<td>INTEGER</td>
</tr>

<tr>
<td><code>item_id</code></td>
<td>INTEGER</td>
</tr>

<tr>
<td><code>purchase_amount</code></td>
<td>FLOAT</td>
</tr>
</tbody>
</table>

<ol>
<li>How would you measure the success of this new feature?</li>
<li>Write a query that can represent if the feature is successful or not.</li>
</ol>
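As one hedged illustration of the second part, the sketch below compares the purchase rate of buyer and seller pairs whose call connected against pairs whose call never connected. It assumes `call_connected` is stored as 1 or 0, and the helper name `chat_purchase_rates` is made up for the example. Because neither table has a timestamp, the query cannot confirm that a purchase followed the call, so the comparison is descriptive rather than causal.

```python
import sqlite3

QUERY = """
WITH pairs AS (
    -- One row per buyer/seller pair; connected = 1 if any call connected.
    SELECT buyer_user_id, seller_user_id, MAX(call_connected) AS connected
    FROM chats
    GROUP BY buyer_user_id, seller_user_id
),
bought AS (
    SELECT DISTINCT buyer_user_id, seller_user_id
    FROM marketplace_purchases
)
SELECT p.connected,
       COUNT(*) AS num_pairs,
       AVG(CASE WHEN b.buyer_user_id IS NULL THEN 0.0 ELSE 1.0 END)
           AS purchase_rate
FROM pairs AS p
LEFT JOIN bought AS b
  ON b.buyer_user_id = p.buyer_user_id
 AND b.seller_user_id = p.seller_user_id
GROUP BY p.connected
ORDER BY p.connected;
"""


def chat_purchase_rates(chat_rows, purchase_rows):
    """Return (connected, num_pairs, purchase_rate) rows for the given data."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE chats (id INTEGER, buyer_user_id INTEGER, "
        "seller_user_id INTEGER, call_length INTEGER, call_connected INTEGER)"
    )
    conn.execute(
        "CREATE TABLE marketplace_purchases (id INTEGER, buyer_user_id INTEGER, "
        "seller_user_id INTEGER, item_id INTEGER, purchase_amount FLOAT)"
    )
    conn.executemany("INSERT INTO chats VALUES (?, ?, ?, ?, ?)", chat_rows)
    conn.executemany(
        "INSERT INTO marketplace_purchases VALUES (?, ?, ?, ?, ?)", purchase_rows
    )
    return conn.execute(QUERY).fetchall()
```

A noticeably higher purchase rate among connected pairs would be consistent with the feature helping, although buyers who place calls may already be more likely to purchase.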

How would you measure the success of an online marketplace introducing an audio chat feature given a dataset of their usage?

Calculate Moving Average	SQL	Easy
Predict Customer Churn	Machine Learning	Medium
A/B Test Significance	Statistics	Medium
Optimize Query Performance	SQL	Hard
Feature Importance Analysis	Machine Learning	Medium
Clean Missing Data	Python	Easy
Neural Network Architecture	Deep Learning	Hard
Calculate Cohort Retention	SQL	Medium
Bayesian Probability	Statistics	Easy
Recommend Similar Products	Machine Learning	Hard
