Prepare for and practice interview questions from Bigbear.ai.

Bigbear.ai Interview Questions

Bigbear.ai Interview Guides

Data Structures & Algorithms

Given two sorted lists, write a function to merge them into one sorted list.

Merge Sorted Lists

Find Bigrams

Write a function to create a single dataframe with complete addresses in the format of street, city, state, zip code.

Complete Addresses

Write a function to return the cumulative percentage of students that received scores within certain buckets.

Bucket Test Scores

String Shift

Probability

First to Six

500 Cards

Raining in Seattle

Found Item

Ad Raters

Machine Learning

Coefficients of Logistic Regression

Job Recommendation

RMS Error

Lasso vs Ridge

Why would one algorithm generate different success rates with the same dataset?

Same Algorithm Different Success

A/B Testing

Experiment Validity

Delivery Estimate Model

Button AB Test

A/B Test Power Size

Network Experiment Design

Statistics

P-value to a Layman

Reducing Error Margin

What are the assumptions of linear regression?

Assumptions of Linear Regression

Possibly Biased Coin

Survey Response Randomness

<p>When an interviewer asks a question along the lines of:</p>

<ul>
<li>What would your current manager say about you? What constructive criticisms might they give?</li>
<li>What are your three biggest strengths and weaknesses you have identified in yourself?</li>
</ul>

<p>How would you respond?</p>


<p>When asked about your strengths in an interview, what is an effective way to respond?</p>

Your Strengths and Weaknesses I

<p>Which of the following is an acceptable strategy when discussing weaknesses in an interview?</p>

Your Strengths and Weaknesses II

What do you tell an interviewer when they ask you what your strengths and weaknesses are?

Your Strengths and Weaknesses

Brainteasers

<p>When an interviewer asks you a question along the lines of:</p>

<ul>
<li>Why did you apply to our company?</li>
<li>What are you looking for in your next job?</li>
<li>What makes you a good fit for our company?</li>
</ul>

<p>How should you respond?</p>


<p>When asked 'What are you looking for in your next job?' in an interview, how can you tie the company's employee benefits into your response?</p>

Why Do You Want to Work With Us I

<p>How can company values be used effectively in an interview when asked 'What makes you a good fit for our company?'</p>

Why Do You Want to Work With Us II

<p>When responding to the question 'Why did you apply to our company?' during an interview, what aspect should you highlight?</p>

Why Do You Want to Work With Us III

How would you answer when an interviewer asks why you applied to their company?

Why Do You Want to Work With Us

<p>Describe a data project you worked on. What were some of the challenges you faced?</p>


Describing a data project and its challenges

Hurdles In Data Projects

Analytics

<p>How would you explain what a p-value is to someone who is not technical?</p>


<p>What does a p-value in a statistical test represent?</p>

P-value to a Layman I

<p>In a statistical test, how does a low p-value (less than 0.05) influence our decision about the null hypothesis?</p>

P-value to a Layman II

<p>What are the assumptions of linear regression?</p>


Which assumption about the residuals of the standard linear regression model cannot be overcome by increasing the sample size?

Regression assumptions

<p>Let’s say that you’re training a classification model.</p>

<p>How would you combat overfitting when building tree-based models?</p>


Overfit Avoidance

<p>How would you handle the data preparation for building a machine learning model using imbalanced data?</p>


Addressing imbalanced data in machine learning through carefully prepared techniques.

Data Preparation for Imbalanced Data

<p>Write a SQL query to select the 2nd highest salary in the engineering department.</p>

<p><strong>Note:</strong> If more than one person shares the highest salary, the query should select the next highest salary.</p>

<p><strong>Example:</strong></p>

<p><strong>Input:</strong></p>

<p><code>employees</code> table</p>

<table>
<thead>
<tr>
<th>Column</th>
<th>Type</th>
</tr>
</thead>

<tbody>
<tr>
<td><code>id</code></td>
<td>INTEGER</td>
</tr>

<tr>
<td><code>first_name</code></td>
<td>VARCHAR</td>
</tr>

<tr>
<td><code>last_name</code></td>
<td>VARCHAR</td>
</tr>

<tr>
<td><code>salary</code></td>
<td>INTEGER</td>
</tr>

<tr>
<td><code>department_id</code></td>
<td>INTEGER</td>
</tr>
</tbody>
</table>
<p><code>departments</code> table</p>

<table>
<thead>
<tr>
<th>Column</th>
<th>Type</th>
</tr>
</thead>

<tbody>
<tr>
<td><code>id</code></td>
<td>INTEGER</td>
</tr>

<tr>
<td><code>name</code></td>
<td>VARCHAR</td>
</tr>
</tbody>
</table>
<p><strong>Output:</strong></p>

<table>
<thead>
<tr>
<th>Column</th>
<th>Type</th>
</tr>
</thead>

<tbody>
<tr>
<td><code>salary</code></td>
<td>INTEGER</td>
</tr>
</tbody>
</table>
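
<p>One way to approach this can be sketched with Python's built-in <code>sqlite3</code> module. This is a minimal sketch, not a reference answer: the department name <code>'engineering'</code>, the helper names, and the demo rows are all assumptions made for illustration. <code>DISTINCT</code> collapses ties at the top so that the second row is the next distinct salary.</p>

```python
import sqlite3

# Assumption: the department is stored under the name 'engineering'.
SECOND_HIGHEST_QUERY = """
SELECT DISTINCT e.salary
FROM employees AS e
JOIN departments AS d ON d.id = e.department_id
WHERE d.name = 'engineering'
ORDER BY e.salary DESC
LIMIT 1 OFFSET 1
"""


def build_demo_db():
    """Create an in-memory database with made-up rows (illustration only)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE departments (id INTEGER PRIMARY KEY, name VARCHAR);
        CREATE TABLE employees (
            id INTEGER PRIMARY KEY, first_name VARCHAR, last_name VARCHAR,
            salary INTEGER, department_id INTEGER
        );
        INSERT INTO departments VALUES (1, 'engineering'), (2, 'marketing');
        INSERT INTO employees VALUES
            (1, 'Ann', 'Lee', 120000, 1),
            (2, 'Bob', 'Kim', 120000, 1),
            (3, 'Cy', 'Roe', 100000, 1),
            (4, 'Di', 'Fox', 90000, 1),
            (5, 'Ed', 'Ray', 150000, 2);
    """)
    return conn


def second_highest_engineering_salary(conn):
    """Return the second-highest distinct salary, or None if there is none."""
    row = conn.execute(SECOND_HIGHEST_QUERY).fetchone()
    return row[0] if row else None
```

<p>In the demo data two engineers share the top salary of 120000, so the function returns 100000.</p>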


Select the 2nd highest salary in the engineering department

2nd Highest Salary

<p>Let’s say that you work at a B2B SaaS company that’s interested in testing the pricing of different levels of subscriptions.</p>

<p>Your project manager comes to you and asks you to run a two-week-long A/B test to test an increase in pricing.</p>

<p>How would you approach designing this test?  How would you determine whether the increase in pricing is a good business decision?</p>


Testing Price Increase

<p>A team wants to A/B test multiple different changes through a sign-up funnel.</p>

<p>For example, on a page, a button is currently red and at the top of the page. They want to see if changing a button from red to blue and/or from the top of the page to the bottom of the page will increase click-through.</p>

<p>How would you set up this test?</p>


<p>Your team wants to run an AB test on the color (red or blue) and position (top/bottom of the page) of a button that links to a promotion.</p>

<p>How many different variants of the test should you run to see the effect that a red button has on the bottom of the page, if currently the button is blue and on the top of the page?</p>


A/B Test Variants

<p>Let’s say we want to launch a re-design of a landing page to improve the click-through rate. We can do this by implementing an AB test.
Given that we launch an AB test, how would you infer if the results of the click-through rate were statistically significant or not?</p>
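
<p>One common way to check significance for click-through rates is a two-proportion z-test. The sketch below uses only the Python standard library; the function name and its parameters are hypothetical, and it assumes independent users and samples large enough for the normal approximation to hold.</p>

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(clicks_a, views_a, clicks_b, views_b):
    """Two-sided z-test for a difference in click-through rates.

    Returns (z, p_value). Assumes independent observations and large samples.
    """
    p_a = clicks_a / views_a
    p_b = clicks_b / views_b
    # Pooled rate under the null hypothesis that both pages perform the same.
    pooled = (clicks_a + clicks_b) / (views_a + views_b)
    standard_error = sqrt(pooled * (1 - pooled) * (1 / views_a + 1 / views_b))
    z = (p_b - p_a) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value
```

<p>A result is usually called statistically significant when <code>p_value</code> falls below a threshold chosen before the test starts, for example 0.05.</p>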


<p>How can you accurately conclude if the results of an A/B test, conducted to evaluate the effectiveness of a landing page redesign, are statistically significant?</p>


Statistically Significant Test

Precisely ascertain whether the outcomes of an A/B test, executed to assess the impact of a landing page redesign, exhibit statistical significance.

<p>Let’s say that your company is running a standard control and variant AB test on a feature to increase conversion rates on the landing page. The PM checks the results and finds a p-value of 0.04.</p>

<p>How would you assess the validity of the result?</p>


<p>Let’s say you work at Allstate. Allstate is running <code>N</code>  online ads right now. The table <code>ads</code> contains all those ads, ranked by popularity via the <code>id</code> column (e.g., the entry with <code>id = 1</code> is the most popular, etc.).</p>

<p>Create a subquery or common table expression named <code>top_ads</code> containing the top 3 ads (by popularity) and return the number of rows that would result from the following operations:</p>

<ol>
<li><code>ads INNER JOIN top_ads</code></li>
<li><code>ads LEFT JOIN top_ads</code></li>
<li><code>ads RIGHT JOIN top_ads</code></li>
<li><code>ads CROSS JOIN top_ads</code></li>
</ol>

<p><em>Note: Please make the <code>join_type</code> column in your output have the values <code>inner_join</code>, <code>left_join</code>, etc. for each of their respective join types</em></p>

<p><em>Note: Please return only one query with each number in a different row</em></p>

<p><strong>Example:</strong></p>

<p><strong>Input:</strong></p>

<table>
<thead>
<tr>
<th>Column</th>
<th>Type</th>
</tr>
</thead>

<tbody>
<tr>
<td><code>id</code></td>
<td>INTEGER</td>
</tr>

<tr>
<td><code>name</code></td>
<td>VARCHAR</td>
</tr>
</tbody>
</table>
<p><strong>Output:</strong></p>

<table>
<thead>
<tr>
<th>Column</th>
<th>Type</th>
</tr>
</thead>

<tbody>
<tr>
<td><code>join_type</code></td>
<td>VARCHAR</td>
</tr>

<tr>
<td><code>number_of_rows</code></td>
<td>INTEGER</td>
</tr>
</tbody>
</table>
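
<p>The row counts can be explored with a small <code>sqlite3</code> sketch. Several things here are assumptions rather than part of the question: the joins are taken to be on <code>id</code>, the helper name <code>join_sizes</code> and the generated ad names are made up, and the right join is emulated by swapping the two sides of a left join because some SQLite versions do not support <code>RIGHT JOIN</code>.</p>

```python
import sqlite3

JOIN_SIZES_QUERY = """
WITH top_ads AS (
    SELECT id, name FROM ads ORDER BY id LIMIT 3
)
SELECT 'inner_join' AS join_type, COUNT(*) AS number_of_rows
FROM ads INNER JOIN top_ads ON ads.id = top_ads.id
UNION ALL
SELECT 'left_join', COUNT(*)
FROM ads LEFT JOIN top_ads ON ads.id = top_ads.id
UNION ALL
-- Emulates ads RIGHT JOIN top_ads by swapping the sides of a LEFT JOIN.
SELECT 'right_join', COUNT(*)
FROM top_ads LEFT JOIN ads ON ads.id = top_ads.id
UNION ALL
SELECT 'cross_join', COUNT(*)
FROM ads CROSS JOIN top_ads
"""


def join_sizes(n_ads):
    """Build a demo ads table with n_ads rows and return {join_type: rows}."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE ads (id INTEGER PRIMARY KEY, name VARCHAR)")
    conn.executemany(
        "INSERT INTO ads VALUES (?, ?)",
        [(i, f"ad_{i}") for i in range(1, n_ads + 1)],
    )
    return dict(conn.execute(JOIN_SIZES_QUERY).fetchall())
```

<p>Under these assumptions and with at least 3 ads, the inner and right joins return 3 rows, the left join returns <code>N</code> rows, and the cross join returns <code>3 * N</code> rows.</p>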


Size of Joins

<p>Regularization and cross-validation are two common techniques used to improve the performance of machine learning algorithms.</p>

<p>When should you use one versus the other?</p>


<p>Given the choices below, which one is an example of a scenario where we should be using regularization?</p>


Regularization Example

Regularization and Validation

<p>What’s the difference between Lasso and Ridge Regression?</p>
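
<p>For reference, the two methods differ only in the penalty added to the least-squares objective. The notation below (design matrix <code>X</code>, response <code>y</code>, coefficients <code>β</code>, penalty strength <code>λ ≥ 0</code>) is an assumption introduced here for illustration:</p>

```latex
\hat{\beta}_{\text{ridge}} = \arg\min_{\beta} \; \lVert y - X\beta \rVert_2^2 + \lambda \lVert \beta \rVert_2^2
\qquad
\hat{\beta}_{\text{lasso}} = \arg\min_{\beta} \; \lVert y - X\beta \rVert_2^2 + \lambda \lVert \beta \rVert_1
```

<p>The L1 penalty in Lasso can shrink some coefficients exactly to zero, while the L2 penalty in Ridge shrinks coefficients toward zero without generally eliminating them.</p>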


<p>Under what circumstances is it recommended to use Lasso regression over Ridge regression?</p>

<h1>Binary Tree Validation</h1>

<p>You are given the root of a binary tree. You need to determine if it is a valid binary search tree (BST).</p>

<p>A valid BST is defined as follows:</p>

<ul>
<li>The left subtree of a node contains only nodes with values less than or equal to the node’s value.</li>
<li>The right subtree of a node contains only nodes with values greater than or equal to the node’s value.</li>
<li>Both the left and right subtrees must also be binary search trees.</li>
</ul>

<p>Given the function <code>def is_valid_bst(root: Node) -&gt; bool:</code>, return True if the binary tree is a valid BST. Otherwise, return False.</p>

<h3>Example:</h3>

<p><strong>Input:</strong></p>

<p><img src="https://d2qpirhrfplx04.cloudfront.net/9664817c-3e12-4a40-9236-8382b13f55a0.png" alt="Converted Binary Tree"/></p>

<p><strong>Output:</strong></p>

<pre tabindex="0" class="chroma"><code><span class="line"><span class="cl"><span class="n">is_valid_bst</span><span class="p">(</span><span class="n">Node</span><span class="p">(</span><span class="mi">3</span><span class="p">))</span> <span class="o">-&gt;</span> <span class="kc">True</span>
</span></span></code></pre>
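
<p>A minimal sketch of one possible solution passes the allowed value range down the tree. The <code>Node</code> class below is an assumption (the question does not show its definition), and the bounds are inclusive because the definition above allows equal values on both sides.</p>

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    # Hypothetical node type; the real class may use different field names.
    value: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def is_valid_bst(root: Optional[Node]) -> bool:
    """Check that every node lies within the bounds set by its ancestors."""

    def check(node, low, high):
        if node is None:
            return True
        # Inclusive bounds: equal values are allowed by the stated definition.
        if low is not None and node.value < low:
            return False
        if high is not None and node.value > high:
            return False
        return (check(node.left, low, node.value)
                and check(node.right, node.value, high))

    return check(root, None, None)
```

<p>Checking only each node against its direct children is not enough: a value deep in the left subtree can still exceed the root, which is why the bounds are carried down from every ancestor.</p>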


Given the root node, verify if a binary search tree is valid or not.

Binary Tree Validation

<p>What’s the relationship between PCA and K-means clustering?</p>


<p>What does the variable “k” in k-means clustering refer to?</p>


Input of K-means

PCA and K-Means

<p>Let’s say you’re a data engineer at Fidelity Investments, and you’re running a SQL query on a cloud-based data warehouse. All cluster resources and network health metrics look normal, but the query is still taking over 10 minutes to complete.</p>

<p>How would you go about diagnosing and improving the performance of this query?</p>


How would you diagnose and speed up a slow SQL query when system metrics look healthy?

Slow SQL Query

Query Optimization

<p>Let’s say you work as a data scientist at a bank.</p>

<p>You are tasked with building a decision tree model to predict if a borrower will pay back a personal loan they are taking out.</p>

<ol>
<li><p>How would you evaluate whether using a decision tree algorithm is the correct model for the problem?</p></li>

<li><p>Let’s say you move forward with the decision tree model. How would you evaluate the performance of the model before deployment and after?</p></li>
</ol>


Decision Tree Evaluation

<p>Given the <code>employees</code> and <code>departments</code> tables, write a query to get the top 3 highest employee salaries by department. If a department contains fewer than 3 employees, the top 2 or the top 1 highest salaries should be listed (assume that each department has at least 1 employee).</p>

<p><em>Note: The output should include the full name of the employee in one column, the department name, and the salary. The output should be sorted by department name in ascending order and salary in descending order.</em> </p>

<p><strong>Example:</strong></p>

<p><strong>Input:</strong></p>

<p><code>employees</code> table</p>

<table>
<thead>
<tr>
<th>Column</th>
<th>Type</th>
</tr>
</thead>

<tbody>
<tr>
<td><code>id</code></td>
<td>INTEGER</td>
</tr>

<tr>
<td><code>first_name</code></td>
<td>VARCHAR</td>
</tr>

<tr>
<td><code>last_name</code></td>
<td>VARCHAR</td>
</tr>

<tr>
<td><code>salary</code></td>
<td>INTEGER</td>
</tr>

<tr>
<td><code>department_id</code></td>
<td>INTEGER</td>
</tr>
</tbody>
</table>
<p><code>departments</code> table</p>

<table>
<thead>
<tr>
<th>Column</th>
<th>Type</th>
</tr>
</thead>

<tbody>
<tr>
<td><code>id</code></td>
<td>INTEGER</td>
</tr>

<tr>
<td><code>name</code></td>
<td>VARCHAR</td>
</tr>
</tbody>
</table>
<p><strong>Output</strong>:</p>

<table>
<thead>
<tr>
<th>Column</th>
<th>Type</th>
</tr>
</thead>

<tbody>
<tr>
<td><code>employee_name</code></td>
<td>VARCHAR</td>
</tr>

<tr>
<td><code>department_name</code></td>
<td>VARCHAR</td>
</tr>

<tr>
<td><code>salary</code></td>
<td>INTEGER</td>
</tr>
</tbody>
</table>
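
<p>One possible approach ranks employees within each department using a window function. The sketch below runs the query through Python's <code>sqlite3</code> module and assumes a SQLite build with window-function support; the demo rows, the helper name, and the tie-breaking by <code>id</code> are assumptions for illustration.</p>

```python
import sqlite3

TOP_THREE_QUERY = """
WITH ranked AS (
    SELECT
        e.first_name || ' ' || e.last_name AS employee_name,
        d.name AS department_name,
        e.salary AS salary,
        ROW_NUMBER() OVER (
            PARTITION BY e.department_id
            ORDER BY e.salary DESC, e.id  -- assumption: ties broken by id
        ) AS salary_rank
    FROM employees AS e
    JOIN departments AS d ON d.id = e.department_id
)
SELECT employee_name, department_name, salary
FROM ranked
WHERE salary_rank <= 3
ORDER BY department_name ASC, salary DESC
"""


def top_three_salaries_demo():
    """Run the query against made-up rows and return the result list."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE departments (id INTEGER PRIMARY KEY, name VARCHAR);
        CREATE TABLE employees (
            id INTEGER PRIMARY KEY, first_name VARCHAR, last_name VARCHAR,
            salary INTEGER, department_id INTEGER
        );
        INSERT INTO departments VALUES (1, 'engineering'), (2, 'marketing');
        INSERT INTO employees VALUES
            (1, 'Ann', 'Lee', 120, 1), (2, 'Bob', 'Kim', 110, 1),
            (3, 'Cy', 'Roe', 100, 1), (4, 'Di', 'Fox', 90, 1),
            (5, 'Ed', 'Ray', 150, 2), (6, 'Flo', 'Day', 80, 2);
    """)
    return conn.execute(TOP_THREE_QUERY).fetchall()
```

<p>Departments with fewer than 3 employees simply contribute fewer rows, since the rank filter never excludes them.</p>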


Get the top 3 highest employee salaries by department

Top Three Salaries

<p>You’re given a string that may contain the characters <strong><code>{</code></strong>, <strong><code>}</code></strong>, <strong><code>[</code></strong>, <strong><code>]</code></strong>, <strong><code>(</code></strong>, and <strong><code>)</code></strong>.</p>

<p><strong>Task:</strong> Verify that the string is balanced. A balanced string is one where every opening character, <strong><code>{</code></strong>, <strong><code>[</code></strong>, or <strong><code>(</code></strong>, has a corresponding closing character, <strong><code>}</code></strong>, <strong><code>]</code></strong>, or <strong><code>)</code></strong>.</p>

<p>Write a function called <strong><code>is_balanced(string: str) -&gt; bool</code></strong> which verifies the balance of a string.</p>

<p><strong>Example:</strong></p>

<pre tabindex="0" class="chroma"><code><span class="line"><span class="cl"><span class="n">is_balanced</span><span class="p">(</span><span class="s1">&#39;(())[]</span><span class="si">{}</span><span class="s1">&#39;</span><span class="p">)</span> <span class="o">-&gt;</span> <span class="kc">True</span>
</span></span></code></pre>

<pre tabindex="0" class="chroma"><code><span class="line"><span class="cl"><span class="n">is_balanced</span><span class="p">(</span><span class="s1">&#39;{([()</span><span class="si">{}</span><span class="s1">])()}&#39;</span><span class="p">)</span> <span class="o">-&gt;</span> <span class="kc">True</span>
</span></span></code></pre>

<pre tabindex="0" class="chroma"><code><span class="line"><span class="cl"><span class="n">is_balanced</span><span class="p">(</span><span class="s1">&#39;</span><span class="si">{}</span><span class="s1">[]())&#39;</span><span class="p">)</span> <span class="o">-&gt;</span> <span class="kc">False</span>
</span></span></code></pre>
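
<p>A stack-based sketch of one possible solution follows. It assumes that "balanced" also requires brackets to close in the correct nesting order, which is consistent with the examples above, and that any other characters in the string are ignored.</p>

```python
def is_balanced(string: str) -> bool:
    """Return True if every bracket is closed by the matching type, in order."""
    pairs = {")": "(", "]": "[", "}": "{"}
    openers = set(pairs.values())
    stack = []
    for ch in string:
        if ch in openers:
            stack.append(ch)
        elif ch in pairs:
            # A closer must match the most recently opened bracket.
            if not stack or stack.pop() != pairs[ch]:
                return False
        # Assumption: non-bracket characters are skipped.
    # Any opener left on the stack was never closed.
    return not stack
```

<p>Each character is pushed or popped at most once, so the check runs in time linear in the length of the string.</p>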

<hr/>


Write a function that tests whether a string of brackets is balanced.

The Brackets Problem

<p>In the context of hypothesis testing, what are type I errors (type one errors) and type II errors (type two errors)? What is the difference between the two?</p>

<p><em>Bonus: Describe the probability of making each type of error mathematically.</em></p>
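
<p>For the bonus, the two error probabilities are conventionally written as conditional probabilities, with <code>H_0</code> denoting the null hypothesis:</p>

```latex
\alpha = P(\text{reject } H_0 \mid H_0 \text{ is true})
\qquad
\beta = P(\text{fail to reject } H_0 \mid H_0 \text{ is false})
\qquad
\text{power} = 1 - \beta
```

<p>Here <code>α</code> is the type I error rate (a false positive) and <code>β</code> is the type II error rate (a false negative).</p>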


What is the difference between type I and type II errors?

Type I and II Errors

<p>Given two sorted lists, write a function to merge them into one sorted list.</p>

<p><em>Bonus: What’s the time complexity?</em></p>

<p><strong>Example:</strong></p>

<p><strong>Input:</strong></p>

<pre tabindex="0" class="chroma"><code><span class="line"><span class="cl"><span class="n">list1</span> <span class="o">=</span> <span class="p">[</span><span class="mi">1</span><span class="p">,</span><span class="mi">2</span><span class="p">,</span><span class="mi">5</span><span class="p">]</span>
</span></span><span class="line"><span class="cl"><span class="n">list2</span> <span class="o">=</span> <span class="p">[</span><span class="mi">2</span><span class="p">,</span><span class="mi">4</span><span class="p">,</span><span class="mi">6</span><span class="p">]</span>
</span></span></code></pre>

<p><strong>Output:</strong></p>

<pre tabindex="0" class="chroma"><code><span class="line"><span class="cl"><span class="n">merge_list</span><span class="p">(</span><span class="n">list1</span><span class="p">,</span><span class="n">list2</span><span class="p">)</span> <span class="o">-&gt;</span> <span class="p">[</span><span class="mi">1</span><span class="p">,</span><span class="mi">2</span><span class="p">,</span><span class="mi">2</span><span class="p">,</span><span class="mi">4</span><span class="p">,</span><span class="mi">5</span><span class="p">,</span><span class="mi">6</span><span class="p">]</span>
</span></span></code></pre>
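
<p>A two-pointer sketch of one possible solution is shown below. It assumes both inputs are already sorted in ascending order and does not modify them.</p>

```python
def merge_list(list1, list2):
    """Merge two ascending lists into one ascending list."""
    merged = []
    i = j = 0
    # Repeatedly take the smaller of the two front elements.
    while i < len(list1) and j < len(list2):
        if list1[i] <= list2[j]:
            merged.append(list1[i])
            i += 1
        else:
            merged.append(list2[j])
            j += 1
    # At most one of these slices is non-empty.
    merged.extend(list1[i:])
    merged.extend(list2[j:])
    return merged
```

<p>Calling <code>merge_list([1, 2, 5], [2, 4, 6])</code> returns <code>[1, 2, 2, 4, 5, 6]</code>. Every element is visited once, so the running time is O(m + n) for lists of lengths m and n.</p>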


<p>You have two sorted lists, one with m elements and one with n elements. What would be the time complexity of merging the two lists via merge sort?</p>


Merging Lists Complexity

<p>Let’s say you want to test the close friends feature on Instagram Stories.</p>

<p>How would you make a control group and test group to account for network effects?</p>


<p>What could be a potential risk when Facebook segments the metrics of the user interface change by market or demographic groups?</p>

Network Experiment Design I

<p>What is the primary metric that Facebook should monitor if it decides to make the user interface of its posting feature more like Instagram's?</p>

Network Experiment Design II

<p>If Facebook changes its user interface to mimic Instagram's, which adverse effect should they anticipate and monitor?</p>

Network Experiment Design III

<p>Let’s say you’re analyzing an AB test that has both a test group and a control group.</p>

<ol>
<li><p>How do you calculate the sample size necessary for an accurate measurement?</p></li>

<li><p>Let’s say that the sample size is similar and sufficient between the two groups. In order to measure very small differences between the two, should the power get bigger or smaller?</p></li>
</ol>
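
<p>For a conversion-rate metric, a common textbook approximation for the per-group sample size can be sketched as follows. The function name, its parameters, and the default values of <code>alpha</code> and <code>power</code> are assumptions for illustration, and the formula relies on the normal approximation for a two-sided test of two proportions.</p>

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_group(p_control, min_detectable_diff, alpha=0.05, power=0.8):
    """Approximate users needed in each group to detect an absolute lift.

    p_control: baseline conversion rate, e.g. 0.10
    min_detectable_diff: smallest absolute difference worth detecting, e.g. 0.02
    """
    p_variant = p_control + min_detectable_diff
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_beta = NormalDist().inv_cdf(power)           # desired power
    variance = p_control * (1 - p_control) + p_variant * (1 - p_variant)
    return ceil((z_alpha + z_beta) ** 2 * variance / min_detectable_diff ** 2)
```

<p>Because the minimum detectable difference appears squared in the denominator, halving it roughly quadruples the required sample size, and raising the target power increases the requirement as well.</p>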


<p>What is a confidence interval for a statistic? Why is it useful to know the confidence interval for a statistic and how do you calculate it?</p>
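
<p>As an illustration, a confidence interval for a sample mean can be sketched with the Python standard library. The function name is hypothetical, and the sketch uses the normal approximation; for small samples a t-distribution would give a somewhat wider interval.</p>

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_confidence_interval(sample, confidence=0.95):
    """Return (low, high) for the sample mean using a normal approximation."""
    n = len(sample)
    center = mean(sample)
    standard_error = stdev(sample) / sqrt(n)
    # Two-sided critical value, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return center - z * standard_error, center + z * standard_error
```

<p>Raising the confidence level widens the interval, while collecting more data shrinks the standard error and narrows it.</p>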


What are confidence intervals and how are they useful?

Confidence Interval Explanation

<p>Let’s say you’re tasked with pitching a new feature for Google Home. Your co-worker comes to you with an idea to build a game feature for Google Home.</p>

<p>How would you go about deciding whether Google should build it?</p>


Game Feature Home

Business Case

<p>Let’s say we want to build a new delivery time estimate model for consumers ordering food delivery.</p>

<p>How would you determine if the new model predicts delivery times better than the old model?</p>


<p>When rolling out a new delivery time estimate model in production, what is an important consideration?</p>

Delivery Estimate Model I

<p>What is one way to measure the performance of a new delivery time estimate model for food delivery?</p>

Delivery Estimate Model II

<p>How would you choose the <em>k</em> value when using <em>k</em>-means clustering?</p>


Choosing the k value in k-means clustering

Choosing k

<p>Let’s say we’re testing a new UI with the goal to increase conversion rates. We test it by giving the new UI to a random subset of users.</p>

<p>The test variant wins by 5% on the target metric. What would you expect to happen after the new UI is applied to all users? Will the metric actually go up by ~5%, more, or less?</p>

<p><em>Note: Assume there is no novelty effect.</em></p>


New UI Effect

Challenge

<h3>Job Description</h3><div>Job Description<div><p><strong>Overview</strong></p><div><div><div><div><div><div><p>BigBear.ai is seeking a Data Scientist Subject Matter Expert (SME) to lead a team of data scientists and engineers conducting data analytics, data engineering, data mining, exploratory analysis, predictive analysis, and statistical analysis using scientific techniques to correlate data into graphical, written, visual, and verbal narrative products to enable more informed analytic decisions. The successful candidate will demonstrate a refined ability to define problems, supervise studies, and lead surveys to collect and analyze data to provide advice and recommend solutions. Additionally, the candidate will demonstrate analytic leadership and expertise in identifying, planning, developing, and executing analytic production methodologies, tradecraft, and techniques aligned with the customer’s mission. This position will be based out of Washington, D.C. or the greater National Capital Region (NCR) and is an opportunity to get in on the “ground level” of a new and exciting program with one of our customers.</p></div></div></div></div></div></div><div><div><div></div></div></div><br /><p><strong>Responsibilities</strong></p><div><div><div><div><div><div><p>In addition to the above, duties for this position typically include: creating various ML-based tools or processes, such as recommendation engines or automated lead scoring systems. Perform statistical analysis, apply data mining techniques, and build high-quality prediction systems. The successful candidate should be skilled in data visualization and use of graphical applications, including Microsoft Office (Power BI) and Tableau; major data science languages, such as R and Python; managing and merging of disparate data sources, preferably through R, Python, or SQL; statistical analysis; and data mining algorithms.
Additionally, the candidate should have prior experience with large data Multi-INT analytics, ML, and automated predictive analytics.</p></div></div></div></div><div><div></div></div></div></div><br /><p><strong>Qualifications</strong></p><ul><li><p>Must possess a TS/SCI clearance with a CI poly</p></li><li><p>Master’s degree in data science, data engineering, mathematics, or another related field (an additional 5 years of experience may be substituted for this requirement)</p></li><li><p>Minimum of 12 years of experience conducting analysis using data science or engineering, with at least a portion of experience within the last 2 years</p></li><li><p>Deep understanding and experience using Python, R, and/or SQL</p></li><li><p>Prior experience with large data, multi-INT analytics, machine learning, and automated predictive analytics</p></li><li><p>Demonstrable leadership and management experience on teams of junior to mid-range data scientists, engineers, or analysts</p></li></ul></div></div>

<h3>Job Description</h3><div>Job Description<div><p><strong>Overview</strong></p><p><strong>BigBear.ai</strong> is seeking a <strong>Software Engineer </strong>to join an ever-evolving project where you will design, develop, and deploy mission-critical tools and solutions. This role involves working closely with customers to understand and support operational requirements, as well as analyzing large data sets to deliver mission-centric insights. You will collaborate with analysts, data scientists, and other software engineers to integrate cutting-edge technologies and solutions that drive impactful results.</p><br /><p><strong>Responsibilities</strong></p><ul><li><strong>Software Development:</strong> Write production-grade software to be deployed in operational environments</li><li><strong>Maintenance & Monitoring:</strong> Monitor software performance and implement updates as needed</li><li><strong>Platform Enhancement:</strong> Augment the platform with new tools and technologies to meet evolving mission needs</li><li><strong>Code Reviews:</strong> Perform thorough code reviews to ensure quality and compliance</li><li><strong>Stakeholder Collaboration:</strong> Elicit requirements from stakeholders to align solutions with mission objectives</li><li><strong>Documentation:</strong> Write detailed documentation to support compliance processes and operational workflows</li></ul><br /><p><strong>Qualifications</strong></p><ul><li><strong>Clearance:</strong> Must possess and maintain an active <strong>TS/SCI w/ Polygraph</strong></li><li><strong>Education & Experience:</strong> Bachelor's degree in a relevant field, plus <strong>production-grade software development experience</strong> in <strong>C/C++, Python, and Java</strong></li><li><strong>Version Control:</strong> Familiarity with software version control systems, such as <strong>Git</strong></li><li><strong>UNIX/Linux Proficiency:</strong> Experience using the command line in <strong>UNIX or Linux 
environments</strong></li><li><strong>Networked Systems:</strong> Experience working in multi-node networked environments</li></ul></div></div>

<h3>Job Description</h3><div>Job Description<div><p><strong>Overview</strong></p><p><strong>BigBear.ai</strong> is hiring full stack and specialized software engineers. As a Software Engineer, you will thrive in a dynamic environment where your contributions directly enhance the experience for our customers. The organization is focused on discovery of opportunities and enabling activities in numerous technical fields. You will spearhead the design and delivery of cutting-edge custom applications that drive powerful data visualization, streamline workflow automation, and enhance mission cognizance for vital operations. You will work closely with the mission customer and provide support as needed on a wide variety of tasks.</p><p>This position will report to an on-site location in Annapolis Junction, MD; an active TS-SCI clearance with Poly is required.</p><br /><p><strong>Responsibilities</strong></p><ul><li>Craft innovative scripts and analytics that unlock transformative insights into complex challenges.</li><li>Discover and synthesize data from diverse sources to seamlessly integrate into both new and existing tools, enhancing our capabilities.</li><li>Innovate and design intuitive UIs and visualizations that not only fill gaps in our customer toolsets but elevate the overall user experience.</li><li>Elicit requirements and feedback directly from end-users and other stakeholders.</li><li>Collaborate with other teams to provide expertise where needed and leverage the experience of the broader team.</li></ul><br /><p><strong>Qualifications</strong></p><ul><li>A relevant Technical Degree coupled with 5+ years of hands-on experience in a demanding environment is essential.</li><li><strong>Clearance: Must possess and maintain a TS-SCI clearance with a polygraph</strong></li><li>Proven experience with a variety of programming languages and cutting-edge technology stacks, including:<div>Java, C#, C/C++, Python, Spring, JavaScript, React, Django, Docker, Kubernetes, Git, 
Subversion, Jira, Confluence, databases (graph, relational, noSQL)</div></li></ul></div></div>

<h3>Job Description</h3><div>Job Description<div><p><strong>Overview</strong></p><p>BigBear.ai is seeking a Senior Data Scientist to conduct data analytics, data engineering, data mining, exploratory analysis, predictive analysis, and statistical analysis using scientific techniques to correlate data into graphical, written, visual, and verbal narrative products to enable more informed analytic decisions. The successful candidate will proactively retrieve information from various sources, analyze it for better understanding, and assist with building AI tools that automate key processes. This position will be based out of Washington, D.C. or the greater National Capital Region (NCR). This is an opportunity to get in on the “ground level” of a new and exciting program with a critical Department of War (DoW) customer.</p><br /><p><strong>Responsibilities</strong></p><p>Duties typically include: creating various ML-based tools or processes, such as recommendation engines or automated lead scoring systems. Perform statistical analysis, apply data mining techniques, and build high quality prediction systems. The successful candidate should be skilled in data visualization and use of graphical applications, including Microsoft Office (Power BI) and Tableau; major data science languages, such as R and Python; managing and merging of disparate data sources, preferably through R, Python, or SQL; statistical analysis; and data mining algorithms. 
Additionally, the candidate should have prior experience with large data Multi-INT analytics, ML, and automated predictive analytics.</p><div><div><div><div></div></div></div></div><br /><p><strong>Qualifications</strong></p><ul><li><p>Must possess a TS/SCI clearance with a CI poly (or ability to obtain)</p></li><li><p>Bachelor’s degree in data science, data engineering, mathematics, or another related field</p></li><li><p>Minimum of 8 years of experience conducting analysis using data science or engineering, with at least a portion of experience within the last 2 years</p></li><li><p>Deep understanding and experience using Python, R, and/or SQL</p></li><li><p>Prior experience with large data, multi-INT analytics, machine learning, and automated predictive analytics</p></li></ul></div></div>

<h3>Job Description</h3><div>Job Description<div><p><strong>Overview</strong></p><p>BigBear.ai is seeking a Data Scientist to conduct data analytics, data engineering, data mining, exploratory analysis, predictive analysis, and statistical analysis using scientific techniques to correlate data into graphical, written, visual, and verbal narrative products to enable more informed analytic decisions. The successful candidate will proactively retrieve information from various sources, analyze it for better understanding, and assist with building AI tools that automate key processes. This position will be based out of Washington, D.C. or the greater National Capital Region (NCR). This is an opportunity to get in on the “ground level” of a new and exciting program with a critical Department of War (DoW) customer.</p><br /><p><strong>Responsibilities</strong></p><p>Duties typically include: creating various ML-based tools or processes, such as recommendation engines or automated lead scoring systems. Perform statistical analysis, apply data mining techniques, and build high quality prediction systems. The successful candidate should be skilled in data visualization and use of graphical applications, including Microsoft Office (Power BI) and Tableau; major data science languages, such as R and Python; managing and merging of disparate data sources, preferably through R, Python, or SQL; statistical analysis; and data mining algorithms. Additionally, the candidate should have prior experience with large data Multi-INT analytics, ML, and automated predictive analytics. 
</p><br /><p><strong>Qualifications</strong></p><ul><li><p>Must possess a TS/SCI clearance with a CI poly (or ability to obtain)</p></li><li><p>Bachelor’s degree in data science, data engineering, mathematics, or another related field</p></li><li><p>Minimum of 3 years of experience conducting analysis using data science or engineering (an additional 4 years of experience may be substituted for the above education requirements)</p></li><li><p>Deep understanding and experience using Python and/or R</p></li><li><p>Prior experience with large data, multi-INT analytics, machine learning, and automated predictive analytics</p></li></ul></div></div>

Calculate Moving Average	SQL	Easy
Predict Customer Churn	Machine Learning	Medium
A/B Test Significance	Statistics	Medium
Optimize Query Performance	SQL	Hard
Feature Importance Analysis	Machine Learning	Medium
Clean Missing Data	Python	Easy
Neural Network Architecture	Deep Learning	Hard
Calculate Cohort Retention	SQL	Medium
Bayesian Probability	Statistics	Easy
Recommend Similar Products	Machine Learning	Hard

Bigbear.ai Opening Jobs

Discussion & Interview Experiences
