JP Morgan Chase is one of the leading financial services companies worldwide, with assets totaling over $2 trillion. Their data science team is a hub for innovation and research, dedicated to resolving critical financial issues.

Data scientists at the company work in a variety of roles and teams, including fraud detection, business process automation, and customer experience. With a heavy focus on machine learning, data modeling, and analytics, they leverage big data to drive business impact and improve financial services for a range of clients.

Write a function called **find_bigrams** that takes a sentence or paragraph and returns a list of all its bigrams in order. A bigram is a pair of consecutive words.
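One possible approach is sketched below. It assumes that words are separated by whitespace and that bigrams should be lowercased; the question does not specify either point, so adjust the tokenization as needed.

```python
from typing import List, Tuple


def find_bigrams(text: str) -> List[Tuple[str, str]]:
    """Return all pairs of consecutive words, in order of appearance."""
    # Assumption: words are whitespace-delimited and lowercased.
    words = text.lower().split()
    # Pair each word with its successor; zip stops at the shorter sequence,
    # so a text with fewer than two words yields an empty list.
    return list(zip(words, words[1:]))
```

Calling `find_bigrams("Have free hours")` under these assumptions gives `[("have", "free"), ("free", "hours")]`.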

You have a large 100 GB log file and want to determine the total number of lines it contains. Describe how to do this in Python.
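A file of this size will not fit in memory, so one common approach is to stream it in fixed-size chunks and count newline bytes. The sketch below is a minimal illustration; the function name `count_lines` and the 1 MiB default chunk size are assumptions, not part of the question.

```python
def count_lines(path: str, chunk_size: int = 1024 * 1024) -> int:
    """Count lines in a file by streaming it in binary chunks."""
    total = 0
    last_byte = b"\n"  # treat an empty file as having no unterminated line
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling f.read until it returns b"".
        for chunk in iter(lambda: f.read(chunk_size), b""):
            total += chunk.count(b"\n")
            last_byte = chunk[-1:]
    # A final line without a trailing newline still counts as a line.
    if last_byte != b"\n":
        total += 1
    return total
```

Reading in binary mode avoids decoding overhead, and memory use stays bounded by `chunk_size` regardless of file size.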

Given a text document in the form of a string, write a program to determine the `term_frequency` (TF) values for each term, rounding the TF to two decimal places.
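As a rough sketch, term frequency can be taken as a term's count divided by the total number of terms in the document. The version below assumes lowercase, whitespace-separated tokens and does not strip punctuation; a real answer may need a different tokenizer.

```python
from collections import Counter
from typing import Dict


def term_frequency(document: str) -> Dict[str, float]:
    """Map each term to count / total terms, rounded to two decimals."""
    # Assumption: terms are lowercase, whitespace-separated tokens.
    terms = document.lower().split()
    if not terms:
        return {}
    counts = Counter(terms)
    total = len(terms)
    return {term: round(count / total, 2) for term, count in counts.items()}
```

For the input `"the cat and the hat"`, the term `the` appears 2 times out of 5, so its TF is 0.4, and each other term has a TF of 0.2.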

*To practice coding interview questions, consider using the Python learning path or the full list of coding questions in our database.*

Let’s say you work for a bank that gives out personal loans. Your co-worker develops a model that takes in customer inputs and returns whether a loan should be given.

Another co-worker believes that they have a better model for predicting loan defaults. Since personal loans have monthly payment installments, how would you compare the performance of these two credit risk models over time? What success metrics would you track for the new model?

Suppose you’re comparing two machine learning algorithms. In what situation would you prefer using a bagging algorithm versus a boosting algorithm? Give an example illustrating the trade-offs between these two algorithms.

You’re a data scientist at a bank tasked with creating a decision tree model to predict if borrowers will repay their personal loans. How would you evaluate whether the decision tree algorithm is the best fit for this task? If you proceed with this model, how would you assess its performance before and after deployment?

*To get ready for machine learning interview questions, we recommend taking the machine learning course.*

You are working for an e-commerce store where the new-user-to-customer conversion rate increased from 40% to 43% after a redesigned email journey. A few months prior, the conversion rate had dropped from 45% to 40%. Analyze and determine whether the increase in the conversion rate was due to the redesigned email campaign or other factors.

Your company is looking to create a new partner card (e.g. Starbucks Chase credit card) with a different organization. Given customer spending data, how would you decide what the partner card should be?

You work for a credit card company that wants to connect with more businesses that accept their credit card. There are 100,000 merchants you can reach out to, but your team currently only has the capacity to connect with 1,000 of them. How would you determine which businesses are your best options?

*To practice for case studies, check out the product metrics learning path and the data analytics learning path.*

You’re given a table representing a company’s payroll schema. When salaries were adjusted for the year, new rows were accidentally inserted instead of the existing rows being updated. Write a query to get the current salary for each employee.

*Note: There are no employees with duplicate full names, and assume the INSERT operation works with ID autoincrement.*

Given an **annual_payments** table, answer the following:

- How many total transactions are in this table?
- How many different users made transactions?
- How many transactions listed as `"paid"` have an amount greater than or equal to 100?
- Which product made the highest revenue? Use only transactions with a `"paid"` status.

Given a **transactions** table representing individual user product purchases, find the number of customers who made additional purchases after their initial one, excluding those who bought multiple products on the same day.

*To continue practicing SQL interview questions, try the SQL learning path and the full list of SQL questions and solutions in our interview questions database.*

In a certain casino dice game, you roll a die once and can then choose whether to re-roll. If you re-roll, you earn the amount equal to the number on your second roll. If you don’t, you’re given the amount of your first roll. Assuming that you follow a profit-maximizing strategy, what would be your expected win amount?
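One way to check an answer to this question is to compute it exactly. The sketch below assumes a fair die, at most one optional re-roll, and that the decision is made after seeing the first roll; the function name `expected_win` is illustrative only. The profit-maximizing rule under these assumptions is to keep the first roll only when it exceeds the expected value of a fresh roll.

```python
from fractions import Fraction


def expected_win(sides: int = 6) -> Fraction:
    """Expected payout with one optional re-roll, played optimally."""
    faces = range(1, sides + 1)
    # Expected value of a single fresh roll (3.5 for a six-sided die).
    reroll_value = Fraction(sum(faces), sides)
    # For each first roll, take the better of keeping it or re-rolling.
    best_payouts = [max(Fraction(face), reroll_value) for face in faces]
    return sum(best_payouts) / sides


print(expected_win())  # 17/4, i.e. 4.25
```

With a six-sided die, the player re-rolls on 1, 2, or 3 and keeps 4, 5, or 6, which gives (3.5 + 3.5 + 3.5 + 4 + 5 + 6) / 6 = 4.25.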

Imagine a deck of 500 cards numbered from 1 to 500. If all the cards are shuffled randomly, and you are asked to pick three cards, one at a time, what’s the probability that each subsequent card is larger than the previous one?
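By symmetry, any three distinct cards are equally likely to be drawn in each of their 3! = 6 orders, and exactly one of those orders is increasing, so the deck size should not matter. The sketch below checks this by brute force on small decks; enumerating all ordered draws from 500 cards would be slow, so the smaller deck sizes are an assumption made purely for illustration.

```python
from fractions import Fraction
from itertools import permutations


def prob_increasing(deck_size: int, draws: int = 3) -> Fraction:
    """Exact probability that cards drawn without replacement are increasing."""
    cards = range(1, deck_size + 1)
    favorable = 0
    total = 0
    # Enumerate every ordered draw of `draws` distinct cards.
    for hand in permutations(cards, draws):
        total += 1
        if all(a < b for a, b in zip(hand, hand[1:])):
            favorable += 1
    return Fraction(favorable, total)


print(prob_increasing(5))   # 1/6
print(prob_increasing(10))  # 1/6
```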

You’re given a coin that produces heads 30% of the time. What is the probability that the coin lands as heads exactly 5 times out of 6 tosses?
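This is a binomial probability: the number of ways to choose which 5 of the 6 tosses are heads, times the probability of each such sequence. A minimal sketch, assuming independent tosses (the helper name `binomial_pmf` is illustrative):

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Exactly 5 heads in 6 tosses with P(heads) = 0.3:
# 6 * 0.3**5 * 0.7, which is about 0.0102
print(round(binomial_pmf(5, 6, 0.3), 6))
```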

*To prepare for Probability and Statistics interview questions, we recommend the statistics and A/B testing learning path and the probability learning path. These resources cover a wide range of topics, from basic probability concepts to advanced statistical analysis techniques.*

Practice for the JP Morgan Chase interview with these recently asked interview questions.

Most data science positions fall under different position titles depending on the actual role.

From the graph we can see that, on average, the Machine Learning Engineer role pays the most, with a $142,188 base salary, while the Product Analyst role pays the least, with an $84,875 base salary.