Related Jobs Optimization
Let’s say that you work at a job board.
A PM wants to build a “related jobs” feature on every individual job description page. This “related jobs” feature would be a sidebar of jobs that are most related to the current job the user is browsing.
You have a couple of ideas on how to find related jobs, drawing on NLP concepts such as bag of words and word embeddings. The PM also clarifies that “related jobs” are defined as other jobs that are similar in position title and job description.
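For readers less familiar with the bag-of-words idea mentioned above, here is a minimal sketch of how it could score similarity between two jobs using only their title and description text. The helper names (`bag_of_words`, `cosine_similarity`) and the three sample postings are hypothetical, invented for illustration; a real system would likely add weighting such as TF-IDF, or use embeddings instead.

```python
import math
import re
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Lowercase the text and count how often each word appears."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    # Counter returns 0 for missing words, so only shared words contribute.
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical postings: title and description concatenated into one string.
current = bag_of_words("Senior Data Engineer build data pipelines in Python and SQL")
similar = bag_of_words("Data Engineer maintain ETL pipelines using Python")
unrelated = bag_of_words("Registered Nurse provide patient care in a hospital")

score_similar = cosine_similarity(current, similar)
score_unrelated = cosine_similarity(current, unrelated)
print(score_similar > score_unrelated)
```

The score rewards shared vocabulary, which matches the PM's definition only loosely: two postings that describe the same role in different words would score low, which is one reason word embeddings are on the list as well.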
However, when presented with the problem, you quickly realize that millions of new jobs are posted on the job board each day, and finding the top 10 related jobs for every one of them could be extremely inefficient.
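To make the scale concrete, the back-of-the-envelope estimate below counts the comparisons needed if every new job were scored against every job in the existing pool. All three figures are assumptions: the job counts take the low end of "millions" and "tens of millions" from the problem statement, and the per-worker throughput is a made-up round number, not a measurement.

```python
# Assumed figures, chosen only to illustrate the order of magnitude.
new_jobs_per_day = 1_000_000        # low end of "millions of new jobs"
existing_pool = 10_000_000          # low end of "tens of millions of jobs"
comparisons_per_second = 1_000_000  # hypothetical throughput of one worker

# Brute force: score every new job against every job in the pool.
pairwise_comparisons = new_jobs_per_day * existing_pool
seconds_needed = pairwise_comparisons / comparisons_per_second
worker_days_needed = seconds_needed / 86_400  # seconds in a day

print(f"{pairwise_comparisons:.0e} comparisons for one day of new jobs")
print(f"{worker_days_needed:.1f} worker-days of compute at the assumed rate")
```

Under these assumptions a single worker would need months to process one day of postings, so an exhaustive pairwise search only keeps up if it is spread over many workers or if the candidate set for each job is narrowed first.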
Explain a system or method that could find the top 10 closest related jobs for each of the millions of new jobs posted per day.
Note: Assume an existing pool of tens of millions of jobs that could be related to each new job. Also assume you have access to all text features of a job, such as title, description, company, date, etc.