My experience as a new grad trying to become a data scientist was terrible. I was a senior in college graduating with an electrical engineering degree. The last thing I wanted to do was touch an oscilloscope or look at a printed circuit board for eight hours a day. Data science was the hottest new trend, and I was willing to do anything to get my foot in the door as a data scientist. My problem, however, was the same as every other new grad’s: I had zero experience.
One year later, I found myself as the first data scientist at a ten-person startup with only three other engineers on the team. Officially, I was hired as a data scientist, but my role after day one turned out to be surprisingly more geared toward software engineering. They had hired another data scientist a few months prior but had to fire him on his first day when he didn’t want to deploy his own models. That process just didn’t work when we only had four engineers including the CTO.
Now I’m five years out of college, working as a data scientist, and I still haven’t needed a Ph.D. or master’s degree to work in data science. I want to share a complete guide to how it’s possible and what it takes to become a data scientist as a new grad, especially in this day and age of a recession caused by Covid-19. As difficult as it might seem to become a data scientist just out of college without a Ph.D. or master’s, it’s still very doable!
All it takes, after all, is landing your first data science job!
(But mainly because you don’t have any)
Let me recount a quick story.
During my senior year, I attended a STEM career fair to look for jobs. I stumbled across one company’s booth advertising that they were hiring for a variety of software positions and some data scientist roles. Let’s call them hotStartupB2b.
I ended up talking to someone who was an engineer. He took my resume and slowly started shaking his head as he read it.
“If you’re in undergrad, you probably don’t have enough experience for the data scientist role,” he said.
“Yeah, I’m trying to get more data science experience,” I replied nervously.
“Sorry, but generally all of our data scientists have PhDs,” he snarked.
“Okay, but you’d think with such high demand, the role is going to balance out the insane requirements right now,” I argued. “I mean, the number of job openings for data scientists outstrips the number of people who have PhDs by a huge factor.”
“I disagree. It’s like any other occupation where you need the training and skills. Think about a pilot. You still need to go through the training to become a pilot. You can’t just fly the plane without being qualified because there are actual real consequences. Same thing with becoming a data scientist.”
He then took my resume, and staring right at me, threw it into a trash can.
Just kidding. But he might as well have, because after that encounter I thought he must have been some sort of crazed defender of data scientist exclusivity. What kind of person says that shit? He was brutal in defending his company’s position on hiring for experience, and maybe it was well thought out, based on hiring junior data scientists before and then firing them, but honestly… PROBABLY NOT!!
No one really knows what a data scientist is supposed to do, or how much experience they should have for a job that mainly involves solving business problems using data. And like all new things that come up, people want the best and brightest if they’re going to risk hiring for a few roles, versus the many software engineers that exist. What the smug career fair jerk was really trying to say was that it’s hard to trust new grads with zero industry experience in a role where they have to either deploy machine learning models or communicate effectively with the business teams about how to fix their product.
It makes looking for a data scientist job hard because of three things:
So really, the best power you have is your biggest weakness; no experience means you need an extreme eagerness to learn.
This means being interested in learning different tools, trying Kaggle competitions, analyzing datasets, and seeking out opportunities to put yourself in situations to quickly grow your skills.
Back to my journey. I lived off-campus in Seattle and would always have to walk a good 15 minutes from the school buildings back home each day. One day I uncharacteristically took my eyes off my phone and looked up at the sky.
Damn, I thought, there are a lot of new apartment buildings being built. I wonder how much rent is now in Seattle?
Friends had complained about rental prices rising in the university district by UW. Another friend had told me Capitol Hill was by far the most expensive part of Seattle to live in. I asked him by how much.
“I mean who knows man, it’s probably like, 3 times as more expensive!”
Pfft. I thought. He insisted. So my head started churning. I bet I could figure it out. But, exactly how?
Turns out, for one of my projects working as a part-time researcher for a finance PhD student, I was learning how to scrape data from different websites using the Python Scrapy package. Why couldn’t I just use the same skills to scrape Craigslist housing data and run a regression on the price to look at the features for which neighborhoods cost the most?
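For a rough idea of what that kind of regression looks like, here’s a minimal sketch. It assumes the scraping is already done and the listings are sitting in a list of tuples; the rows, the `fit_rent_model` name, and the choice of bedrooms as the only other feature are all made up for illustration, and it uses NumPy’s least-squares solver rather than whatever the original analysis used.

```python
import numpy as np

# Hypothetical scraped listings: (neighborhood, bedrooms, monthly rent in USD).
# These rows are invented for illustration, not real Craigslist data.
listings = [
    ("Capitol Hill", 1, 1900),
    ("Capitol Hill", 2, 2600),
    ("University District", 1, 1400),
    ("University District", 2, 2100),
    ("Ballard", 1, 1600),
    ("Ballard", 2, 2300),
]


def fit_rent_model(rows):
    """Ordinary least squares: rent ~ intercept + bedrooms + neighborhood dummies."""
    neighborhoods = sorted({name for name, _, _ in rows})
    # The first neighborhood alphabetically is the baseline the others are compared to.
    baseline, others = neighborhoods[0], neighborhoods[1:]

    # Design matrix columns: intercept, bedrooms, one 0/1 column per non-baseline neighborhood.
    X = np.array([
        [1.0, beds] + [1.0 if name == other else 0.0 for other in others]
        for name, beds, _ in rows
    ])
    y = np.array([rent for _, _, rent in rows], dtype=float)

    coefs, *_ = np.linalg.lstsq(X, y, rcond=None)

    # Each premium is the extra monthly rent relative to the baseline neighborhood.
    premiums = {baseline: 0.0}
    premiums.update({other: float(c) for other, c in zip(others, coefs[2:])})
    return float(coefs[1]), premiums


per_bedroom, premiums = fit_rent_model(listings)
for name, premium in sorted(premiums.items(), key=lambda item: -item[1]):
    print(f"{name}: {premium:+.0f} USD/month vs. baseline")
```

The neighborhood coefficients are the part that answers the “how much more expensive?” question: each one is the rent difference against the baseline neighborhood after accounting for bedroom count.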
Guess what. I did just that. And I wrote about it in a blog post that got to the front page of Hacker News. This leads me to my second point…
Imagine you’re a CEO of a startup. Imagine that this startup sells data science work to other companies. Data science is very in demand, but unfortunately, no one knows your startup exists. How do you market your company to other companies?
Well, you could try paid marketing on Facebook and Google, but that’s kind of expensive and you don’t have any money. Email marketing works as well, but you gotta take the time to collect a list of people. Plus they might think you’re spamming them.
What about blogging? Or social media marketing? Or providing value to the company by giving them some free information at first? This is marketing.
At the end of the day, when you are looking for a job, you are essentially selling yourself as a product. You are your own product, providing value to a business that is paying you money. Your resume is a description of your product, along with favorable reviews from your past customers. If you don’t have any past customers, it’s going to be harder! But it won’t be impossible.
Remember how I couldn’t imagine working on an oscilloscope or PCB board? Well, that eventual boredom and hate of electrical engineering translated into my real college life, and I got a sweet 2.3 in one of my circuits classes. Have you ever had a low grade in a class? UW graded on a bell curve, and I distinctly remember looking at my test paper and then looking up to see it circled on one end of a normal distribution.
But the reason I had the low grade was that I was spending exactly ZERO brain hours on learning electrical engineering. I had just discovered the pandas library, and it had opened up a star in what was the black hole of my brain. Data was suddenly so easy to read in, manipulate, munge, join, and turn into insights and analytics that I found an endless number of interesting projects I could jump into.
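To give a sense of why pandas felt that way, here’s a small sketch of the read, manipulate, join, and summarize loop. The CSV text, the `areas` lookup table, and the column names are all invented for illustration; only the pandas calls themselves (`read_csv`, `merge`, `groupby`) are real.

```python
import io

import pandas as pd

# Made-up CSV text standing in for a scraped file; not real data.
listings_csv = io.StringIO(
    "neighborhood,bedrooms,rent\n"
    "Capitol Hill,1,1900\n"
    "Capitol Hill,2,2600\n"
    "University District,1,1400\n"
    "University District,2,2100\n"
)

# Hypothetical lookup table to join against.
areas = pd.DataFrame({
    "neighborhood": ["Capitol Hill", "University District"],
    "near_campus": [False, True],
})

# Read the data in.
listings = pd.read_csv(listings_csv)

# Manipulate: derive a new column from existing ones.
listings["rent_per_bedroom"] = listings["rent"] / listings["bedrooms"]

# Join: attach the lookup columns to every listing.
joined = listings.merge(areas, on="neighborhood", how="left")

# Summarize: average rent per neighborhood.
summary = joined.groupby("neighborhood")["rent"].mean()
print(summary)
```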
Apartment rental pricing was just the beginning. I refined my scraping skills from Craigslist to scraping basketball statistics, Reddit posts, and anything and everything. If it was HTML, I would parse it like no other. I was so down for these projects. So I was working on the baseline: improving my data skills in Python and R, trying cool projects, and scraping data. But I still didn’t have too much on my resume in data land. I wanted to land an internship in data science so it could set me up better for a data-related job, but it never materialized.
I started applying for jobs and writing intense cover letters about my love for data.
If you hire me, I will love data so much that you’ll see me in the office until 2 am wrangling the hell out of data!
No calls back. Why? I thought it was because they didn’t understand my intense emotional connection to data. I needed to prove it somehow. I figured my cover letters were hitting the infamous black hole of recruiting software. With my applications buried deep in the system, it was time to try attending a career fair again and network with the employees.
I attended a career fair where I talked to an employee at Socrata. He was very supportive of my data science ambition but only had software engineering internships available. I agreed to try it out anyway. A week later a recruiter from Socrata sent me a coding assignment to complete. I completed it, turned it in, and got a very quick rejection email. This began my hate of take-home challenges. They never told me why I failed them, and I got zero feedback on what was wrong with my code.
Two months later I found this cool Seattle police reports dataset. Lo and behold, it was provided by none other than Socrata. I downloaded the past few years of crime data, overlaid it on some pretty maps of Seattle, built a generic model to predict the urgency of a crime, and after a week of work threw it on my blog and went to class.
The next day someone from the local news saw that the Chief of Police in Seattle had retweeted my article and reached out to do a quick news segment. Then GeekWire featured a story on it. Suddenly, Socrata wanted me to come onsite for an interview again. The engineer I pair programmed with in the interview got to the first method of my code, and we sat there stuck for an hour on why I used the Python append method instead of a list comprehension. I probably failed the entire interview, but it didn’t matter. The only thing that mattered was that at the end of the day, the VP of marketing came up to me and said:
“So… we were just wondering, why did you feel the need to write a blog about this dataset?”
I shrugged my shoulders. He smiled. I got an internship there the next day.
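For anyone who hasn’t run into the append-versus-list-comprehension debate from that interview, here’s a minimal sketch of the two styles. The records and function names are hypothetical and are not the code from the interview; the point is only that both produce the same list.

```python
# Hypothetical report records; the field names are made up for illustration.
reports = [
    {"offense": "theft", "priority": 3},
    {"offense": "assault", "priority": 1},
    {"offense": "noise", "priority": 5},
]


def urgent_with_append(rows, cutoff=2):
    """Build the result step by step with list.append."""
    urgent = []
    for row in rows:
        if row["priority"] <= cutoff:
            urgent.append(row["offense"])
    return urgent


def urgent_with_comprehension(rows, cutoff=2):
    """Same filter and projection, written as a list comprehension."""
    return [row["offense"] for row in rows if row["priority"] <= cutoff]


print(urgent_with_append(reports))
print(urgent_with_comprehension(reports))
```

The comprehension is usually considered the more idiomatic choice for a simple filter like this, but the loop is not wrong, which is probably why an hour on it felt so long.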
Marketing, marketing, marketing. It’s all we do to sell anything. It’s what I’m doing right now to sell this data science prep product to you! I got lucky, and I used a combination of my writing and interest to go viral on the internet with some data as a naive data science kid. But that was it. This is what I did:
The day after the Seattle local news interviewed me, I got ten-plus emails from techies in the area who reached out to see if I wanted to work with them. It was like I was drowning in job offers and demand. SpaceX, Rover.com, Tune, and all these random startups and companies were offering me chances to interview because they could see how interested I was in data science. They didn’t give a shit about experience or qualifications. All they wanted was verification of my dedication to the field and craft. And they saw it through reading the blog post.
Now you know the secrets.