I won’t be discussing the specific qualifications and skills you need to be a data scientist; there are many resources on this subject and it depends on what kind of work you’re interested in. Instead, I’ll talk about my journey into data science (DS) and the general mindset and habits that helped me break into the field. In the end, I’ll go over some takeaways that I hope will help you break into DS regardless of your background.
I never intended to transition into DS; neither did I have the traditional background or education to do so. Luckily, my engineering background taught me how to program and think critically, but more importantly how to learn and persevere. I learned almost everything on my own through reading papers and working on side projects. I also couldn’t have done this without my mentors and their honest and constructive feedback.
After a couple of years of studying software engineering (SE), I joined Yelp as an SE intern, working on DS-related projects. Around a year later, I joined Uber as a DS intern and graduated shortly after that. This is my story of how I transitioned into DS, why I decided to switch back into SE, and what I’ve learned over this six-year period.
In 2012, I started studying at the University of Waterloo, majoring in mechatronics engineering. I was always fascinated by how you could positively impact people’s lives by building things that directly helped them. I initially thought it could only be done through building physical things, like robots, but I eventually realized you could achieve similar goals through software; that’s why in 2014 I switched into SE from mechatronics engineering.
Shortly after starting SE, I started hearing about machine learning (ML). My interest in ML would drive me to start learning about it in my free time, albeit at a surface level. In parallel, I continued to learn how to become a better software engineer, mostly through internships.
My journey into ML started uneventfully. I failed to complete the infamous Machine Learning course by Andrew Ng and failed to finish an undergraduate computer vision research project. At least I passed the Introduction to Statistics course, the only statistics course I’d ever take in university. Since statistics is a foundational component of ML and DS, at least one thing was going right.
This was an unproductive time in my transition into DS. I was more focused on securing an SE internship in the US. In the winter of 2015, I finally landed an internship at a startup in Mountain View, California. I built a simple recommendation system, using k-NN, and an analytics dashboard. Working on these projects showed me how data and analytics can be used to derive insights that help make great products. This piqued my interest enough that I finally became more serious about DS and ML.
In the fall of 2015, I landed an SE internship at Yelp. I joined the traffic quality team, which had a broad goal of identifying and preventing fraud and abuse. I was lucky to have gotten to work on DS-related projects, even though I was hired as an SE intern.
I struggled a bit during the internship, but I learned a lot while I was there. I learned about supervised and unsupervised ML, statistical model building, how to conduct a rigorous exploratory analysis, and the infrastructure used to manage large amounts of data. I learned that it’s critical to understand your data and the analysis methods, or else things may not work as expected. As an engineer, it usually suffices to treat methods and data as black boxes and abstractions — but this doesn’t always work in DS. For example, some methods and their parameters only work on specific types of data and entail certain assumptions.
At the time, I started reading ML papers so I could more effectively use tools like random forests, k-means and logistic regression during my internship. I didn’t consider this a real DS internship because I lacked foundational knowledge, didn’t collaborate with many colleagues, and needed a lot of guidance.
My experience at Yelp gave me the confidence to tackle more challenging projects. At our Yelp hackathon, my team and I built a logistic regression classifier to identify SLAPP businesses. This taught me that retrieving and massaging your data is just as important as the procedures or algorithms — if not more important. At another hackathon, my team and I built a chat bot for Messenger (before Messenger’s virtual assistant was a thing); it was able to answer queries and execute commands. In 2016, I worked with a postdoc for a few months to build a facial recognition system that runs on mobile devices using novel deep learning methods. For a school project, our team chose to build a conversation analysis tool on top of Messenger APIs that gave insights into different conversations, like sentiments, topics and frequent words.
After successfully completing these projects and doing another SE internship at Snap in the summer of 2016, I decided it was finally time to pursue something new. I thought it could be something in ML and not DS.
In the fall of 2016, I was considering only SE and ML internships. After attending the Uber DS information session, I realized it could be a great opportunity because of the interesting projects data scientists worked on and how talented the people seemed. I decided to apply. It would end up being the only DS position I ever applied for.
I still wasn’t that invested in the Uber DS internship, for several reasons. I was focused on landing SE and ML internships, so I didn’t have time to prepare for DS interviews. I knew this DS internship was highly sought-after; there was only one position, but hundreds of applicants (this was visible on our university’s job application board). I was competing with many competent and passionate peers with a formal DS background. One advantage of not being too invested, though, was that it gave me a lot of peace of mind during the interview process; usually I would become anxious just thinking about interviews.
Shortly after applying to Uber, I was given the DS challenge. It involved writing SQL, designing an experiment, and conducting an exploratory analysis, all of which were related to Uber. This made it novel and interesting; I actually learned a few things while doing this challenge. After I submitted my solutions, the recruiter reached out to schedule a one-hour interview, which I felt went okay. After a few weeks, the recruiter told me I was their first pick for the internship. I was surprised and ecstatic!
I realized my experiences finally paid off — from the various SE internships, ML side projects, and DS work at Yelp. I’d say these experiences more than made up for my lack of a traditional DS background; they were what made my DS background unique.
At this point, I had to decide between an SE and a DS internship. I saw DS as a way to grow my skills in a way that differentiated me from other software engineers, like learning more about analyses, the latest research, ML and statistics. I saw DS as an opportunity to learn a broader field than I originally intended. With everything I’d learned over the previous couple of years, I suddenly realized I was well positioned to succeed at Uber. For these reasons, I decided to accept the Uber DS internship offer for the winter of 2017.
It was a great internship. I learned from some of the best in the industry and had the opportunity to work on interesting and challenging problems. It was similar to my Yelp experience, except there was more emphasis on independence, presentations, communicating results and collaboration. I was more confident in my DS abilities at Uber. This confidence helped me work on more ambitious DS projects after my internship, like our SE class profile. At this point, I was seriously considering going into DS full-time.
In the fall of 2017, I had my last internship, which was an SE internship at WhatsApp. I was graduating in 2018, and the first question I had to answer was: should I go into DS or SE?
In the end, I decided to go with SE at WhatsApp. SE satisfied my desire to build things that impacted people. This feeling was rekindled during my internship at WhatsApp because of the ability to ship products that instantly impacted billions of users. I didn’t get that feeling as a data scientist because of the extra indirection to the end-user, although you do greatly influence the product through analyses and insights. I also observed that the ratio of engineers to data scientists was usually several to one. SE was still a highly in-demand field; with more positions, I thought it would provide more career stability.
I felt I was more strongly positioned in SE, compared to DS, because of my background and experience. There are many backgrounds that are well-suited for DS, which made it competitive in its own way. I saw that the best data scientists working on the most interesting problems typically had a PhD in physics, economics or operational research. If I wanted to achieve what they’d achieved, I’d have to work really hard; I wasn’t sure if I was passionate enough about DS to do that. I didn’t see this as giving up on DS, but as capitalizing on SE and my strengths.
It’s been two years since I decided to go into SE, and I can say with confidence that it was the right decision. More importantly, I don’t regret the time I invested in DS; it was still a great experience and I’d do it all again in a heartbeat. If I could sum up my journey into a few takeaways, this is what I would say.
You should adapt your learning style to fit best with what you’re trying to accomplish. I found the best way to learn DS and ML is by reading research papers and working on real projects; actions speak louder than words. Build consistency around learning, like studying every day. Set concrete goals for what you want to learn and accomplish, like reading one paper per week.
Ask your mentors for honest and constructive feedback, especially in areas you want to excel in. If you don’t have a mentor, you can potentially find one through work or through mutual connections; usually, it’s best to have an introduction through a connection, rather than sending a cold email. Make sure you’re upfront, set clear expectations, and come to an agreement with your current and/or potential mentors.
Stability could mean being stable career-wise, financially, emotionally and/or physically. I was comfortable in engineering and where I was in life; this gave me the space to explore, experiment and fail in DS and ML. Stability relieves you from the stress and pressure to hastily succeed on your first attempt. Make sure you’re happy and established; it’s easier to try something new if you have a safety net.
I tried out many things without much direction. Once the Uber DS opportunity arose, I realized I was well positioned to take it. Keep your options open and be patient; something great might be just around the corner.
Everybody has to start somewhere. It’s challenging because most of the time you’re expected to know how to do the job, even before you’ve been hired. This can be overcome by teaching yourself enough to get the job, then learning everything else on the job, just like I did at Yelp and Uber. Learning on the job is usually higher quality because you get to solve real problems, have access to company resources, and collaborate with and learn from colleagues. After enough time and perseverance, you’ll eventually become the real deal and no longer need to fake it.
I hope my journey shows that, with a bit of hard work and serendipity, you can set yourself up to take advantage of those unexpected opportunities. Best of luck on everything you want to accomplish, and I hope this has helped!