The academic trap and data science

How to get a data science job after academia without industry experience


In recent years I’ve been going to college mathematics departments across the country to give talks on how to get into data science. My presentations are targeted at undergraduate math students who don’t know what they are going to do with the degree they’ll soon have (which also describes how I was as an undergrad). Despite the target audience, I always run into at least one person who has a graduate degree and wants help getting out of academia. The question they pose is usually something like:

I have been spending my life working towards being a tenured professor. I have spent years doing research and publishing papers, all for the goal of an academic life. But now I realize that isn’t what I want, and instead I would like to go into industry. I don’t have any industry experience; how do I get in?

They may be a PhD student, a post-doc, an adjunct, or even a tenure-track professor, but regardless of their status the question is the same. These people are in a place I like to call the academic trap. The academic trap is when your career trajectory is so specialized for academia that you’re unprepared for a job outside of it. The academic trap happens in all areas of study, but for this post I will focus only on math and statistics students who want to leave academia for data science positions, since that’s what I am most familiar with.

Academia is a place where many people are competing for few positions, and to get a position you need to put all your energy into becoming the best candidate. That means prioritizing writing papers over internships and writing grants over learning programming languages, and skipping the things that could help you in industry but not academia. When someone who has been focusing on academia decides to go into industry, they are at a serious disadvantage. The things that are important for academic hiring, such as papers, talks, and grants, are not taken into consideration when hiring in industry. Further, companies are often hesitant to hire people coming straight from academia for a number of reasons.

Besides companies’ hesitancy to hire people straight from academia, there is the fact that leaving academia is terrifying! Academia has its own culture and norms which people get used to. Further, within academic settings there is the notion that leaving academia means you are a failure, which in reality couldn’t be further from the truth. These norms make it extremely difficult to have a good transition out.

If you have gotten this far and feel like this is a call-out post: breathe, you are not alone and this is a solvable problem. The point I am trying to make is that the transition from academia to industry is a very difficult one, but getting out of the academic trap and into data science can be done!

The first thing to realize is that the required skills to be a data scientist are lower than you think. Plenty of articles on the internet suggest that to be a data scientist you need to understand a master’s-level collection of algorithms, be a wizard at programming, have a deep understanding of business, and so on. But in reality, most data science jobs require very little deep knowledge; what they require instead is the ability to adapt and learn new things. As I discuss in my series of posts about [how I hire data scientists], when I hire I don’t require someone to have a degree in data science; I just want to see that they have the basic skills and can learn how to get things done.

In fact, to get a data science job in industry all you need is to convince someone you are equipped to handle it. Convincing someone you can handle a job requires two things: showing you have the prerequisite knowledge and showing you have experience in doing similar work. Let’s dive into those two components in more detail.

Prerequisite knowledge for a data science job

On the programming side, that means learning either R or Python. There is plenty of material on the internet on which of the two to learn, but honestly if you learn one it’s easy enough to pick up the other one later. In academia you may have been using a different scientific programming language like MATLAB, SPSS, or, god help you, FORTRAN, but those really aren’t substitutes. The point you are trying to make to employers is that you won’t have a major problem adapting to a new job, so learn the tools that companies use. In addition to R or Python I would learn a bit of SQL, since most companies keep their data in databases that you query with SQL.

On the techniques side, I would learn three things:

  1. **How to join, filter, and aggregate tables.** No matter what sort of data science you’re doing, you’ll be processing data sets from different sources. Learning the fundamental ideas of how to connect data sets together, such as doing an inner join between two tables, will be essential to being a data scientist. These concepts aren’t difficult, but they’ll end up being the majority of what you do on an hourly basis.

  2. **Linear and logistic regressions.** These two techniques are for understanding the relationships between variables in your data. A linear regression helps you understand a continuous variable, for example how the square footage and location of a house relate to the continuous variable of house price. A logistic regression helps you understand a binary variable, for example how customer spend relates to the binary variable of whether they made a follow-on purchase or not.

  3. **How to make a good plot.** Whatever programming language you use, you’ll need to take that data and visualize it. Understand how to make a bar chart, a scatter plot, and other simple visualizations so that you can explore the data.
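To give a feel for the first technique, here is a minimal sketch that joins, filters, and aggregates two tiny tables using SQL from Python’s built-in `sqlite3` module. The table names, column names, and values are all made up for illustration; a real job would point the same kind of query at the company’s own database.

```python
import sqlite3

# In-memory database holding two made-up tables (hypothetical names and values).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, region TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'West'), (2, 'East'), (3, 'West');
    INSERT INTO orders VALUES
        (10, 1, 20.0), (11, 1, 35.0), (12, 2, 50.0), (13, 3, 5.0), (14, 4, 99.0);
""")

query = """
    SELECT c.region, COUNT(*) AS n_orders, SUM(o.amount) AS total_spend
    FROM orders AS o
    INNER JOIN customers AS c ON o.customer_id = c.customer_id  -- join the tables
    WHERE o.amount >= 10                                        -- filter small orders
    GROUP BY c.region                                           -- aggregate by region
    ORDER BY c.region
"""
spend_by_region = conn.execute(query).fetchall()
conn.close()

for region, n_orders, total_spend in spend_by_region:
    print(region, n_orders, total_spend)
```

Notice that order 14 belongs to a customer who isn’t in the `customers` table, so the inner join silently drops it. Spotting that kind of thing is a big part of the day-to-day work.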

Those three techniques will get you quite far in data science. Other methods such as factor analysis, text analytics, and deep learning can all come later. Again, for more detail check out my [series on how I hire].
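If the regression idea feels abstract, here is a rough sketch of both kinds in plain Python, using the house price and follow-on purchase examples from above. The numbers are invented and the function names are my own for this illustration. In practice you would use a library such as statsmodels or scikit-learn in Python, or `lm` and `glm` in R, rather than writing the fitting yourself.

```python
import math


def fit_linear(xs, ys):
    """Ordinary least squares with one predictor: returns (slope, intercept)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict_probability(weight, bias, x):
    """Logistic curve: turns a linear score into a probability between 0 and 1."""
    return 1 / (1 + math.exp(-(weight * x + bias)))


def fit_logistic(xs, ys, learning_rate=0.1, steps=5000):
    """Logistic regression with one predictor, fit by plain gradient descent."""
    weight, bias = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        preds = [predict_probability(weight, bias, x) for x in xs]
        grad_w = sum((p - y) * x for p, x, y in zip(preds, xs, ys)) / n
        grad_b = sum(p - y for p, y in zip(preds, ys)) / n
        weight -= learning_rate * grad_w
        bias -= learning_rate * grad_b
    return weight, bias


# Made-up data: house size in square feet vs. price in thousands of dollars.
square_feet = [1000, 1500, 2000, 2500, 3000]
price_thousands = [200, 270, 360, 430, 500]
slope, intercept = fit_linear(square_feet, price_thousands)

# Made-up data: customer spend in hundreds of dollars vs. whether they bought again.
spend_hundreds = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
bought_again = [0, 0, 1, 0, 1, 1]
weight, bias = fit_logistic(spend_hundreds, bought_again)

print(f"Each extra square foot adds about ${slope * 1000:.0f} to the price")
print(f"P(follow-on purchase at $300 spend) = {predict_probability(weight, bias, 3.0):.2f}")
```

The fitted slope answers the linear question (how much price moves per square foot), and the fitted probability answers the binary one (how likely a follow-on purchase is at a given spend).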

It’s not enough to show employers that you know the right techniques; you also need to show them that you can work in a corporate environment. Corporate work is unstructured compared to undergraduate and master’s-level studies: you aren’t given concrete tasks that you’re graded on; instead you’re given a broad assignment and need to figure out what the right approach is. Compared to PhD-level research, a corporate environment is extremely structured: you have deadlines for deliverables and you can’t stall because you’re waiting to get the perfect result. So corporate work isn’t similar to anything that someone who has only been in academia has done before.

The traditional way of showing a company you can handle a corporate environment is having worked in corporate environments before. When recruiters look at resumes, the first thing they gravitate towards is your previous work experience. This makes sense, since the life experience most similar to your next job is your previous job. The problem people in the academic trap have is that they don’t have any corporate experience to draw on. If you’re going to leave academia you’ll likely need something that looks at least vaguely similar to work experience.

There are three main ways you can get something that can be a substitute for work experience in the eyes of employers:

Do a side project yourself

Find a project you’re passionate about and try to do something on the side. This could be something like analyzing data from the Untappd app, algorithmically generating offensive license plates, or doing a text analysis of Jane Austen. As long as it’s a topic you are interested in and it involves some semblance of data science it should be good. By having a side project, you’ll be forced to learn how to use the tools and techniques of industry. You can use online courses to learn the basics, and then figure out the rest on the way. You’ll also have to figure out how to go from an idea to an actual result you share with the world. You can put this on your resume and use it as a point of experience that you talk to employers about.

Do a data science bootcamp

There are bootcamps popping up all over that will teach you data science fundamentals over three months; Galvanize in Seattle is one example. They generally cost around $15,000 and go through an introduction to programming and different data science techniques, then end with a capstone project. Employers will use this capstone as validation that you can do data science.

Find an internship

There are companies out there which hire graduate students for internships, and this often ends up including post-docs or other early career academics. These internships run over the summer and you are basically a very junior employee. You’ll have job experience on your resume, and companies will look very favorably on that when hiring.

With these suggested routes, hopefully you now feel like you have ways to get out of the academic trap. While this post has contained general advice on ways out, everyone’s situation is different. If you find yourself in this predicament, I recommend talking directly to someone who has an industry data science position. The best source of guidance is people who have taken the path you want to take. Finally, when you do get an opportunity to leave academia, make sure you are thoughtful about how you let people know. People often react negatively, and there can be repercussions if you aren’t careful. Good luck!

If you want a ton of ways to help grow a career in data science, check out the book Emily Robinson and I wrote: *Build a Career in Data Science*. We walk you through getting the skills you need to be a data scientist, finding your first job, then rising to senior levels.
