Hiring data scientists (part 2): the perfect candidate doesn't exist

The skills I look for, ranked from most to least important, and the types of people who apply for data science jobs

[This is part 2 of my series on hiring data scientists. If you haven’t read part 1 yet, do that first.]

In part 1, I talked about all of the skills I want in an ideal candidate. But since that candidate doesn’t exist, I have to prioritize which attributes matter most in the candidates I hire.

When you look at it like this, it seems reasonable.

The ability to get things done [required]

Getting things done means that when I give you something to work on you will:

This is more important than any amount of technical or business skill because it builds trust. I don’t care what someone is capable of if I can’t trust that they’ll actually do it. If they don’t know how to ask for help or are too afraid to, they will burn time waiting until I notice things aren’t getting done.

Intelligence [required]

I need someone who can realize when they have a knowledge gap, and who then has the motivation and skill to fill it. If a candidate can’t notice their own shortcomings and overcome them, there will always be a limit to what they can do. I want a person who can take a task, realize it can be improved, and then teach themselves how to improve it. For instance, if they’re required to run a report each week, I want them to notice that it could be automated. I then want them to have the ability to learn how to automate it themselves.

Programming and databases [important]

Programming is the most important hard skill I hire for. It’s essential for advanced modeling, working on large datasets, and interacting with APIs. Programming requires deft thinking and constant application of new tools. The ability to code doesn’t just show me the person can write a program; it also shows me they have some skill in learning, solving puzzles, and thinking critically.

Math and statistics [optional]

It feels dirty to say this, but it isn’t necessary to know math or statistics to be a good data scientist. Someone can aggregate data and make slick visualizations without knowing what a linear regression is. Most candidates who can get stuff done, are intelligent, and have programming skills can pick up math and stats as they start working, moving from exploratory data analysis to model building. Sure, they may never be able to fine-tune a deep learning model, but data science encompasses so much more than that.

Business expertise [very optional]

Business expertise is the skill I am most happy to shrug off. The ability to work and communicate with others will be picked up after having spent a few years working within a company. And if they never do? That’s fine. There’s plenty of value in being a team member who builds models and cleans data without reporting out.

Data science archetypes

When you spend a good hunk of your time interviewing for data science positions, you notice archetypes emerge. Here are the highly stereotyped classifications I’ve observed again and again, as well as some advice for what to do if you happen to fall into one of these categories.

The business analyst

This person does most of their work in Excel. They take data and manipulate it into charts, then make PowerPoints from these and present them. Maybe they pull their own data using SQL, and maybe they don’t.

The business intelligence person

This person sets up databases and creates dashboards to present business metrics and KPIs. Other people take their dashboards and use them to make decisions.

The data scientist

This person knows how to program and how to build machine learning models. If their only experience is building production models, chances are they don’t have business expertise and instead build the models that other people tell them to.

Next, the archetypes for people coming from school and looking for their first job.

Person coming out of school with a freshly-minted STEM degree

Hopefully this person has a degree in math, statistics, or CS, but as a substitute, economics, physics, or engineering will suffice. They may have had an internship or two, but they haven’t worked full time before.

The academic

This person was on an academic track but is exiting it for industry. This could mean anything from leaving partway through a PhD program to having held a post-doc.

MBA with a business analytics focus

This person got an MBA, but wanted to learn about what’s relevant these days, so they got a degree that focuses on analytics. Business school taught them how to use Excel and how to use plug-ins to do things in it like k-means clustering.


So to recap: there are a ton of different people applying for data science positions, and some types of people have more of the skills I want than others. But how do I figure out what knowledge and expertise the people actually have? That will be covered in the upcoming parts of this series:

If you want a ton of ways to help grow a career in data science, check out the book Emily Robinson and I wrote: Build a Career in Data Science. We walk you through getting the skills you need to be a data scientist, finding your first job, and then rising to senior levels.
