Tuesday, April 5, 2022

Why I Call Myself a Data Scientist

This week, I am at the INFORMS Business Analytics conference, one of the two conferences I attend regularly as an Operations Research PhD and enthusiast. In fact, INFORMS conferences are the only ones I have attended at all since graduation. In my work, I identify myself as a data scientist. What is interesting about these two facts is that INFORMS is not even on the map when it comes to data science.


What I find most valuable about my background in Operations Research is that by the end of your PhD for sure, and likely after a Master's, you have internalized one key lesson: the problem is always up for discussion. Unfortunately, you don't receive that lesson explicitly. Instead, what you get is a series of courses focused on "reformulating" problems. For example, you learn that linear problems are the easiest to handle mathematically, so you use your training to rewrite problems as linear subproblems. After spending 2-6 years rewriting problems, it becomes crystal clear that the first way you think of writing down a problem is unlikely to be the best.


This mindset puts you in a place to succeed as a data scientist (and also as a consultant). Traditionally, the best data scientists are able to take a business problem and understand how to leverage ML and tools from analytics to solve that problem. Data science training programs focus primarily on teaching you how to use ML algorithms and write code. This puts Operations Researchers in an odd position: they have many of the hard-to-find business-side skills that make exceptional data scientists, but often lack the ML skills that are considered "table stakes" for these roles.


As a result of all this, I call myself a data scientist. It is expedient, and people generally hand me the right kinds of problems when I market myself that way. At some point they catch on, or I warn them, that not all data scientists approach problems the same way I do. Depending on the setting, I go so far as to explain Operations Research and why they should consider hiring OR professionals for their data science needs. However, what I really wish is that the mindset of OR would become the common framework for all data scientists. By articulating the notion that the problem is always up for discussion, you start to realize how much of your value comes not just from solving the problem you were asked to solve, but from getting to the why and how along the way.
