Why we are afflicted with data science degrees

April 14, 2013

A friend sent me an article about the master's and graduate certificate programs in data science springing up around the country. I think it was meant solely to stir me up. He knows me well.

We'll come back to the peculiar thing that is "data science" later. Let's look at these programs first. They're teaching basic probability and descriptive statistics, how to design a study and analyze it, how to make decent plots, linear regression in its various forms, and enough understanding of programming to get some work done. Some add a bit of domain knowledge about business.

That's excellent material, and the undergraduate students at the University of Washington or Northwestern or Columbia or the various other schools offering these programs should be screaming bloody murder, or at least demanding their tuition back. Those aren't graduate topics! Those are the basics you should expect from anyone with a technical degree! Okay, if you hired someone who studied a basic science like physics or chemistry you might not expect the business knowledge, but an engineer had better have it.

These topics are very much teachable to undergraduates. A good hunk of the data science certificate gets taught to physics majors in one semester of their second or third year at the University of Virginia as "Fundamentals of scientific computing". I single out the University of Virginia's class as an example because I happened to be there when it started in 2005, and remember talking about what should be in it with Bob Hirosky, its creator. My friends were the teaching assistants.

And the topics in these certificates are the basics, not the advanced material. Not that there aren't legions of professional analysts out there with less statistical skill and no knowledge of programming, but no one would dream of giving them a title other than "Excel grunt"—sure, gussied up somehow to stroke their ego, but that's what it comes down to.

So, we have a failure of the academy. Nothing new there. The rise of data science itself is a peculiar one, though. It was reified into existence by a couple of guys doing the dismal work of mathematically stalking people at Facebook and LinkedIn, though Google got in on the name game pretty quickly. Who can blame them? If your job is doing something as puerile as getting people to click on ads, I don't begrudge you any vestige of self-respect you may try to grasp. Yes, your life would be better spent getting your plumber's certificate and doing something constructive, but I recognize that it's hard to make big changes. But let's not pretend that anyone would let you anywhere near the census or running a clinical trial.

Once data science was reified, the fight was on for who got to decide what it was. I know a few of the conceptions:

There are more. The full study of the factions and their interplay would make a very interesting sociology or history thesis.

But the fact that we're arguing over this at all is a symptom of the failure of technical education today. That's right: if you're a professor in a technical department, it is your fault that these certificates exist. You have failed your students, and the world is paying a price in buzzwords.

