There is an inside joke between my husband and me about the infomercials touting that you can learn to play the piano in a flash. He (jokingly) threatens to achieve in a mere four hours what took me many years of blood, sweat, and tears (I was a professionally trained classical pianist in my previous life), but for now, it remains an empty threat (thankfully).
I think it is fair to say that most reasonable people understand these programs do not turn a complete newbie into a professional pianist in only a few hours. I have always strongly encouraged people to learn and enjoy playing the piano, as that also enables them to appreciate the work of other musicians more deeply and perhaps even collaborate under the right circumstances. However, offering the skill as a professional service for a fee is a different story, and it would be irresponsible for me to encourage that in someone who has had only cursory training.
The same is true of data science (or anything else for that matter). Learning should be encouraged so that one can appreciate the field and understand its potential more intelligently, which is needed today simply to stay competitive. However, the line between intelligent appreciation and hard skills is becoming increasingly blurred, with unsolicited “Learn Data Science in X Days/Weeks” advertisements showing up on my feeds daily. Along with the popularity of analytics democratization, is this becoming another factor that could threaten the integrity and eventual well-being of advanced analytics?
It is curious that there seems to be anything from tacit acceptance to enthusiastic encouragement of this in data science. To be fair, I do not believe that these short programs are put together under the presumption that they turn a complete novice into a fully competent data scientist. That said, the expectations are usually not clearly articulated, and I am rather annoyed by what is essentially a marketing tactic that takes advantage of the hype. It plays very well into the rapid-results culture that often encourages shortcuts.
I recognize that not everyone in these programs is starting from scratch, and those with an adjacent background and just a few missing skills have a much easier time transitioning into this much-coveted discipline. There are other factors, obviously. We can also question what one means by a “data scientist” (let’s not get started on that here), but suffice it to say that what a business needs from a “data scientist” runs a wide gamut, not all of which is about learning specific algorithms or programming languages. However, I expect any “data scientist” to have the following hard competencies at the minimum:
- Solid understanding of probability concepts, on which any analysis design is heavily dependent regardless of the methodology (statistical or otherwise) ultimately employed.
- Solid ability to code, whatever the language. A data scientist must be comfortable getting around very messy raw data, big or small; it is the science of data, after all. The specific language is secondary, as long as its strengths and weaknesses are understood. What is more important is one’s ability to logic his or her way through a messy pile of data while programming efficiently and in a well-structured manner; one can always learn another language. (A recent comment hinted that data scientists had to program in Python. Nonsense. I once coded something entirely in Base SAS just to prove a point and, of course, because I could.)
I purposely left out analysis techniques from the criteria. This is where short courses are perfectly suited–you can always learn techniques. But you need the above two first and foremost, and their development is not measured in weeks.
Can you learn data science in a flash? Like you can learn to play the piano in a flash.