“Everyone’s a data scientist–if they have the right tools,” a well-known business publication commented on social media, referencing an article on data democratization.
This is like saying everyone is a driver if he or she has the right car. While that may technically be true, you first need to learn how to drive, and then you need a driver’s license (well, at least legally in most places). Even that says nothing about whether you know where you are going; you need a map or good directions. Some people are bad with directions; some are simply bad drivers. You could be the world’s best driver with the best car, but without the right map and directions, you have no chance of reaching your destination. What if, one day, all GPS maps cease to exist?*
Outside of the political context, Webster defines “democratic” as “relating to, appealing to, or available to the broad masses of the people.” The comment above by the publication implies that the democratization of analytics is equivalent to the democratization of data plus tools. However, this is true only if you define analytics to consist strictly of tools that can replace all understanding. Democratization works differently for data than for analytics: data is a tangible asset, while analytics is a set of human activities performed on that asset, which may or may not include tools (technically speaking, it is possible to do 100% of analytics by human power only).
Then how should analytics be made available to the broad masses? You can put a car in everyone’s hands, but not everyone should drive. On the other hand, access to the benefits of vehicles can be nearly universal, as even those who cannot drive a car can generally ride in one. When it comes to analytics, however, far too many people equate democratization with universal permission to drive (i.e., execution of analytics), rather than universal access to its benefits. This has led to perhaps one of the most critical problems with analytics today: your ability to carry out analysis tasks says nothing about whether the analysis is valid, and the recent trend of actively shifting the focus away from analysis validity and directly toward technology is troubling for the future of decision-making. Not too long ago, I came across a blog touting a “Big Data Easy Button”; again, without the right analysis design, it is simply an easy button for executing the wrong analyses. Even if the analysis is correct, its benefits often remain out of reach of those who need the insights to make decisions.
The recent re-emergence of the p-value controversy is another case in point. For those who recall p-values from that one statistics class, the rule was simple: a p-value less than 0.05 meant a statistically significant result. However, the p-value says nothing about whether the analysis itself is valid in the first place. The fact that you can apply the mechanics of the statistical analysis to obtain the p-value does not validate your conclusion, much like the fact that you can operate a vehicle says nothing about whether you can actually reach the intended destination. Unfortunately, the p-value has achieved statistical easy-button status; the fixation on p-values routinely leads to false conclusions, driving the editor of a well-known periodical to call for banning p-values altogether. The problem is not with the tool but with the users of the tool and the context in which it is used. Banning the tool does not eradicate bad users of statistics or fix the context, and the true benefits of statistics still never reach those who really need them.
While there is so much attention on doing analytics, many people remain far out of reach of its benefits. People are convinced that one must drive rather than ride for analytics to be universally accessible, and many continue to wait for a ride that will never come, while bad drivers clog up the streets, not knowing where they are going and causing massive pile-ups. Show me a place where everyone with a great car is a great driver who knows exactly where he or she is going, and I will show you an analytics easy button that makes everyone a data scientist.
*P.S. What did we do before GPS?