In a recent Dilbert strip (April 24, 2015), Dilbert declares: “I found the root cause of our problems… It’s people. They’re buggy.” The thing is, even with all the hype around analytics, this is so true that it almost isn’t funny.
There are two key disconnects in many analytics initiatives today: the business feeling that the data scientists do not understand the business needs, and the data scientists feeling like the business is not listening to what the data have to say. And the gap is not closing nearly as fast as we would like.
What we have is a failure to communicate.
Many data scientists have been hired, and many great algorithms have been developed, many of which have died a lonely death without ever being appreciated. In less dire situations, some analytics is being consumed, but it is far from optimal. What we all need to realize is that analytics is really just a methodology, and a data scientist is just a person with the appropriate skills to apply that methodology. In reality, however, the business and the data scientists often have practical expectations of each other that are not well articulated, resulting in friction that leads to distrust, indifference, or both. With that said, I have a message for the business leaders and another for the data scientists.
To the business leaders: If great analytics happens in the forest, and no one is there to make decisions with it, does it make a business impact? The answer is no. Hiring data scientists does not make analytics happen. In fact, hiring data scientists is not one of the first things I would recommend to any organization starting out with analytics. Analytics is not just-add-water—there must be a culture and an ecosystem along with the right processes and functions in place to make it all work. Unless the organization was built data-driven from the ground up, building an analytics capability will always involve a degree of retrofitting. Until the people are ready to receive analytics, it will not be received.
To the data scientists: Building the best model is your job, but you must keep in mind that it is not the end objective; it is simply a means to the end. You have a specific skill set for which people are willing to pay, and your objective is to help people do better at whatever they are trying to do. There is always a person at the end, and often in between; care for all the people involved, and build a positive relationship with each of them. Build models for others: whether you like it or not, being an expert holder of a skill set means that, with very few exceptions, you have a responsibility as a consultant in some form and thus a responsibility to manage your relationship with the end client. Without people ready to embrace your work, even your most beautiful algorithms will sit idle. Strive to connect with the people involved, and you will find not only that you have an entirely different relationship with your client, but also that you approach the analysis differently.
Being successful in analytics, whether you are a business executive or a data scientist, is not at all about the capability to do analytics. It is about people working together and relating to each other. You have a much better chance of success by doing basic analytics with the right people, processes, functions, and culture, than by doing great analytics without them.
P.S. I’ve used the term “data scientist” here for mere convenience. What it should really be called is another discussion!
Mark Twain wrote: “Figures often beguile me, particularly when I have the arranging of them myself; in which case the remark attributed to Disraeli would often apply with justice and force: ‘There are three kinds of lies: lies, damned lies, and statistics.’”*
I am OK with this statement because I get the intent, and I hope many would agree that the entire scientific discipline of statistics is not a lie. Like any good statistician, I must insist on significant evidence to the contrary before rejecting the null hypothesis that at least some of statistics is honest, and in the absence of such evidence, I am not quite ready to concede that statistics is all a big lie.
Webster defines lying as making “an untrue statement with intent to deceive.” Lack of competence does not make one a liar, so not knowing how to use statistics correctly is a different issue. The key to lying is the “intent to deceive,” and this can take the form of an unwillingness to face reality. This past week I heard multiple references to anecdotes of someone’s desire to make the results look “not so bad”; it can also go the other way, to make someone else look “not so good.” It is not that the numbers are easy to manipulate, but rather that it is easy to appear data-driven.
Back when I taught introductory statistics courses, the syllabus always included the topic of subjectivity and the impact it may have on how the results are conveyed. We looked at various mass-circulation articles, identifying the author and/or the sponsor of the piece, the potential biases and their potential impact on the conclusions. While the results may be perfectly valid in one sense, it is important to take an objective view in order to understand what is really going on. The same is true in business settings.
The assumptions are critical–especially the business assumptions, which may be called business contexts or caveats that may or may not be made explicit. Statistical assumptions are important for sure; however, in practice, the violations of contextual assumptions are far more impactful than the violations of the statistical assumptions–many methodologies are fairly robust against violations of statistical assumptions and can generally produce directionally correct results. One may choose only the results that support one’s cause and ignore others that are more important, or choose the methodology or display that allows one’s story to be told, or choose to analyze in such a way that the results would only justify one’s position. Selecting the data to fit one’s pre-formed story, rather than letting the data coalesce into a story, is the opposite of being data-driven–call it agenda-driven analytics.
Agenda-driven analytics will tell you only what you want to hear, not necessarily what you need to hear. And in this case, analytics will never have a chance to do what it can do; it will be an involuntary participant in the advancement of an agenda it doesn’t even support. In the meantime, others, including the customers, suffer from a lack of better treatment; depending on the context, the consequences may be quite grave.
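One way to see why choosing only the results that support one’s cause is so tempting is a bit of arithmetic, sketched below in Python. This is an illustration and not part of the original argument. It assumes that every analysis is an independent test at a 0.05 significance level and that none of the effects is real; the function name is hypothetical.

```python
def chance_of_a_favorable_fluke(num_analyses: int, alpha: float = 0.05) -> float:
    """Probability that at least one of several independent tests looks
    'significant' purely by chance, assuming no real effect exists."""
    if num_analyses < 0:
        raise ValueError("num_analyses must be non-negative")
    # Each test avoids a false positive with probability (1 - alpha);
    # all of them avoid one with probability (1 - alpha) ** num_analyses.
    return 1.0 - (1.0 - alpha) ** num_analyses


# Illustrative counts only: the more ways the data is sliced, the more
# likely it is that some slice appears to support the preferred story.
for k in (1, 5, 20):
    print(k, round(chance_of_a_favorable_fluke(k), 3))
```

Under these simplified assumptions, someone who runs enough analyses and reports only the favorable one will usually have something to show, which is why the full set of analyses and the business context behind them matter more than any single result.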
P.S. I should fully expect a flurry of hate mail from my esteemed statistical colleagues for saying that statistical assumptions are not very important!
*“Chapters from my Autobiography–XX,” North American Review no. 618, July 5, 1907.
It is said that some astounding proportion of BI and analytics efforts fail. Depending on the context, that number appears to range from 50% to 80%. Certainly, numerous debriefs have been conducted on what worked and what did not work; many have opined on the top reasons for failure. So, why does this continue to happen?
Take analytical pilots. (Here we refer to pilots whose main concern involves analyzing the data and not simply implementing a tool; the latter deserves a separate discussion.) Pilots are particularly important, because the resulting decisions shape the course of what is to come. At the risk of stating the obvious, organizations conduct them to do something with little to no precedent, and pilots are a financially prudent way to see if it works before investing in a larger-scale capability; starting small does provide an opportunity to work out the kinks, while also allowing organizations to plan better.
However, the fundamental reasons for the start-small approach deserve more careful thought in analytics. Specifically, we should ask whether the organization is ready for the consequences of the analysis results, and recognize that the transformation expected from the positive conclusion does not happen naturally. A pilot intended to prove the value of analytics is especially tricky, as the very need to prove may indicate that someone in the organization has not yet bought into the idea of the consequences–applying analysis results to make changes, and changes are uncomfortable. There is merit to convincing the unconvinced, but the degree to which the entire organization becomes convinced is a huge factor in whether there is a realistic future for analytics beyond the pilot.
That is, the organization must have the collective desire to be data-driven and have the next steps already defined, ready to accept change. The main goal of the analysis should be simply to prove the sufficiency of the business impact, with everything else already in place or ready to be executed immediately upon the completion of the analysis. Unfortunately, much non-financial planning and many decisions are put on hold until the results of the analysis are available; the situation is exacerbated by the unconvinced or the marginally convinced. We have seen pilot analyses executed, only to be followed by a lack of priority, a long time to define the next steps, and finally their demise.
But a data-driven culture does not deprioritize the conversion of data-driven efforts into business results. First, if for some reason a well-selected and well-executed analytical pilot ends up with less-than-favorable results, it has others in the wings waiting to benefit from the learning. Second and more importantly, for a pilot to be effective, those who will consume its results must be willing, able, and empowered to do so immediately: to accept and act on those results to change themselves. The real challenge with analytics is that, without the resulting operational or strategic change (i.e., non-analytical change), it has no business value. And pilots cannot succeed with no business value. A data-driven culture is all about establishing an ecosystem of consumption of analytics throughout the organization and less about acquiring tools and data scientists. Having experience and capabilities in analytics is not a prerequisite.
I am not ruling out the possibility that there exist organizations for whom analytics makes no sense whatsoever, but I have yet to come across one in my nearly two decades of looking at analytics and analytical practices. I have, however, seen plenty that were not ready to consume. A failure rate of 50% to 80% implies that only 20% to 50% of efforts succeed; I am willing to bet that analytics could have a sufficiently positive impact in well more than that share of the opportunities. I am also willing to bet that a substantial portion, if not the majority, of the failures never came close to implementing the non-analytical changes needed to understand the business value.
Are you going to be content with continuing the trend of failure, or are you going to challenge your business to transform?
It is not uncommon to hear business leaders say how predictive analytics is important and strategic. However, is predictive analytics really the Holy Grail of analytical maturity?
We can start by clarifying what predictive analytics is and where it resides in relation to the business objectives for leveraging analytics. We can slice the analytics space along the following three dimensions:
- Predictive vs. Explanatory: Is the primary objective to quantify the likelihood of a particular outcome, or explain a particular phenomenon or behavior?
- Exploratory vs. Confirmatory: Is the primary objective to discover something new that can help you form a hypothesis, or to confirm the hypothesis you have already formed?
- Strategic vs. Tactical: Is the goal to inform business strategy decisions or to inform and execute on a specific set of actions?
We can table the discussion on methodology—from the business perspective, the specific quantitative methodology, statistical or otherwise, is secondary. Predictive methodologies can be applied while the objectives of the analysis remain explanatory, and in practice this is rather common. We should also acknowledge that some combinations of the above do not really exist, at least theoretically, and the distinctions can get a little blurry sometimes.
The point is that predictive analytics is just one class of just one dimension that defines the business objectives for analytics. The concern is that a blind focus on the “predictive” could be boxing organizations into analytical activities that do not necessarily address the most impactful business needs.
Going back to the strategic importance of predictive analytics, I believe it is important to make the following distinction: that having access to the predictive analytics capability is certainly strategic to the business, but what predictive analytics accomplishes is almost always tactical. The results of predictive analytics (scores, alerts, etc.) are most commonly used to automate certain aspects of decision making, such as recommending the next movie to watch, rank-ordering or prioritizing customers to target, or making decisions on a large volume of credit card applications very rapidly.
Businesses must start with the business objectives, then leverage the right analytical approach for those objectives in order to realize the full potential of data-driven decision making. While predictive analytics capabilities do often indicate a level of analytical maturity, they are only one part of analytics maturity. Setting any specific type of analytics as the Holy Grail cheats the organization of the best impact analytics could have. And it would be a shame if business leaders became disillusioned with analytics because that specific type of analytics did not produce the aggregate business impact they were expecting.
Over the last year or so:
- A number of articles have already declared the death of Big Data.
- A question was posed on a site frequented by data science professionals whether businesses should embrace Big Data.
- A meme was immortalized: “Big data is like teenage sex: everyone talks about it, nobody really knows how to do it, everyone thinks everyone else is doing it, so everyone claims they are doing it” (Credit Dan Ariely of Duke University).
- Data science, whatever it means, continues to be one of the hottest professions.
While some have already written the eulogy for Big Data, others still struggle to grasp what it is. Many of the definitions are more conceptual than tangible, perhaps leading to equating Big Data with certain sets of technology items if only to put some boundaries around the Big Data concept. And perhaps it is this lack of tangibility that lends Big Data to be concurrently huggable, sexy, and dead.
Human nature likes things to be tangible. If we recall the introductory statistics course that everyone had to take, a sample size of 30 made it large, and a p-value of 0.05 made the results statistically significant; hopefully it has been pointed out that nothing magical happens at these thresholds. It is also important to remember that Big Data is not always unstructured, and structured data is not always small; the size and the form are two different things. That said, the size of data does eventually imply tangible technology impact, which is easier to talk about.
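The point that nothing magical happens at these thresholds can be checked with a few lines of Python. This is a minimal sketch and not part of the original post. It assumes a population standard deviation of 1 and a test statistic that follows a standard normal distribution; the function names are illustrative.

```python
import math


def standard_error(sigma: float, n: int) -> float:
    """Standard error of a sample mean, given population std dev sigma and sample size n."""
    return sigma / math.sqrt(n)


def two_sided_p_value(z: float) -> float:
    """Two-sided p-value for a z statistic under a standard normal null."""
    # P(|Z| > |z|) = erfc(|z| / sqrt(2)) for a standard normal Z.
    return math.erfc(abs(z) / math.sqrt(2))


# Precision improves gradually as n grows; there is no jump at n = 30.
for n in (28, 29, 30, 31, 32):
    print(n, round(standard_error(1.0, n), 4))

# Evidence weakens gradually as z shrinks; 0.05 is a convention, not a cliff.
for z in (2.00, 1.96, 1.95):
    print(z, round(two_sided_p_value(z), 4))
```

Moving from 29 to 30 observations, or from a p-value just above 0.05 to one just below it, changes the numbers only slightly. The thresholds are conveniences for discussion, much like the size labels attached to Big Data.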
What about the business impact? Business problems don’t care how big the data is that solves them. Data of any size is no good until someone makes some sense out of it and uses it to effect a positive change in the business. “Doing” Big Data does not directly lead to solved business problems, yet so much of the focus is still on the size and the form and less on why “doing it” is essential in the first place. The successes come from the ability to leverage the right data to solve a business problem and effect change; starting with the size or form of the data and not with the business problem is the proverbial hammer looking for a nail to hit.
So perhaps the question is whether businesses should embrace a data-driven culture. Most people would probably answer yes. Now the difficult part: this means a shift in business culture. Along with this culture comes the realization that size does not matter and that it’s what you do with it that counts.