Leveraging information is a skillful act, not the output of a data product.
Originally posted on Medium November 30, 2019
For data professionals it is the best of times, it is the worst of times. Interest in incorporating intelligent information, statistics, and science into every aspect of humanity has never been greater — from complex business decisions and cutting-edge products down to our day-to-day task management, data integration is seemingly everywhere. In Homo Deus, Yuval Harari makes a compelling argument that “dataism” may even be the next form of religion — and when you look at how our societal obsession with the stuff has evolved, he may be on to something.

It seems that in the eyes of the greater business universe, any and every application of data to any and all things is, in itself, justification of the effort to do so. In a FOMO-fueled rush, companies are pouring resources into data-related technology and headcount faster than you can say machine learning. But just as buying an expensive gym membership will not give you the body of Dwayne Johnson, heavy investment in a technology stack alone will not create a culture of data-driven decision making.

For an organization to capitalize on the opportunities that data-centric decision support can afford, change needs to begin with the business users. Smart organizations rally around these changes and evolve, while not-so-smart organizations meet the trials of a shifting decision process with ill-advised choices that yield disappointing results. If the successful execution of a data initiative requires all of:
- significant resources
- cultural maturity
- organizational intelligence
it begs the question: “_Is my company too dumb to be data-driven?_”
Where Most Data Initiatives Go Wrong
To understand what separates the winners from the losers in the game of becoming data-driven, it helps to first understand why so many data initiatives go wrong. When building out an organizational data strategy, most businesses usually create a roadmap that resembles this progression:
- Collect the data from different sources into a Data Warehouse
- Funnel the data into a business intelligence tool
- Take advantage of insights
The first step seems easy enough for technical implementation teams. They will select a target Data Warehouse or Data Lake along with a method for extract-loading (EL) the data, and might even do some data profiling and modeling.
The second step (loading the BI tool) is a bit more opaque. Business users will have a wide range of requests for dashboards, models, and charts; some will want every imaginable data point, while others will at best want exact duplications of the manual reports they are generating. The project will progress, passing milestone-sounding objectives such as “the Facebook data is done” along the way. Finally, the project will reach the MVP release! The organization will wait with bated breath for those insights to start rolling in.
And wait. Those insights will be here any day now…
But somehow, business-as-usual will continue despite the massive investment in data technology and the so-called “success” of the project. Dashboards will go unused, teams will continue to rely on shared spreadsheets and collections of disjointed 3rd party tools, and executives will keep making decisions in the dark. This happens because being data-driven isn’t about Machine Learning or Artificial Intelligence, Data Warehouses or BI Dashboards. Being data-driven is about business users learning to ask good questions, which is something no Business Intelligence tool or data product can deliver.
In the scenario above, the technical components (data warehouse, ETL process, and BI visualizations) were likely correctly implemented. But being able to analyze business trends and draw accurate conclusions from metrics is not part of the standard software engineer toolkit (nor should it need to be); as such, technical implementation teams will often build exactly what is requested by business users while making an assumption that “stakeholders know best what they need.” A commonly held falsehood is that providing data and visualizations to business users will, in itself, result in data-driven decision making. When implementing a data initiative, it is critical to understand that being a subject matter expert does not mean a person will know how to leverage data regarding that subject. Just as an expert sculptor may not have the first clue how to operate a 3D printer, a seasoned product manager may have no clue how to apply a business intelligence dashboard to drive feature improvements. Business users given access to metrics and organizational knowledge for the first time may not understand what questions they should be asking, and they will be unlikely to know which actions to take based on the answers they receive. Understanding long-term trends and statistical significance of variations requires a specific, and often absent, skillset.
Meanwhile, business users will start to show their frustration. They were promised insights that would ruthlessly slash overhead and unlock hidden fortunes of opportunity, but all the business intelligence tool seems to deliver is a lot of complicated charts and graphs that don’t really tell them anything. What are they supposed to do with a pile of squiggly lines on a graph, or the box-and-line drawings of a candlestick chart? Where are the business revelations — the red flashing warning box on the screen instructing them to increase keyword spend, or close an ill-performing warehouse, or change tense in content headlines to boost readership? Aren’t machine learning and artificial intelligence supposed to find all these hidden patterns and neatly present them to be acted upon?

The belief that data products will output actionable direction via “magical” algorithms is common among business users new to data-related technologies; between data product salespeople and media sensationalism, it is easy to understand how we have such gross misunderstandings about the capabilities of our data tools. The fact of the matter is that data tools do not deliver business answers; they deliver information. Turning that information into answers requires user knowledge and skill. To put it bluntly, the idea that analytics tools can autonomously ingest raw data and output actionable insights is science fiction more appropriate for an episode of Person of Interest than a steering committee meeting. Quick! Which one tells you how to run your business, and which one monitors hyperspace?
These assumptions come together to create a painful cycle of frustration and failure. The technical implementers can’t understand why business users never seem to derive value from the work they deliver, and business users can’t understand why the technical implementers keep delivering work that isn’t actionable. Still, gaining nothing from an expensive data initiative is not actually the worst possible outcome; much more damaging is when business users arrive at and apply false insights.
False Insights: M.Python Holdings LLC.
M.Python Holdings LLC is our imaginary real estate development firm, and in this case study you, the reader, play the role of business intelligence implementation engineer. Let’s say the Chief People Officer submits a ticket for a change to the Employees dashboard, adding “_employee weight over the average turnover rate for their respective work location_” to the reporting tabs. Before you question whether the change is feasible (“do we even have employee weight data?”), you flag the ticket based on the Why Factor: how could this request possibly deliver value? You reach out to the CPO for clarification, and the conversation goes as follows:
You: “Hello Mr. Idle (the CPO), can you tell me about the decisions you are going to make based on the output of this request? Understanding those decisions will help me ensure that the changes I make will fulfill your needs.”

CPO: “Sure thing! We are going to use this data to hunt down any witches that might work for us.”

You: “Umm… please elaborate.”

CPO: “Well, Mr. Cleese (the CFO) didn’t show up to the board meeting last week, but we did find a newt in his office. So naturally we assumed a witch turned him into a newt. We voted to find and remove all witches in the company immediately, and to be data-driven about it. So we figure, witches burn and so does wood, right? And of course wood floats, and ducks also float. So if the data says we have any employees that weigh as much as a duck, they are witches!”
False insights, roll-your-own statistical formulas, and metrics that are heavily influenced by confirmation bias are all extraordinarily damaging to an organization’s decision-making process. Furthermore, when the resulting business debacles come to a head, these tainted deliverables often leave the data and implementation teams cast as scapegoats.
Gauges & Levers
Data-driven decision making can be boiled down to two factors: what you can measure (the gauges), and what you can do about it (the levers). The first step in asking smart questions comes from a detailed assessment of what the levers are; these are things like spending channels in advertising, technical investment decisions in a SaaS company, or wording choices for a content creator. Once a business user knows what levers they have, the next step is determining what gauges they need. This is where things get tricky. Gauges come primarily in two flavors: lag (or descriptive) and lead (or predictive). A lag indicator is generally the thing you care about (like monthly revenue), but it can only be measured in hindsight. A lead indicator has a predictive association with the lag indicator — site visits, unit sales, or page traffic behavior patterns may be lead indicators for a monthly sales lag metric. Lag measures are scoreboards that tell you if your decision making was successful. But lag measures are not inherently useful for data-driven decision making; that is what you need lead indicators for. Identifying these lead indicators is at the core of asking good questions. Business users who have a strong grasp of how manipulating each lever will impact the leading gauges will be able to steer towards the lag measure goals — and that is data-driven decision making.
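The relationship between a candidate lead indicator and a lag measure can be vetted with basic statistics. Below is a minimal Python sketch (the visit and revenue figures are entirely hypothetical) that correlates a lead series against the lag series one period later; a lead indicator worth trusting should correlate more strongly at a shift than it does in the same period.

```python
def pearson(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

def lead_correlation(lead, lag, shift):
    """Correlate the lead series against the lag series `shift` periods later."""
    return pearson(lead[:-shift] if shift else lead, lag[shift:])

# Hypothetical series: visits this period loosely drive revenue next period.
visits  = [100, 120, 90, 150, 130, 170, 160, 200]
revenue = [ 11,  10, 12,   9,  15,  13,  17,  16]  # tracks visits, one period behind

same_period = pearson(visits, revenue)              # weak relationship
one_behind  = lead_correlation(visits, revenue, 1)  # strong relationship
```

With these made-up numbers, the shifted correlation is far stronger than the same-period one, which is exactly the pattern that makes “site visits” a plausible lead gauge for the “monthly revenue” lag measure.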
Weather vs Climate
This video of Neil deGrasse Tyson explaining the difference between weather and climate is pure gold.
science is so much better with dogs
What Neil presents quite eloquently is an example of statistical significance. This is to say that a warm week in January or a cold snap in August (here in Philly) does not mean much when calculating a climate trend. When business users determine the gauges they will need to support the levers they have, it is critical that the indicators they select are actually indicators in the statistical sense. For example, a buyer at a retail store who used the lead indicator of “Today’s Temperature” to drive a stock decision and only order winter coats for the coming year would be obviously foolish. But what if the buyer uses the average temperature over the last year? Or last year’s highest and lowest temperature dates to divide the seasons? Or the average temperature change day-to-day of every day in the last decade? Adding a little math can give a feeling of credibility to creative (yet unsound) data leveraging methods; a bad gauge can often get you into more trouble than no gauge at all.
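The weather-versus-climate point can be made concrete with a toy simulation. The sketch below (synthetic numbers, not real weather data; the warming rate and noise level are invented for illustration) generates a decade of daily temperatures with a small underlying trend buried in large day-to-day noise, then shows why a single warm day is meaningless while averages over many readings are not.

```python
import random

random.seed(42)  # deterministic synthetic data

YEARS, DAYS = 10, 365
WARMING_PER_YEAR = 0.03  # hypothetical slow trend, degrees C per year

# Simulated daily temperatures: a tiny trend plus large daily noise.
daily = [[15 + WARMING_PER_YEAR * y + random.gauss(0, 8) for _ in range(DAYS)]
         for y in range(YEARS)]

annual_means = [sum(year) / DAYS for year in daily]

# One unseasonably warm day in the final year...
warm_day = max(daily[-1])

# ...versus the spread of ordinary daily readings that year:
mean_last = annual_means[-1]
sd_last = (sum((t - mean_last) ** 2 for t in daily[-1]) / DAYS) ** 0.5
```

The warm day sits many degrees above the annual mean purely from daily noise, while the annual means themselves barely move from year to year; only by averaging over hundreds of readings does the slow signal become separable from the noise. A single data point is weather; the aggregate is climate, and the same logic applies to a spike in any business metric.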
Determining Organizational Readiness
Determining how ready a company is to become data-driven is really about determining what level of data application skills are present across the organization. This determination can be done before spending a single dollar on data warehouse/business intelligence technology and will set realistic expectations around the potential return on investment a data initiative will yield.
Lag indicators are important (remember, these are the eventual measures of success), but in the context of actively driving decisions with data they are not much use. For influencing decisions, the focus needs to be on lead measures. With that goal in mind, business users are tasked with providing a brief data strategy outline for their vertical. The outline consists of four questions per dataset:
- What result are they influencing with this dataset?
- What do they expect the dataset to look like when it arrives?
- How will they extract answers from this dataset?
- What levers will they move to action on the results?
This can be as simple as a spreadsheet with estimated column headers and a few lines of clear documentation.
A basic example of a business user action plan
In the above answer I used more statistics than I might expect from most business users; if the plan instead read “calculate the ROI rate for each channel and figure out where it starts to drop,” that would still be valuable insight, one indicating that this team will likely need help from an analyst, data scientist, or demand planner to define a correct model.
A business user that is new to data-driven decision making may argue that they can’t possibly know what data to expect or how it will be used, without first having access to the data in a business intelligence tool. Try to imagine the same logic in a different context; let’s say a factory manager requests a new $100k machine for an assembly line. If the manager is unable to describe what the machine actually does or articulate how the machine will be used to make a profit, but insists that, once installed, the machine will prove to be a great investment… would this sound ludicrous? Most data initiatives cost far more than the $100k mystery machine, and yet this happens every day.
How Smart Organizations Evolve to Data-Driven
With this collection of potential gauges, levers and results it becomes possible to determine where an organization is best (and worst) prepared to incorporate data into decision making. Smart organizations will use this information to decide which tactics are best suited to their current state (being data-driven about their quest to become data-driven, if you will). These smart tactics often include:
- Establishing clear expectations within the organization. The very first step to becoming data-driven is to cement in no uncertain terms what that means within the business. This must explicitly call out that organizational decision-makers will be given the great power of information access, and that this power comes coupled with great responsibility; decision-makers will be held accountable for learning the skills to intelligently request, interpret, and act on business information using scientifically proven methods, and for resetting the bar for intelligent decision making across the org. This change needs to be communicated clearly and comprehensively, as it redefines behavioral expectations for leaders and decision-makers.
- Leveraging existing data expertise. Investigating the current state of an organization can identify teams and team members that are already highly skilled at extracting insights from information. These “anchor point” teams and team members can help establish a high-water mark that other teams must rise to, and can often work cross-functionally to help them do so.
- Centralizing the data initiative. Internal competition can be a major roadblock to any strategy for becoming data-driven. For an initiative to succeed, it must have a top-down endorsement as the “single source of truth” within an organization. This breaks down business silos, allows insights to propagate between verticals, and ensures that both lag and lead measures are conformed across the organization.
- Focusing the organization’s limited resources where data will have the greatest impact. If the majority of the business needs help identifying and modeling lead measures, and the organization has few data scientists or analysts, a viable solution is to start with teams where the return is greatest and maintain laser-focus on those areas. This means not trying to introduce business intelligence tools across the entire organization, but iterating on the next-most-valuable application as resources become available. Data proficiency is cumulative, and as business users gain data literacy they will become more self-sufficient.
- Encouraging business users to learn how to interpret and calculate simple statistics. For an organization to be truly data-driven, every team member needs to be able to consume and correctly apply data products. Business users must grow beyond the reactionary “movement of the needle” dashboards and learn to interpret rich statistical metrics.
- Considering the option to abandon, or at least defer, becoming data-driven. Contrary to popular belief, not all organizations need to become data-driven immediately to be successful. For some, pouring resources into a data initiative may be a waste of time and energy that could be better allocated elsewhere. Smart organizations do not exclude the possibility that the data-driven decision may be to not become data-driven at all.
How Not-so-Smart Organizations Fail
- Feeding the buzzword flame. When defining an organization’s goals for a data initiative, top-down messaging can often resemble clickbait from the bottom of a business website. Leaning into the pop-culture obsession with magic algorithms and all-powerful machine learning technology is not just noise; the verbiage sets up the organization for failure and disappointment. Smart organizations set realistic expectations for what a data initiative will (and will not) make possible, while foolish organizations propagate science fiction.
- Focusing on the lag indicators. Using KPI dashboards to inform decision making is like using the end score of a football game to decide which plays you should have called. Foolish organizations will circle those final numbers all month while continuing to guess at which actions might actually influence them.
- Choosing quantity over quality. A common (and horribly unsuccessful) strategy when launching a data-driven initiative is to try and please everyone by getting as many low-quality, near-raw-state datasets out to the business user population as possible. The idea is that any data is better than no data, and future improvements can be focused on accuracy and accessibility. This is a surefire way to litter the decision-making landscape with false conclusions and conflicting truths while undermining stakeholder confidence in your data ecosystem.
- Putting legacy and seniority ahead of science. The impact of office politics on a data initiative rollout often differs starkly between smart and foolish organizations. As the company moves to adopt a single source of data and hold decision-makers accountable for applying analytics to their processes, there are inevitably holdouts. Some teams resist changing along with the organization because they see it as unnecessary effort; others may quietly refuse out of hubris, believing they have nothing to gain from applied insights or statistics. Then there are the teams that reject a centralized data initiative so they can continue to control their own metrics (and, of course, add a friendly slant to the data in the process). Foolish organizations make exception after exception, allowing decision making to devolve into a battle over which business logic is correct: the centralized source of truth, or the “great spreadsheet” that has been trusted for years? If an organization is to become data-driven, it must take a stand on which data it believes in — and commit to that decision without exception.
A particularly challenging situation is when the holdout happens to be a member of senior leadership. Nothing deflates an organizational rally to apply lead measures that inform actionable decisions like a C-suite executive insisting on a statistically inaccurate metric, fixating on a lag measure, or dragging up untrusted metrics from a source external to the data-driven initiative. An organization can rarely behave more intelligently than the leadership that drives it, and in these situations, a foolish outcome is hard to avoid.
Like so many business initiatives, transforming an organization to leverage data-driven decision making is really about transforming the behavior and skills of its people. Companies that understand this and invest accordingly will find that data creates a world of new business opportunities; companies that focus on technology, buzzwords and science fiction will find data to be another failed investment.