Since Big Data became a boardroom buzzword some years ago (thanks to McKinsey's report "Big Data: The next frontier for innovation, competition, and productivity"), many organizations have started Big Data initiatives in the hope of realizing its full potential. Over time, many companies have launched pilot projects to address some of their most important business issues. Often, these initial steps have not shown immediate results, for a number of reasons that have been widely discussed online. From my experience, one of the main reasons behind the failure of Big Data projects is poor Data Access and Data Quality.
This is mostly true for "non-digital native" companies, and stems from the fact that such organizations never considered that the data their systems generated could be of strategic value. In other words, data was considered an "exhaust": a side effect or mere byproduct of running the business. While some of this data was put to use, for example in descriptive business intelligence (i.e., reporting on what has happened), data was never treated as a strategic asset. Normally, organizations take meticulous care of their strategic assets and manage them explicitly, keeping a close eye on them at all times.
Figure 2: Gartner infographic about CEOs on Data as an Asset
When companies start their data journey, they often don't realize that their data has not been carefully collected or maintained. It might be incomplete, duplicated, hidden, incorrect or even missing. When Data Scientists first get their hands on the data, they have many questions, and they may find insights that do not make sense from a business perspective, perhaps even leading to wrong conclusions. Big Data Analytics and Machine Learning are no exception to the rule: "garbage in, garbage out".
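To make these quality problems concrete, here is a minimal sketch of the kind of first-pass check a Data Scientist might run before trusting a dataset. It only counts blank or missing values and duplicated keys. The function name `profile_records`, the field names and the sample records are all hypothetical, chosen purely for illustration; a real pilot would need checks tailored to its own data.

```python
from collections import Counter


def profile_records(records, required_fields, key_field):
    """Return simple data-quality counts for a list of dict records.

    Illustrative helper only (hypothetical name and signature).
    """
    # Count records where each required field is absent, None, or blank.
    missing = {
        field: sum(
            1
            for r in records
            if r.get(field) is None or str(r.get(field)).strip() == ""
        )
        for field in required_fields
    }
    # Count surplus records that share a key with another record.
    key_counts = Counter(
        r.get(key_field) for r in records if r.get(key_field) is not None
    )
    duplicates = sum(n - 1 for n in key_counts.values() if n > 1)
    return {
        "total": len(records),
        "missing": missing,
        "duplicate_keys": duplicates,
    }


# Hypothetical sample data: one blank email, one missing email,
# one missing country, and one repeated customer_id.
customers = [
    {"customer_id": 1, "email": "ana@example.com", "country": "ES"},
    {"customer_id": 2, "email": "", "country": "UK"},
    {"customer_id": 2, "email": "bob@example.com", "country": None},
    {"customer_id": 3, "email": None, "country": "DE"},
]

report = profile_records(customers, ["email", "country"], "customer_id")
print(report)
```

Even a simple report like this, run early in a pilot, gives the business an idea of how much "garbage" might be going in before anyone interprets what comes out.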
For all of these reasons, it is important for organizations to have the right expectations when starting their data journey. We are not saying that a large upfront investment needs to go into data asset management, but that organizations must be aware of the potential pitfalls in their Big Data pilots. Ideally, business leaders should move on two fronts in parallel: starting to create value through pilots, while also starting on data management, so that when they are ready to scale Data Science projects, their data is in good shape and treated as a first-class asset.