Solution Partner Views
Big Data: Restructuring the Information Factory
It might be an alert about a hurricane that unexpectedly flashes on a smartphone's screen. Or it might be a travel website that notices searches made a week earlier for flights to Chicago and now suggests bargain airfares. Both are examples of how big data has been shaking up digital commerce and communications with large-scale changes for consumers.
In terms of companies and industries, it's often said that big data can lead to similarly monumental breakthroughs, revealing the kind of insights that sometimes yield solutions. So it should be a relatively straightforward step to ramp up current data capture and handling techniques for a brighter and “bigger” future, or shouldn't it?
Where does big data come from, and how can it fit into an established, but potentially outdated, governance program? Bigger may certainly be better, although it may not always be easier, at least at the start. Unlike traditional data used by companies, big data is often created outside the internal “information factory” of an organization.
Many times it may have been gathered through untraditional sources or purchased from third-party providers. That external data likely was created for a different purpose, and its providers may not have maintained the same rigor or formal data requirements needed by the acquiring organization. It then becomes incumbent on the new owner to ensure that appropriate data management practices are applied to new data.
So it should follow that, for an organization to reap the greatest value from any big data initiative, it's essential to implement some basic governance and establish sound policies for this new classification of data. According to a “Big Data in Insurance” study by Strategy Meets Action in 2014, the number of insurers investing in big data projects has more than doubled in the last year, from 10 percent in 2013 to 25 percent today. Even so, within most companies, many data governance processes and procedures were implemented long before the advent of big data.
Oftentimes, only data that made it into a centralized repository fell under the auspices of traditional governance. Or, in some cases, only data used for internal management reporting and analytics or for external regulatory reporting was subject to such controls. Today, with data being generally recognized as a corporate asset and a distinct competitive advantage, it may be time to take another look at governance to consider how big data resources should best be handled. To begin, it's helpful to define some of the basic vocabulary of the data world.
Big data is generally described as having the following characteristics: variety, volume, velocity, veracity, and value. Variety can be attributed to the structure of the data. Although traditional data tends to be highly structured, big data can be unstructured or even multistructured. Unstructured data includes such sources as text and notes or pictures; multistructured data often has some fragment of structure within the unstructured format. This type can include web data or telemetric (“machine”) data.
The second “V” in the series, volume, refers to the amount of data present. Next, velocity is the speed at which the data comes into or flows out of an organization, as well as the rate at which the information changes. Types of big data subject to the velocity component include information from the web, social media, and telemetry, which can generate millions of bytes in additional data within a few seconds. Veracity refers to the truthfulness or reliability of the data within the context in which it was created. And finally, value is the worth of the data in terms of business insight and new discoveries.
Given those definitions, there are some practices that should be considered when beginning to navigate sets of big data. From the start, companies should acknowledge that a governance process for big data will likely be different from that used for traditional sources of information. Updates and revisions to business and information technology strategies must regard big data as a distinct asset. Doing so will help ensure this classification is included in the strategic data plans within an organization.
It's also helpful to create and update examples or “use cases” for the different types of big data that an organization intends to leverage. Include in those scenarios the business enhancements or improvements the organization might make in the future, and don't forget to assign a business value to those enhancements, taking into consideration the ease (or difficulty) of retrieving and using this data. Those steps will help prioritize initiatives and speed the acquisition of relevant big data.
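For readers who want to see how such a prioritization might look in practice, here is a minimal sketch in Python. It is only an illustration, not a prescribed method: the 1-to-5 scales, the multiplication used to combine business value with ease of use, and the three sample use cases are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class UseCase:
    name: str
    business_value: int  # hypothetical scale: 1 (low) to 5 (high)
    ease_of_use: int     # hypothetical scale: 1 (hard to retrieve and use) to 5 (easy)


def priority(use_case: UseCase) -> int:
    """Combine value and ease into one score (an assumed, simple weighting)."""
    return use_case.business_value * use_case.ease_of_use


def prioritize(use_cases: list[UseCase]) -> list[UseCase]:
    """Return the use cases ordered from highest to lowest priority score."""
    return sorted(use_cases, key=priority, reverse=True)


# Illustrative entries only; the scores are invented for the example.
use_cases = [
    UseCase("social media sentiment", business_value=2, ease_of_use=3),
    UseCase("telemetric pricing", business_value=5, ease_of_use=2),
    UseCase("clickstream analysis", business_value=4, ease_of_use=4),
]

ranked = [uc.name for uc in prioritize(use_cases)]
print(ranked)
```

An organization would replace the scoring rule with whatever weighting its business and technology leaders agree on; the point is simply that value and difficulty are recorded side by side for every use case.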
It's critical to understand that the same rigor employed in traditional data management will likely not be appropriate for certain types of big data. Data quality requirements, for example, will probably become less stringent, and the standard “fit for use” paradigm may no longer apply. Since much of big data is currently used to discover patterns or additional insights into consumer behavior, its precision is probably not as critical as the accuracy required of financial data. Many data governance programs assign different levels of standards or controls over data depending on its business use.
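One way to picture controls that vary by business use is a small lookup of tolerances, sketched below in Python. The tier names, the threshold values, and the "share of missing values" check are hypothetical choices for illustration; a real governance program would define its own tiers and measures.

```python
# Assumed tiers and tolerances, for illustration only.
CONTROLS_BY_USE = {
    "financial_reporting": {"max_missing_ratio": 0.0},
    "discovery_analytics": {"max_missing_ratio": 0.2},
}


def passes_quality_check(business_use: str, records: list[dict], field: str) -> bool:
    """Check whether the share of records missing `field` is within the
    tolerance assigned to the given business use."""
    if not records:
        return False  # nothing to assess, so treat as failing
    limit = CONTROLS_BY_USE[business_use]["max_missing_ratio"]
    missing = sum(1 for r in records if r.get(field) in (None, ""))
    return missing / len(records) <= limit


# Ten sample records, one of which lacks a value (invented data).
sample = [{"amount": 100} for _ in range(9)] + [{"amount": None}]

ok_for_discovery = passes_quality_check("discovery_analytics", sample, "amount")
ok_for_finance = passes_quality_check("financial_reporting", sample, "amount")
print(ok_for_discovery, ok_for_finance)
```

In this sketch the same data set is acceptable for pattern discovery yet rejected for financial reporting, which mirrors the idea that the required rigor depends on how the data will be used.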
Leaders within organizations should think beyond the current information architecture. Conventional data warehouses may not be suitable for big data projects. New technologies and business intelligence tools may be needed to perform discovery analytics and gain insights, but human oversight should never be ignored.
To that end, it becomes critical to assign stewards to curate the data and to appoint custodians to oversee different types of big data that may be acquired. Following this model, clickstream or website data would have a different steward than telemetric or machine data. Stewards and owners should have a clear understanding of how the data was acquired (whether in-house capture or purchase) and what the initial purpose of this data should be.
Projects involving big data can bring great insights and value to an organization. To ensure that data is accessible, consider how it can be distributed to business units without having to move or extract the data for each project. Without a doubt, this is an exciting time to be working with data and analytics. It can be rewarding for an organization to launch big data practices while continuing to manage standard data. Just think big—but start small.