Data Should Just Be Available!
Danny Greefhorst / 10 February 2015
There is an increasing focus on data. Big Data is a concrete manifestation of this, but it is only the tip of the iceberg. In general, the amount of data is increasing, both within organizations and in the world as a whole. IT has been an important catalyst in the explosion of data; because processes and information flows are automated, data are increasingly born digital.
Data has not received enough attention in the past.
I am not just talking about data with extreme volume, variety, or velocity. I am talking about the data that organizations already have, but which they are not sufficiently aware of. Organizations are generally organized into silos, and data is often concealed within departments. The applications that manage these data are only accessible to a limited number of people. Sharing data across organizational boundaries is harder still.
The problem does not only apply to structured data. A lot of data is locked in documents. Just try to find something in the ever-expanding sea of digital records. Having a document management system is good, but in practice the compartmentalization of the organization is also visible in such a system. Each department has its own ‘site’, and even if you can find the document you need, you probably cannot access it because you do not have the proper authorizations.
We should, therefore, take a fresh look at data, both within the organization and across its boundaries. Let’s start by making clear what data is available inside and outside the organization. Based on that, we should have a dialogue about what the value of these datasets is and what opportunities they generate. Data is often valuable in places where it is not created. Data can help to streamline existing processes, but can also provide new insights.
Once it is clear which data is more broadly applicable, that data deserves and requires special care. Investments are needed to ensure the discoverability, understandability, availability, and reliability of the data. This means that responsibilities should be assigned, processes defined, and systems made available to disclose the data. When valuable data is available outside the organization, its use should be made possible and the necessary agreements with data providers should be made.
Explicitly exposing hitherto hidden data is good, but it also means playing catch-up. It would be much better if data were discoverable, understandable, available, and reliable by default. These ideas are also present in concepts such as open data and linked data. Organizations should, therefore, embrace these concepts and their implications. Instead of thinking in terms of processes, applications, and services, they should put the data themselves at the center. This would create a whole new world of opportunities.