Why do we need usable data?
- Data usability increases the degree to which data can be used for actionable analytics. It also covers easy-to-use applications that increase interaction with data. Together, these help create a common data language for understanding organizational performance. Data dictionaries and a metadata repository improve usability, and relevant training and access to datasets boost it further across the organization.
- Data usability applies to every function of an enterprise – IT, Products, Finance, HR, Supply Chain, Operations and Marketing. According to a CompTIA survey, 71 percent of firms that are average or lagging in leveraging data feel that their staff are moderately or significantly deficient in data management and analysis skills. Relevant training in data management and analysis may therefore boost data usability in the organization.
- Another facet of data usability is minimizing errors in, or misinterpretation of, the data. Consider an incident at TransAlta, a Canadian power generation firm, where a simple Excel error led the firm to buy hedging contracts at a higher price than it should have. The error cost the firm $24 million. Usable data helps prevent such errors in analysis.
The 5 key features of usable data
1. Accuracy and Precision
Accuracy is the likelihood that the data reflects the truth. For example, consider a database that contains the names, email addresses and postal addresses of doctors in New York. If this database is known to contain a number of errors, with some entries wrong, missing or obsolete, its accuracy is low and the data is considered to be of poor quality.
Precision is the depth of knowledge encoded by the data. Examples include the resolution of images, video and audio, or the degree of aggregation of statistics.
Integration is the degree to which the data is linked with other data sources and data objects. For a retailer, for example, sales transaction data should be linked with other sources such as promotions data, product placement, store layout, staffing and seller information. The synergy between multiple sources leads to more valuable and actionable insights.
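As a minimal sketch of what linking two sources can look like, the snippet below attaches promotion details to sales transactions through a shared product identifier. The field names (`sale_id`, `product_id`, `amount`), the sample values and the helper `link_sales_to_promotions` are all hypothetical and chosen only for illustration.

```python
from typing import Any, Dict, List, Optional

# Hypothetical sales transactions and promotions, keyed by an assumed product_id.
sales: List[Dict[str, Any]] = [
    {"sale_id": 1, "product_id": "P1", "amount": 20.0},
    {"sale_id": 2, "product_id": "P2", "amount": 15.0},
    {"sale_id": 3, "product_id": "P1", "amount": 22.0},
]
promotions: Dict[str, str] = {"P1": "10% off"}


def link_sales_to_promotions(
    sales: List[Dict[str, Any]], promotions: Dict[str, str]
) -> List[Dict[str, Any]]:
    """Return a copy of each sale with the matching promotion attached.

    Sales without a promotion are kept, with promotion set to None,
    so no transaction is lost when the sources are combined.
    """
    linked = []
    for sale in sales:
        promotion: Optional[str] = promotions.get(sale["product_id"])
        linked.append({**sale, "promotion": promotion})
    return linked


for row in link_sales_to_promotions(sales, promotions):
    print(row)
```

Keeping unmatched sales in the result is a deliberate choice here: it makes it possible to compare promoted and non-promoted transactions side by side.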
Completeness matters as well. A field can be accurate, precise and integrated with related information, but if it is populated in only a handful of records, its value decreases. For example, if a field is populated only five percent of the time, it becomes difficult to use in data manipulation or analysis.
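One simple way to quantify how often a field is populated is a fill rate: the share of records in which the field holds a non-empty value. The sketch below assumes records are plain dictionaries; the function name `fill_rate` and the sample records are hypothetical.

```python
from typing import Any, Dict, List


def fill_rate(records: List[Dict[str, Any]], field: str) -> float:
    """Return the fraction of records in which `field` has a usable value.

    A value counts as missing if the key is absent, the value is None,
    or the value is an empty or whitespace-only string.
    """
    if not records:
        return 0.0
    populated = 0
    for record in records:
        value = record.get(field)
        if value is not None and str(value).strip() != "":
            populated += 1
    return populated / len(records)


# Hypothetical sample: only one of four records has an email address.
doctors = [
    {"name": "A. Rao", "email": "rao@example.com"},
    {"name": "B. Chen", "email": ""},
    {"name": "C. Diaz", "email": None},
    {"name": "D. Lee"},
]

print(fill_rate(doctors, "name"))   # 1.0
print(fill_rate(doctors, "email"))  # 0.25
```

A threshold on this number (for instance, flagging fields below some agreed percentage) is a policy decision that depends on the analysis at hand.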
Data standardization is the process of bringing data into a common format that allows for collaborative research, large-scale analytics, and sharing of sophisticated tools and methodologies. For example, some basic standardization would fix issues like inconsistent capitalization, punctuation, acronyms, and values in the wrong fields.
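The kind of basic cleanup described above can be sketched in a few lines. The example below handles inconsistent capitalization, stray periods and extra whitespace, and restores a small set of acronyms. The acronym list and the function name `standardize_text` are assumptions made for illustration, and the approach is deliberately naive (for instance, it would turn "McDonald" into "Mcdonald").

```python
import re

# Hypothetical acronym map; a real one would come from the organization's data dictionary.
ACRONYMS = {"md": "MD", "ny": "NY", "nyc": "NYC"}


def standardize_text(raw: str) -> str:
    """Normalize spacing, punctuation, capitalization and known acronyms."""
    # Drop periods so that "m.d." and "md" are treated the same way.
    no_periods = raw.replace(".", "")
    # Collapse runs of whitespace and trim the ends.
    collapsed = re.sub(r"\s+", " ", no_periods).strip()
    words = []
    for word in collapsed.split(" "):
        core = word.rstrip(",")       # keep a trailing comma aside
        suffix = word[len(core):]
        fixed = ACRONYMS.get(core.lower(), core.capitalize())
        words.append(fixed + suffix)
    return " ".join(words)


print(standardize_text("  jOHN   smith,  m.d. "))  # John Smith, MD
print(standardize_text("new york, ny"))            # New York, NY
```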
Creating metadata is another important way to improve usability. Metadata is data that provides information about other data. It summarizes basic information about the data, which makes finding and working with particular instances of data easier. For example, a summary of the fields (column names) of a SQL table, their data types and their descriptions constitutes metadata.
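To make the SQL example concrete, the sketch below builds a small metadata summary for a table using Python's built-in `sqlite3` module and SQLite's `PRAGMA table_info`. The `doctors` table, the `DESCRIPTIONS` mapping and the helper `describe_table` are hypothetical; column descriptions are written by hand here because SQLite does not store them.

```python
import sqlite3
from typing import Any, Dict, List

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE doctors (id INTEGER PRIMARY KEY, name TEXT NOT NULL, email TEXT)"
)

# Hypothetical, hand-maintained descriptions for each column.
DESCRIPTIONS = {
    "id": "Internal identifier for the doctor",
    "name": "Full name as registered",
    "email": "Primary contact email, if known",
}


def describe_table(conn: sqlite3.Connection, table: str) -> List[Dict[str, Any]]:
    """Return one metadata entry per column: name, declared type, nullability, description.

    The table name is interpolated into the statement, so it must come
    from a trusted source, never from user input.
    """
    # Each row is (cid, name, type, notnull, dflt_value, pk).
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return [
        {
            "column": row[1],
            "type": row[2],
            "nullable": not bool(row[3]),
            "description": DESCRIPTIONS.get(row[1], ""),
        }
        for row in rows
    ]


for entry in describe_table(conn, "doctors"):
    print(entry)
```

A listing like this could serve as the starting point for a data dictionary entry for the table.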
Usable data meets the needs of different end-user audiences and is ready for the tasks those users need to accomplish. It has been cleaned and structured, can be readily processed by software, and is fully documented and ready for analysis and interpretation. Data usability is one of the fundamental steps in data governance for evangelizing a data-driven mindset in an organization.