Data Quality is the foundation for all data and systems activities within a corporation. Quality data is required to deliver correct information to decision-makers who will leverage it to gain a competitive advantage in the market. Assessing data quality across the enterprise or within specific business functions is critical. Organizations must identify key data elements and business rules, assess data for common defects, create rules or actions for fixing data, and create metrics to detect defects as they enter a system/database. Data is a critical corporate asset that gets synthesized into Information, which is the basis for Knowledge within your organization.
Data
- Facts about things, organized for analysis or used to reason or make decisions
- Raw material from which information is derived and is the basis for intelligent actions and decisions
Information
- Collections of usable facts or data
- Processed, stored, or transmitted data
- Data in context with precise definition and clear presentation
Knowledge
- Specific information about something: the sum or range of what has been discovered or learned
- Information known and in the proper context
- Value added to information by people who have experience and acumen to understand its potential
The culmination is corporate Wisdom: applying knowledge by using Information to create Value. Corporate Wisdom is therefore a function of a corporation's capacity to acquire and apply knowledge. This capacity, Corporate Intelligence, is predicated upon the initial Quality of Data Assets.
What are data assets? Data Assets are the data objects in an Enterprise that impact business functions. They may be segmented by business function, such as:
- Customer
- Sales
- Partner
- Bill of Material
- Assets
- Installed Base
- Agreements
- Entitlement
- Financial (GL, AP, AR, etc.)
- Billing
- HR
What is information quality? Information Quality is the state where data assets have the following attributes.
- Clear definition or meaning
- Correct values
- Understandable presentation format (as represented to a knowledge worker)
What’s inherently wrong with data?
- Large Volumes of Data – the amount of available information collected by companies has doubled or tripled since 2002 and 10-30 percent is of poor quality (inaccurate, inconsistent, poorly formatted, entered incorrectly, etc.)
- Data is Dynamic – data is constantly being updated by employees, customers and third parties.
- People are Myopic About Quality – data quality is not a prime consideration in many corporations since the cost of maintenance is high and the process is difficult and unattractive.
What are the key points of data errors?
- Initial Data Entry – errors (wrong values) entered by employees: typos, intentional errors, poor training of workers, poor templates, etc.
- Decay – data becomes inaccurate over time: address, telephone, contact, asset values, etc.
- Data Movement – poor ETL processes (excluding data that is mistakenly identified as inaccurate, being unable to mine data in its source structure, transforming data poorly, etc.) create data warehouses with more inaccurate information than the source.
- Data Use – data incorrectly applied to information objects such as spreadsheets, queries, reports, portals, etc.
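Decay is one error point that lends itself to a simple automated check. The Python sketch below flags records that have not been verified within a chosen window. The field names (`id`, `last_verified`), the function name, and the 365-day default are assumptions made for illustration only; they are not taken from any particular system.

```python
from datetime import date, timedelta

def stale_records(records, today, max_age_days=365):
    """Return the ids of records whose last verification is missing or too old.

    `records` is a list of dicts with hypothetical keys "id" and
    "last_verified" (a datetime.date or None).
    """
    cutoff = today - timedelta(days=max_age_days)
    return [
        r["id"]
        for r in records
        # A record that was never verified is treated as stale.
        if r.get("last_verified") is None or r["last_verified"] < cutoff
    ]

contacts = [
    {"id": 1, "last_verified": date(2023, 1, 10)},
    {"id": 2, "last_verified": date(2024, 5, 1)},
    {"id": 3, "last_verified": None},
]
flagged = stale_records(contacts, today=date(2024, 6, 1))
```

Passing `today` as an argument rather than reading the system clock keeps the check repeatable. The right window depends on the data element; addresses and contacts may need a shorter one than asset values.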
What are the common sources of data corruption?
1. Data entry by employees – employees enter errors into systems by mistake, or intentionally to save time
- Misspellings
- Transposition of numbers
- Incorrect or missing codes
- Data placed in the wrong fields
- Unrecognizable names
- Nicknames
- Abbreviations
2. Data entry by customers
- Customers enter errors into front-end systems
- Online customers intentionally enter erroneous data to protect their privacy
3. External Data
- Third-party data contains inconsistencies and errors
4. Changes to internal production systems
- Changes to source systems
- System errors
5. Data migration or conversion projects
- Data from acquisitions and mergers where business rules do not conform
- Data from many systems in disparate formats
- Fragmentation of data definitions and business rules
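Several of the entry defects listed above (incorrect or missing codes, data placed in the wrong fields) can be caught by simple business rules applied as a record enters a system. The Python sketch below illustrates the idea under stated assumptions: the field names, the list of valid country codes, and the digit-count thresholds are all hypothetical, and a real deployment would draw its rules from the organization's own key data elements.

```python
import re

# Hypothetical reference data; a real system would load its own code tables.
VALID_COUNTRY_CODES = {"US", "CA", "GB", "DE"}

def find_defects(record):
    """Return a list of (field, problem) pairs for one customer record (a dict)."""
    defects = []
    name = (record.get("name") or "").strip()
    phone = (record.get("phone") or "").strip()
    country = (record.get("country_code") or "").strip()

    if not name:
        defects.append(("name", "missing"))
    elif re.fullmatch(r"[\d\s()+.-]+", name):
        # Only digits and phone punctuation: probably data in the wrong field.
        defects.append(("name", "looks like a phone number"))

    digits = re.sub(r"\D", "", phone)
    if phone and not 7 <= len(digits) <= 15:
        defects.append(("phone", "unexpected digit count"))

    if not country:
        defects.append(("country_code", "missing"))
    elif country.upper() not in VALID_COUNTRY_CODES:
        defects.append(("country_code", "unknown code"))

    return defects

bad = find_defects({"name": "555-123-4567", "phone": "", "country_code": "XX"})
good = find_defects({"name": "Acme Corp", "phone": "+1 (201) 555-0100", "country_code": "us"})
```

Returning a list of problems, rather than rejecting the record outright, lets the same rules feed both an entry-time warning and a defect metric counted over an existing database.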
What are the consequences of data quality issues?
- Inability to uniquely identify entitled versus non-entitled equipment
- Incomplete or non-existent configuration data on entitled products
- Duplication and redundancy of customer and installed base data
- Inaccurate or ambiguous address and contact information related to customers
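Duplicate customer records often differ only in case, punctuation, or abbreviations. As a minimal sketch of how such duplicates might be surfaced, the Python example below groups names by a normalized match key. The abbreviation map and function names are illustrative assumptions; production matching usually needs richer rules (nicknames, addresses, fuzzy comparison) than this exact-key approach.

```python
import re
from collections import defaultdict

# Hypothetical abbreviation map, for illustration only.
ABBREVIATIONS = {
    "corp": "corporation",
    "inc": "incorporated",
    "ltd": "limited",
    "co": "company",
}

def match_key(name):
    """Lowercase, strip punctuation, collapse spaces, and expand abbreviations."""
    words = re.sub(r"[^a-z0-9\s]", " ", name.lower()).split()
    return " ".join(ABBREVIATIONS.get(w, w) for w in words)

def find_duplicates(names):
    """Return groups of names that share the same match key."""
    groups = defaultdict(list)
    for n in names:
        groups[match_key(n)].append(n)
    return [g for g in groups.values() if len(g) > 1]

customers = ["Acme Corp.", "ACME Corporation", "Globex Ltd", "Acme  corp"]
dupes = find_duplicates(customers)
```

Each returned group is a candidate for review, not an automatic merge: two records with the same key may still be distinct customers.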
What is the cost of poor data quality to the enterprise?
- Overlooked sales opportunities
- Lost maintenance revenue
- Free service for customers
- Delays in service
- Delayed contract renewals
- Incorrect maintenance charges
- Degraded spare part logistics
Mr. DeSiena is President of Consulting Services at Bardess Group, Ltd., a Management Consulting firm specializing in data revitalization, business process design, and information technology for services-related businesses. He is currently a board member of the Society for Information Management in New Jersey.