The Practitioner's Guide to Data Quality Improvement
eBook - ePub

  1. 432 pages
  2. English

About this book

The Practitioner's Guide to Data Quality Improvement offers a comprehensive look at data quality for business and IT, encompassing people, process, and technology. It shares the fundamentals for understanding the impacts of poor data quality, and guides practitioners and managers alike in socializing, gaining sponsorship for, planning, and establishing a data quality program.
It demonstrates how to institute and run a data quality program, from first thoughts and justifications to maintenance and ongoing metrics. It includes an in-depth look at the use of data quality tools, including business case templates, and tools for analysis, reporting, and strategic planning.
This book is recommended for data management practitioners, including database analysts, information analysts, data administrators, data architects, enterprise architects, data warehouse engineers, and systems analysts, and their managers.
  • Offers a comprehensive look at data quality for business and IT, encompassing people, process, and technology.
  • Shows how to institute and run a data quality program, from first thoughts and justifications to maintenance and ongoing metrics.
  • Includes an in-depth look at the use of data quality tools, including business case templates, and tools for analysis, reporting, and strategic planning.

The Practitioner's Guide to Data Quality Improvement by David Loshin is available in PDF and/or ePUB format and is categorized under Computer Science & Data Processing.

Chapter 1 Business Impacts of Poor Data Quality
Chapter outline
  • 1.1 Information Value and Data Quality Improvement
  • 1.2 Business Expectations and Data Quality
  • 1.3 Qualifying Impacts
  • 1.4 Some Examples
  • 1.5 More on Impact Classification
  • 1.6 Business Impact Analysis
  • 1.7 Additional Impact Categories
  • 1.8 Impact Taxonomies and Iterative Refinement
  • 1.9 Summary: Translating Impact into Performance
Most organizations today depend on the use of data in two general ways. Standard business processes use data for executing transactions, as well as supporting operational activities. Business analysts review data captured as a result of day-to-day operations through reports and analysis engines as a way of identifying new opportunities for efficiency or growth. In other words, data is used to both run and improve the ways that organizations achieve their business objectives. If that is true, then there must be processes in place to ensure that data is of sufficient quality to meet the business needs. Therefore, any enterprise risk management program gains great value from incorporating processes for assessing, measuring, reporting, reacting to, and controlling the risks associated with poor data quality.
Flaws in any process are bound to introduce risks to successfully achieving the objectives that drive your organization's daily activities. If the flaws are introduced in a typical manufacturing process that takes raw input and generates a single output, the risks of significant impact might be mitigated by closely controlling the quality of the process, overseeing the activities from end to end, and making sure that any imperfections can be identified as early as possible. Information, however, is an asset that is generated through numerous processes, with multiple feeds of raw data that are combined, processed, and fed out to multiple customers both inside and outside your organization. Because data is of a much more dynamic nature, created and used across the different operational and analytical applications, there are additional challenges in establishing ways to assess the risks related to data failures as well as ways to monitor conformance to business user expectations.
This uncovers a deeper question: to what extent does the introduction of flawed data impact the way that your organization does business? While it is probably easy to point to specific examples of where unexpected data led to business problems, there is bound to be real evidence of hard impacts that can be directly associated with poor quality data. Anecdotes are strong motivators in that they raise awareness of data quality as an issue, but our intention is to develop a performance management framework that helps to identify, isolate, measure, and improve the value of data within the environment. The problem is that the magnitude and challenge of correlating business impacts with data failures appear to be too large to be able to manage – thus the reliance on anecdotes to justify an investment in good data management practices.
But we can compare the job of characterizing the impacts of poor data quality to eating an elephant: it seems pretty big, but if we can carve it down into small enough chunks, it can be done one bite at a time. To be able to communicate the value of data quality improvement, it is necessary to be able to characterize the loss of value that is attributable to poor data quality.
This requires some exploration into assembling the business case, namely:
  • Reviewing the types of risks relating to the use of information,
  • Considering ways to specify data quality expectations,
  • Developing processes and tools for clarifying what data quality means,
  • Defining data validity constraints,
  • Measuring data quality, and
  • Reporting and tracking data issues,
all contributing to performance management reporting using a data quality scorecard, to support the objectives of instituting data governance and data quality control.
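As a rough illustration of how the steps above might fit together, the following Python sketch defines a few validity constraints, measures how many records conform to each, and lists the violations for tracking. It is a minimal sketch under stated assumptions, not the book's method: the field names (`customer_id`, `amount`, `country`), the rules themselves, and the helper names `scorecard` and `issues` are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Record = Dict[str, object]
Rule = Callable[[Record], bool]

# Hypothetical validity constraints; names and thresholds are illustrative only.
RULES: Dict[str, Rule] = {
    "customer_id is populated": lambda r: bool(r.get("customer_id")),
    "amount is non-negative": lambda r: (
        isinstance(r.get("amount"), (int, float)) and r["amount"] >= 0
    ),
    "country is a 2-letter code": lambda r: (
        isinstance(r.get("country"), str) and len(r["country"]) == 2
    ),
}


def scorecard(records: List[Record], rules: Dict[str, Rule]) -> Dict[str, float]:
    """Return, for each rule, the fraction of records that satisfy it."""
    if not records:
        raise ValueError("cannot score an empty data set")
    return {
        name: sum(1 for r in records if rule(r)) / len(records)
        for name, rule in rules.items()
    }


def issues(records: List[Record], rules: Dict[str, Rule]) -> List[Tuple[int, str]]:
    """List (record index, rule name) for every violated rule, for issue tracking."""
    return [
        (i, name)
        for i, r in enumerate(records)
        for name, rule in rules.items()
        if not rule(r)
    ]


# Made-up sample data, used only to exercise the functions above.
SAMPLE: List[Record] = [
    {"customer_id": "C1", "amount": 10.0, "country": "US"},
    {"customer_id": "", "amount": 5.0, "country": "USA"},
    {"customer_id": "C3", "amount": -2.0, "country": "DE"},
    {"customer_id": "C4", "amount": 7.5, "country": "FR"},
]

if __name__ == "__main__":
    for rule_name, score in scorecard(SAMPLE, RULES).items():
        print(f"{rule_name}: {score:.0%}")
    for index, rule_name in issues(SAMPLE, RULES):
        print(f"record {index} violates: {rule_name}")
```

Keeping each constraint as a named rule means the same definitions drive both the conformance scores and the issue list, which is one simple way to keep a scorecard and an issue log consistent.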
Many business issues can be tied, usually directly, to a situation where data quality is below user expectations. Given some basic understanding of data use, information value, and the ways that information value degrades when data does not meet quality expectations, we can explore different categories of business impacts attributable to poor information quality, and discuss ways to facilitate identification and classification of cost impacts related to poor data quality. In this chapter we look at the types of risks that are attributable to poor data quality as well as an approach to correlating business impacts to data flaws.

1.1 Information Value and Data Quality Improvement

Is information an organizational asset? Certainly, if all a company does is accumulate and store data, there is some cost associated with the ongoing management of that data – the costs of storage, maintenance, office space, support staff, and so on – and this could show up on the balance sheet as a liability. Though it is unlikely that any corporation lists its data as a line item as either an asset or a liability on its balance sheet, there is no doubt that, because of a significant dependence on data to both run and improve the business, senior managers at most organizations certainly rely on their data as much as any other asset.
We can view data as an asset, since data can be used to provide benefits to the company, it is controlled by the organization, it is the result of a sequence of transactions (either internal data creation or external data acquisition), it incurs costs for acquisition and management, and it is used to create value. Data is not treated as an asset, though; for example, there is no depreciation schedule for purchased data.
On the other hand, the dependence of automated operational systems on data for processing clearly shows how data is used to create value. Transaction systems that manage the daily operations enable “business as usual.” And when analytic systems are used for reporting, performance management, and discovery of new business opportunities, the value of that information is shown yet again. But it can be a challenge to assign a direct monetary value to any specific data value. For example, while a computing system may expect to see a complete record to be processed, a transaction may still be complete even in the absence of some of the data elements. Does this imply that those data elements have no value? Of course not, otherwise there would not have been an expectation for those elements to be populated in the first place.
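The distinction drawn here, between a transaction that can complete and a record that is fully populated, can be sketched in a few lines of Python. This is an illustrative sketch only: the split into `REQUIRED` and `EXPECTED` elements, the field names, and the sample order are assumptions made for the example, not taken from the book.

```python
from typing import Dict, Optional

# Hypothetical element lists: REQUIRED elements must be present for the
# transaction to complete; EXPECTED elements are all those the system
# expects to see populated.
REQUIRED = ("order_id", "amount")
EXPECTED = ("order_id", "amount", "email", "postal_code")


def is_populated(value: Optional[object]) -> bool:
    """Treat None and blank strings as missing."""
    return value is not None and str(value).strip() != ""


def can_process(record: Dict[str, object]) -> bool:
    """True when every required element is populated."""
    return all(is_populated(record.get(field)) for field in REQUIRED)


def completeness(record: Dict[str, object]) -> float:
    """Fraction of expected elements that are populated."""
    filled = sum(1 for field in EXPECTED if is_populated(record.get(field)))
    return filled / len(EXPECTED)


# Made-up record: the order can be processed, yet half of the expected
# elements are missing.
order = {"order_id": "A-100", "amount": 25.0, "email": "", "postal_code": None}

if __name__ == "__main__":
    print(can_process(order))   # True
    print(completeness(order))  # 0.5
```

The order goes through, but the missing email and postal code still represent lost value, for example for later customer contact or regional analysis, which is why a completeness measure is tracked separately from transaction success.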
There are different ways of looking at information value. The simplest approaches consider the cost of acquisition (i.e., the data is worth what we paid for it) or its market value (i.e., what someone is willing to pay for it). But in an environment where data is created, stored, processed, exchanged, shared, aggregated, and reused, perhaps the best approach for understanding information value is its utility – the expected value to be derived from the information.
That value can grow as a function of different aspects of the business, ranging from strictly operational to strategic. Sales transactions are necessary to complete the sales process, and therefore part of your sales revenues are related to the data used to process the transaction. Daily performa...

Table of contents

  1. Cover
  2. Title Page
  3. Copyright
  4. Table of Contents
  5. Foreword
  6. Preface
  7. Acknowledgments
  8. About the Author
  9. 1: Business Impacts of Poor Data Quality
  10. 2: The Organizational Data Quality Program
  11. 3: Data Quality Maturity
  12. 4: Enterprise Initiative Integration
  13. 5: Developing A Business Case and A Data Quality Road Map
  14. 6: Metrics and Performance Improvement
  15. 7: Data Governance
  16. 8: Dimensions of Data Quality
  17. 9: Data Requirements Analysis
  18. 10: Metadata and Data Standards
  19. 11: Data Quality Assessment
  20. 12: Remediation and Improvement Planning
  21. 13: Data Quality Service Level Agreements
  22. 14: Data Profiling
  23. 15: Parsing and Standardization
  24. 16: Entity Identity Resolution
  25. 17: Inspection, Monitoring, Auditing, and Tracking
  26. 18: Data Enhancement
  27. 19: Master Data Management and Data Quality
  28. 20: Bringing It All Together
  29. Index