Competing with High Quality Data
eBook - ePub

Concepts, Tools, and Techniques for Building a Successful Approach to Data Quality

About this book

Create a competitive advantage with data quality

Data is rapidly becoming the powerhouse of industry, but low-quality data can actually put a company at a disadvantage. To be used effectively, data must accurately reflect the real-world scenario it represents, and it must be in a form that is usable and accessible. Quality data involves asking the right questions, targeting the correct parameters, and having an effective internal management, organization, and access system. It must be relevant, complete, and correct, while falling in line with pervasive regulatory oversight programs.

Competing with High Quality Data: Concepts, Tools and Techniques for Building a Successful Approach to Data Quality takes a holistic approach to improving data quality, from collection to usage. Author Rajesh Jugulum is globally recognized as a major voice in the data quality arena, with a high-level background in international corporate finance. In the book, Jugulum provides a roadmap to data quality innovation, covering topics such as:

  • The four-phase approach to data quality control
  • Methodology that produces data sets for different aspects of a business
  • Streamlined data quality assessment and issue resolution
  • A structured, systematic, disciplined approach to effective data gathering

The book also contains real-world case studies to illustrate how companies across a broad range of sectors have employed data quality systems, whether or not they succeeded, and what lessons were learned. High-quality data increases value throughout the information supply chain, and the benefits extend to the client, employee, and shareholder. Competing with High Quality Data: Concepts, Tools and Techniques for Building a Successful Approach to Data Quality provides the information and guidance necessary to formulate and activate an effective data quality plan today.

Information

Publisher: Wiley
Year: 2014
Print ISBN: 9781118342329
eBook ISBN: 9781118416495
Edition: 1
Subtopic: Operations

Chapter 1
The Importance of Data Quality

1.0 Introduction

In this introductory chapter, we discuss the importance of data quality (DQ), understanding DQ implications, and the requirements for managing the DQ function. This chapter also sets the stage for the discussions in the other chapters of this book that focus on the building and execution of the DQ program. At the end, this chapter provides a guide to this book, with descriptions of the chapters and how they interrelate.

1.1 Understanding the Implications of Data Quality

Dr. Genichi Taguchi, who was a world-renowned quality engineering expert from Japan, emphasized and established the relationship between poor quality and overall loss. Dr. Taguchi (1987) used a quality loss function (QLF) to measure the loss associated with quality characteristics or parameters. The QLF describes the losses that a system suffers from an adjustable characteristic. According to the QLF, the loss increases as the characteristic y (such as thickness or strength) gets further from the target value (m). In other words, there is a loss associated if the quality characteristic diverges from the target. Taguchi regards this loss as a loss to society, and somebody must pay for this loss. The results of such losses include system breakdowns, company failures, company bankruptcies, and so forth. In this context, everything is considered part of society (customers, organizations, government, etc.).
Figure 1.1 shows how the loss, given by L(y), increases as y deviates from the target by Δ0 on either side. When y is equal to m, the loss is zero, which is its minimum. The equation for the loss function can be expressed as follows:
$$L(y) = k(y - m)^2 \tag{1.1}$$
Figure 1.1 Quality Loss Function (QLF)
where k is a factor that is expressed in dollars, based on direct costs, indirect costs, warranty costs, reputational costs, loss due to lost customers, and costs associated with rework and rejection. There are prescribed ways to determine the value of k.
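As a minimal sketch of Equation 1.1, the Python function below evaluates the quadratic loss for a few values of y. The function name quality_loss and the numbers used (a target of 10.0 and a cost factor of 2.0 dollars) are hypothetical, chosen only to illustrate the shape of the curve; they do not come from the book.

```python
def quality_loss(y: float, m: float, k: float) -> float:
    """Quadratic quality loss L(y) = k * (y - m)^2, as in Equation 1.1."""
    return k * (y - m) ** 2


# Hypothetical values: target m = 10.0, cost factor k = 2.0 (dollars).
for y in (9.0, 10.0, 11.5):
    loss = quality_loss(y, m=10.0, k=2.0)
    print(f"y = {y}: loss = ${loss:.2f}")
```

The loss is zero at the target and grows with the square of the deviation, so a deviation of 1.5 costs more than twice as much as a deviation of 1.0.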
The loss function is usually not symmetrical—sometimes it is steep on one side or on both sides. Deming (1960) says that the loss function need not be exact and that it is difficult to obtain the exact function. As most cost calculations are based on estimations or predictions, an approximate function is sufficient—that is, close approximation is good enough.
The concept of the loss function aptly applies in the DQ context, especially when we are measuring data quality associated with various data elements such as customer IDs, social security numbers, and account balances. Usually, the data elements are prioritized based on certain criteria, and the quality levels for data elements are measured in terms of percentages (of accuracy, completeness, etc.). The prioritized data elements are referred to as critical data elements (CDEs).
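As one illustration of measuring a quality level as a percentage, the sketch below computes completeness for a single data element. The function name completeness_pct, the rule that blank strings count as missing, and the sample customer IDs are all assumptions made for this example, not definitions from the book.

```python
from typing import Optional, Sequence


def completeness_pct(values: Sequence[Optional[str]]) -> float:
    """Percentage of entries that are populated (not None and not blank)."""
    if not values:
        raise ValueError("no values to assess")
    populated = sum(1 for v in values if v is not None and v.strip() != "")
    return 100.0 * populated / len(values)


# Hypothetical sample of a customer ID element: two of five entries are missing.
customer_ids = ["C001", "C002", None, "C004", "  "]
print(f"Completeness: {completeness_pct(customer_ids):.1f}%")
```

Other dimensions, such as accuracy, would need a trusted reference source to compare against, so they are not sketched here.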
If the quality levels associated with these CDEs are not at the desired levels, then there is a greater chance of making wrong decisions, which might have adverse impacts on organizations. The adverse impacts may be in the form of losses, as previously described. Since the data quality levels are a “higher-the-better” type of characteristic (because we want to increase the percent levels), only half of Figure 1.1 is applicable when measuring loss due to poor data quality. Figure 1.2 is a better representation of this situation, showing how the loss due to variance from the target by Δ0 increases when the quality levels are lower than m and is given by L(y). In this book, the target value is also referred to as the business specification or threshold.
Figure 1.2 Loss Function for Data Quality Levels (Higher-the-Better Type of Characteristic)
As shown in Figure 1.2, the loss will be at its minimum when y attains a level equal to m. The loss will remain at the same level even if the quality levels are greater than m. Therefore, it may not be necessary to improve the CDE quality levels beyond m, as this improvement will not have any impact on the loss.
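The one-sided behavior described here can be sketched in Python. This sketch assumes that the loss below the threshold keeps the quadratic form of the QLF and that the minimum loss is zero; the text does not give an explicit formula for Figure 1.2. The function name dq_loss, the threshold of 98 percent, and the cost factor of 100 dollars are hypothetical.

```python
def dq_loss(y: float, m: float, k: float) -> float:
    """One-sided loss for a higher-the-better characteristic.

    Assumption: quadratic loss when the quality level y is below the
    threshold m, and zero loss when y is at or above m.
    """
    shortfall = max(m - y, 0.0)
    return k * shortfall ** 2


# Hypothetical values: threshold m = 98 (percent), cost factor k = 100 (dollars).
for level in (95.0, 98.0, 99.5):
    print(f"quality level {level}%: loss = ${dq_loss(level, m=98.0, k=100.0):.2f}")
```

Under these assumptions, raising a CDE's quality level from 98 to 99.5 percent leaves the loss unchanged, which matches the point that improvement beyond m has no impact on the loss.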
Losses due to poor quality can take a variety of forms (English, 2009), such as denying students entry to colleges, customer loan denial, incorrect prescription of medicines, crashing submarines, and inaccurate nutrition labeling on food products. In the financial industry context, consider a situation where a customer is denied a loan on the basis of a bad credit history because the loan application was processed using the wrong social security number. This is a good example of a data quality issue, and we can imagine how such issues can compound, resulting in huge losses to the organizations involved. The Institute of International Finance and McKinsey & Company (2011) cite inadequate information technology (IT) and data architecture to support the management of financial risk as one of the key factors in the global financial crisis that began in 2007. This highlights the importance of data quality and leads us to conclude that the effect of poor data quality on the financial crisis cannot be ignored. During this crisis, many banks, investment companies, and insurance companies lost billions of dollars, causing some to go bankrupt. The impacts of these events were significant and included economic recession, millions of foreclosures, lost jobs, depletion of retirement funds, and loss of confidence in the industry and in the government.
All the aforementioned impacts can be classified into two categories, as described in Taguchi (1987): losses due to the functional variability of the process and losses due to harmful side effects. Figure 1.3 shows how all the costs in these categories add up.
Figure 1.3 Sources of Societal Losses
In this section, we discussed the importance of data quality and the implications of bad data. It is clear that the impact of bad data is quite significant and that it is important to manage key data resources effectively to minimize overall loss. For this reason, there is a need to establish a dedicated data management function that is responsible for ensuring high data quality levels. Section 1.2 briefly describes the establishment of such a function and its various associated roles.

1.2 The Data Management Function

In some organizations, ...

Table of contents

  1. Cover
  2. Title
  3. Copyright
  4. Dedication
  5. Foreword
  6. Prelude
  7. Preface
  8. Acknowledgments
  9. Chapter 1: The Importance of Data Quality
  10. Section I: Building a Data Quality Program
  11. Section II: Executing a Data Quality Program
  12. Appendix A
  13. Appendix B
  14. Appendix C
  15. Index of Terms and Symbols
  16. References
  17. Index
  18. End User License Agreement
