Data Governance
eBook - ePub

Data Governance

Governing data for sustainable business

Alison Holt, Benoit Aubert, David Sutton, Frédéric Gelissen, Alisdair McKenzie, Geoff Clarke, Rose Pan, Ming Li, Rohan Light, Beenish Saeed, Nathalie Marcellis-Warin, Abdelaziz Khadraoui

  1. 150 pages
  2. English

About This Book

Every week brings news of an organisation that has distributed data that shouldn't have been shared, or has lost out to a competitor who is using data to drive business in an innovative way. Data is fundamentally changing the nature of businesses and organisations and the mechanisms for delivering products and services. This book is a practical guide to developing strategy and policy for data governance, in line with the developing ISO 38505 governance of data standards and best practice frameworks. It will assist an organisation wanting to become more of a data driven business by explaining how to assess the value, risks and constraints associated with collecting, using and distributing data.

PART 1
THE REQUIREMENT TO GOVERN DATA

I asked Siri ‘What is data?’ and Siri’s response was, ‘Interesting question.’ Yes, Siri, it is an interesting question.
This first part of the book will provide some background into data and why it holds such an important role in our lives. We’ll look at the benefits of collecting and sharing data and why governing data is an essential task for all organisations.
WHAT IS DATA?
We tend to think of data in electronic form, but humans were collecting data thousands of years before computers. Although this book will focus on the governance of data in electronic form, we will start off by looking at the history of the collection of data. The governance practices applied to the collection of data in physical form (from clay tiles through knotted ropes to paper) will shed light on our approach to the governance of data held in electronic form. Although our storage media have changed, the issues faced by our forebears will be very familiar.
History also reveals the significant advantage that can be gained from holding the right data of the right quality at the right time, where ‘right’ means fit for purpose. Whether you are looking for an advantage over your competitors, searching for a cure for a disease or trying to find patterns in physical phenomena or events, the ‘right’ data is your friend. Putting in place a governance framework helps to ensure that this ‘right’ data is in the right place at the right time.
DATA GOVERNANCE OR DATA MANAGEMENT?
We often confuse the terms data governance and data management. This isn’t surprising, given that in some major countries there is no distinct term for data governance, and it ends up being ‘translated’ as data management. The governance of data, or data governance, covers three tasks: evaluating what needs to be done, providing direction to make it happen, and monitoring to check that the desired outcomes have been delivered. Data governance can be extended to include the application and operationalisation of the governance of data, and the setting of policy to ensure that the desired outcomes will be met through management outputs, the establishment of controls, and controlling mechanisms to ensure that the governance requirements are met.
Data management is the administration of data and includes, among other activities, the setting up of databases, the transfer of data and the archiving of data.

1 DATA COLLECTION THROUGH THE AGES

Alison Holt

For thousands of years humans have collected, stored, reported on, made decisions with, distributed and disposed of data. Amazingly, some of these data sets are still accessible today. They reveal information about ancient civilisations that would otherwise remain a mystery. Their original purpose, however, was not to provide a journal entry or historical record, but to inform the decision makers of the day.
CENSUS DATA
The Babylonians were collecting census data over 4000 years ago to work out how much food was needed to feed the population. Their census records took the form of clay tiles, and several of these tiles are held in the British Museum. Around 1500 years later, the Egyptians and the Chinese started to collect census data. The Egyptians used their data to plan the workforce needed for the building of pyramids and for the assignment of land after the annual flooding of the Nile. The Chinese census of 2 AD collected data on a staggering 57.67 million people in 12.36 million households. Meanwhile, over in Europe, the Romans were collecting census data every five years to estimate taxes due, through a sort of early rating system. The Roman method of census collection was unusually disruptive, especially for heavily pregnant mothers married to out-of-towners. It involved every man and his family returning to his place of birth to be counted.
Skipping another thousand years, we come to the production of the Domesday Book in England, a detailed survey of land holding, wealth and population across the country to enable determination of tax, rents and military service obligations of the populace, from the lowly peasants through to the barons. Five hundred years later, we find the Incas collecting census data by knotting ropes made from alpaca or llama hair.
Finally, from the 1800s we have a number of countries around the world collecting census data on a regular basis, not just to determine the tax liability of their citizens, but also to assist with the building of houses, schools and hospitals, and eventually to inform programmes for the eradication of disease.
Governance lessons from census data
Census data is generally well governed and provides some interesting insights into the successful governance of data, and the need for influential and determined data custodians. UK census data cannot be released for 100 years, and the Census Registrars General through the ages have had to fight off requests for access. In the early 1900s, a request from the sanitation authorities for access to personal information in the 1891 census was denied on the grounds that the records contained personally identifiable information (PII) and that confidentiality had been promised at the point of data collection. I suspect the requesters would have argued that many lives could have been saved, or at least improved, through the release of the data. More recently, there were online petitions in the UK for the release of the 1921 census data to help family historians trace their ancestors. The Census Act of 1920 made the release of this information before 2021 not just ill-advised, but illegal.
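A release rule like the 100-year embargo described above is the kind of policy that a governance framework can express as a simple, testable check. The following is only a minimal sketch, assuming the embargo is counted in whole calendar years from the census year. The names `EMBARGO_YEARS` and `is_releasable` are illustrative and are not taken from any census legislation or system.

```python
# Assumption: the embargo is counted in whole calendar years from the census year.
EMBARGO_YEARS = 100


def is_releasable(census_year: int, request_year: int,
                  embargo_years: int = EMBARGO_YEARS) -> bool:
    """Return True once the embargo period has fully elapsed."""
    return request_year - census_year >= embargo_years


# An early-1900s request for 1891 data (1905 is a hypothetical request year).
print(is_releasable(1891, 1905))  # False
# The 1921 census stays closed until 100 years have passed.
print(is_releasable(1921, 2020))  # False
print(is_releasable(1921, 2021))  # True
```

Writing the rule down this way makes the custodian's decision repeatable: the answer depends only on the dates, not on who is asking or how persuasive the request is.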
Retention of data has been a trying subject for the Census Registrars General, who have several times struggled for the preservation of records, and for adequate fire-proof and water-proof storage facilities. There have also been arguments over the appropriateness of some of the data collected, and issues with the time taken to process and analyse census data. The 1911 Census of England and Wales collected information on the fertility of women in marriage, to help understand issues with the falling birth rate amidst the need for a growing workforce to support industrial expansion, but it was 1923 before a final report could be published on the data. The time delay in receiving this information must have been a great source of frustration for the initiator of the report. Timeliness is an important factor that needs to be taken into account when considering data quality.
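Timeliness can be made measurable by comparing the delay between collection and publication against an agreed limit. The sketch below uses the 1911 and 1923 dates from the example above. The function names and the two-year limit are hypothetical, chosen only to illustrate the idea.

```python
def reporting_lag_years(collected_year: int, published_year: int) -> int:
    """Years between collecting the data and publishing the results."""
    if published_year < collected_year:
        raise ValueError("publication cannot precede collection")
    return published_year - collected_year


def is_timely(collected_year: int, published_year: int,
              max_lag_years: int) -> bool:
    """True if the results arrived within the agreed maximum lag."""
    return reporting_lag_years(collected_year, published_year) <= max_lag_years


# 1911 fertility data, final report in 1923.
print(reporting_lag_years(1911, 1923))             # 12
# Hypothetical requirement: results within two years of collection.
print(is_timely(1911, 1923, max_lag_years=2))      # False
```

The limit itself is a governance decision: it should come from whoever needs the data to make a decision, not from whoever processes it.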
The automation of processing and collection of census data has been a slow process, starting with the use of punched cards in the US 1890 Census to speed up the analysis of census responses. Back in 1890, processing using punched cards was calculated to be 10 times faster than the previous manual process. Since then, electronic analysis of census data has blossomed, but putting confidence in the electronic collection of census data has been a matter of debate for many years. In preparation for the 2018 New Zealand Census of Population and Dwellings, Statistics New Zealand (StatsNZ) modernised the process for data collection; according to their website:
We designed the 2018 Census forms primarily for online completion. Our aim was for 70 percent of respondents to complete their census online. The online forms were designed to work on a range of devices, from personal computers to smartphones, and were easier to complete. The version for smartphones was a first for the census, intended to encourage young people (aged 15–24 years) to take part. The 2018 Census forms were available online and in paper, in English and te reo Māori.
(Stats.govt.nz 2019)
Of course, collecting data electronically brings a completely new set of requirements for a data governance framework for the census. These types of framework will be the focus of this book.
Census data has always been collected by people who care passionately about data and understand the importance of collecting data in a consistent way. My StatsNZ friends and colleagues involved in the census in New Zealand have taken on the role of guardians protecting a precious asset, and rightly so. Census development and data collection is the gold standard among population surveys. Being supported by legislation helps to focus the respondents on completing the questions, and picking one night in every three or five years to hold the census gives the survey a sense of awe and mystery, akin to election night and Christmas Eve. There are few other sectors where the importance of collecting consistent, quality data and the value of that data are understood so well, but health is one of them.
HEALTH DATA
Without data, how can we determine the difference between an isolated incident and an epidemic? How do we know how effective vaccination, chemotherapy, specific surgical procedures and so on are unless we measure outcomes accurately across a statistically significant sample of the population? If we can’t determine what causes the spread of infection, how can we fight an epidemic? And, once we’ve worked out how infection is spread, how do we contact the people who we think could be vulnerable?
There have been a number of examples in the last few years where data scientists have worked alongside health professionals to protect populations. COVID-19 aside, the Ebola outbreak in 2014 is an example of this – a disease that initially had no antidote, no vaccination to provide protection. It was essential to quickly understand how the disease was spread, and to identify potential carriers and who they had been in contact with. Data was the key in unlocking the facts that would give health professionals and government officials an understanding of how the outbreak could be stemmed. One of the Ebola stories that stuck in my mind was the health care worker who had been caring for an infected patient and who then took a commercial flight across the US. The following day she developed a fever that resulted in her being moved into isolation, tested for and then treated for Ebola. Working out who was on the plane with her, and therefore potentially at risk, was straightforward. Working out how she got infected, and who else she had been in contact with along the way, was trickier. The incident resulted in a need to rethink the governance of data relating to disease.
Let’s look at vaccination data: the Gates Foundation has done the most amazing job of vaccinating against polio, with the aim of eradicating the disease. They work by collecting data and carrying out analysis. How can we know what is really killing children in the poorest areas of the world unless we can collect, analyse and interpret data? In New Zealand we are seeing the re-emergence of diseases that had been ‘eradicated’. How can we address the root cause of this issue without reliable data to inform us?
Data ‘demonstrating’ a link between autism and vaccination has put mothers off having their babies vaccinated. We’ve had recent measles and whooping cough epidemics. How ironic that babies should be suffering in first world countries, having been withdrawn from vaccination programmes, while third world babies are happily surviving through recently established vaccination programmes.
Similarly, data ‘demonstrating’ that a tsunami defence system would protect a length of the Japanese coastline led to deaths in the major earthquake of March 2011. People in the affluent areas, who thought they were fully protected by the tsunami defence system, had insufficient time to run for safety when the defence system was overwhelmed. The people of the poorer coastal towns that didn’t have a defence system in place ran for the hills as soon as they knew that the tsunami was coming, and survived.
Governance lessons from health data
Health data has traditionally been well governed, in the sense that, since health records were first collected, all stakeholders have understood the value of having data made accessible to them, the privacy risk of sharing data and the constraints set by legislation, local health boards and policy.
TRADITIONAL DATA-HEAVY INDUSTRIES
Certain industries are (and always have been) heavily dependent on data, and the survival of individual companies and the reputation of individual government agencies within those industries have fully depended on their ability to safely collect and store data. Examples are police forces, airlines, schools, warehouses, prisons, supermarkets and companies and government agencies involved in defence and military applications. These organisations traditionally collected information on paper and spreadsheets, but now I can check into a flight online and order my weekly supermarket shop without leaving the house; and the New Zealand police force, for example, carry iPads and iPhones on the street and work with real time information.
Back in the 1990s I ran a project with a priv...
