Just One Word: Data
“I just want to say one word to you. Just one word… Are you listening? … Plastics. There’s a great future in plastics.”
Mr. McGuire in the 1967 movie The Graduate.
The Mr. McGuires of the world are no longer advising newly-minted graduates to get into plastics. But perhaps they should be recommending data. In today’s digital world, data is the key, the ticket, and the Holy Grail all rolled into one.
I do not just mean it’s growing in importance as a profession, although it is a great field to get into, and I’m thrilled that my sons Jake and Josh are pursuing careers in data and technology. Data is where the dollars are when it comes to company budgets. Every few years there is another report showing that business intelligence (BI) is at or near the top of the chief information officer’s (CIO) list of priorities.
Enterprises today are driven by data, or, to be more precise, information that is gleaned from data. It sheds light on what is unknown, it reduces uncertainty, and it turns decision-making from an art to a science.
But whether it’s Big Data or just plain old data, it requires a lot of work before it is actually something useful. You would not want to eat a cup of flour, but baked into a cake with butter, eggs, and sugar for the right amount of time at the right temperature it is transformed into something delicious. Likewise, raw data is unpalatable to the business person who needs it to make decisions. It is inconsistent, incomplete, outdated, unformatted, and riddled with errors. Raw data needs integration, design, modeling, architecting, and other work before it can be transformed into consumable information.
This is where you need data integration to unify and massage the data, data warehousing to store and stage it, and BI to present it to decision-makers in an understandable way. It can be a long and complicated process, but there is a path; there are guidelines and best practices. As with many things that are hard to do, there are promised shortcuts and “silver bullets” that you need to learn to recognize before they trip you up.
It will take a lot more than just reading this book to make your project a success, but my hope is that it will help set you on the right path.
Welcome to the Data Deluge
In the business world, knowledge is not just power. It is the lifeblood of a thriving enterprise. Knowledge comes from information, and that, in turn, comes from data. It is up to a BI team to gather and manage the data to empower the company’s business groups with the information they need to gain knowledge: knowledge that helps them make informed decisions about every step the company takes.
Enterprises need this information to understand their operations, customers, competitors, suppliers, partners, employees, and stockholders. They need to learn about what is happening in the business, analyze their operations, react to internal and external pressures, and make decisions that will help them manage costs, grow revenues, and increase sales and profits. Forrester Research sums it up perfectly: “Data is the raw material of everything firms do, but too many have been treating it like waste material, something to deal with, something to report on, something that grows like bacteria in a petri dish. No more! Some say that data is the new oil, but we think that comparing data to oil is too limiting. Data is the new sun: it’s limitless and touches everything firms do. Data must flow fast and rich for your organization to serve customers better than your competitors can. Firms must invest heavily in building a next-generation customer data management capability to grow revenue and profits in the age of the customer. Data is an asset that even CFOs will realize should have a line on the balance sheet right alongside property, plant, and equipment” [1].
It can be a problem, however, when there is more data than an enterprise can handle. Enterprises collect massive amounts of data every day, internally and externally, as they interact with customers, partners, and suppliers. They research and track information on their competitors and the marketplace. They put tracking codes on their websites so they can learn exactly how many visitors they get and where those visitors came from. They store and track information required by government regulations and industry initiatives. Now there is the Internet of Things (IoT), with sensors embedded in physical objects such as pacemakers, thermostats, and dog collars, where they collect data. It is a deluge of data (Figure 1.1).
Data Volume, Variety, and Velocity
Enterprises are not only accumulating data in ever-increasing volumes; the variety and velocity of that data are also increasing. Although the emerging “Big Data” databases can cause an enterprise’s ability to gather data to explode, the volume, velocity, and variety are all expanding no matter how “big” or “small” the data is.
Volume: According to many experts, 90% of the data in the world today was created in the last two years alone. When you hear that statistic you might think that it is coming from all the chatter on social media, but data is being generated by all manner of activities. For just one example, think about the emergence of radio frequency identification (RFID) to track products from manufacturing to purchase. It is a huge category of data that simply did not exist before. Although not all of the data gathered is significant for an enterprise, it still leaves a massive amount of data with which to deal.
Velocity: Much of the data now is time sensitive, and there is greater pressure to decrease the time between when it is captured and when it is used for reporting. We now depend on the speed of some of this data. It is extremely helpful to receive an immediate notification from your bank, for example, when a fraudulent transaction is detected, enabling you to cancel your credit card immediately. Businesses across industry sectors are using current data when interacting with their customers, prospects, suppliers, partners, employees, and other stakeholders.
Variety: The sources of data continue to expand. Receiving data from disparate sources further complicates things. Unstructured data, such as audio, video, and social media, and semistructured data like XML and RSS feeds must be handled differently from traditional structured data. The CIO of the past thought phones were just for talking, not something that collected data. He also thought Twitter was something that birds did. Now that an enterprise can collect data from tweets about its products, how does it handle that data and then what does it do with it? Also, what does it do with the invaluable data that business people create in spreadsheets and Microsoft Word documents and use in decision-making? Formerly, CIOs just had to worry about collecting and analyzing data from back office applications, but now their data can come from people, machines, processes, and applications spr...