Data Mining for Managers

How to Use Data (Big and Small) to Solve Business Challenges

R. Boire

About This Book

Big Data is a growing business trend, but there is little advice available on how to use it practically. Written by a data mining expert with over 30 years of experience, this book uses case studies to help marketers, brand managers, and IT professionals understand how to capture and measure data for marketing purposes.

Information

Year
2014
ISBN
9781137406194
Subtopic
Marketing
Chapter 1
Introduction
THE CHALLENGE IN WRITING A BOOK ON DATA MINING IS TO differentiate its content from the plethora of other books, articles, and seminars on the same topic. The explosive growth of data has certainly fuelled the need for more discussion, yet minimal content has actually been devoted to the data itself. The focus of this book is on data and its use in shifting the key levers of your business, and ultimately on understanding how this drives ROI. Data mining is highly popular now because most businesses understand the value of information and of using it to make more profitable decisions. In most organizations today, data mining either is or will soon be a key business process. Since data mining is relatively new, knowledge and understanding of it are critical for its successful implementation. Certainly, consulting has grown in this field, and many people profess to be experts in data mining. Companies must realize that this is a new field and that it is difficult to find practitioners with extensive experience. As with any other discipline, the key to becoming knowledgeable and acquiring expertise is to put ideas into practice and observe the results. Although much of this practical knowledge has been acquired in the context of direct marketing, the approach and processes of data mining are similar in other areas. The key to attaining data mining excellence is in how one deals with the data. Data mining is about data and how we derive knowledge from this information. This is the essence of data mining, as you will realize in the course of reading this book.
The Digital Impact
According to experienced practitioners, one of their greatest challenges today is optimizing the use of data mining, given the explosion of new software and technology currently available. The other significant challenge facing practitioners is the use of data mining in the web environment. For example, in the marketing sector, data mining analyses can be conducted from the moment that a campaign is launched. Access to this type of information has already created the demand for tools and technology that will increase both the volume and the speed of analyses. In the past, analyses took six to eight weeks to deliver useful information. With web and digital data, we can gather information instantaneously. This increased access to data creates a learning environment in which businesses can derive meaningful insights more quickly and thereby make better decisions.
Services
The field of data mining is still in its infancy, and it will be very interesting to observe the impact it will have on business. Data mining’s reliance on technology suggests that the industry is quickly evolving in terms of available tools and software. This development will also place tremendous demands on practitioners. There is a growing demand for data mining practitioners, a demand that is not met in the current market and is not even being addressed by major universities.
Art and Science
Although the technology is becoming more sophisticated and complex in terms of providing additional targeting capabilities, there will always be an element of art in building data mining solutions. Understanding the data without any cogent business understanding of the results will generate conclusions and recommendations that can be very misleading.
For instance, through a data mining analysis, one retailer discovered that the sales of beer and diapers were very highly correlated. Does that really mean that customers who buy beer are most likely to also purchase diapers? Upon further investigation, it was found that these results were really due to the fact that beer and diapers were randomly placed next to each other in the store without any prior knowledge of purchase behavior. As it turned out, young fathers were shopping for diapers late at night and decided to pick up some beer along with the diapers. The important lesson here is that further investigation was warranted to truly uncover the “insight” delivered by the raw numbers. In this case, the data mining analysis was skewed simply by product distribution in the store and did not reflect consumers’ intentions.
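The kind of figure behind a finding like this can be sketched in a few lines. The example below is a minimal illustration, not the retailer's actual analysis: the baskets are invented, and "lift" (how much more often two products appear together than they would if purchases were independent) is one common association measure assumed here, not one named in the text.

```python
# Hypothetical shopping baskets, invented purely for illustration.
baskets = [
    {"beer", "diapers"},
    {"beer", "diapers", "chips"},
    {"beer", "diapers"},
    {"milk", "bread"},
    {"milk"},
    {"bread", "chips"},
    {"beer"},
    {"diapers", "milk"},
    {"bread"},
    {"milk", "chips"},
]


def support(items, baskets):
    """Share of baskets that contain every product in `items`."""
    items = set(items)
    return sum(items <= basket for basket in baskets) / len(baskets)


def lift(a, b, baskets):
    """Co-occurrence of a and b relative to what independence would predict."""
    return support({a, b}, baskets) / (support({a}, baskets) * support({b}, baskets))


beer_diaper_lift = lift("beer", "diapers", baskets)
print(f"Lift for beer and diapers: {beer_diaper_lift:.2f}")
```

A lift well above 1 says only that the two products turn up together more often than chance would suggest. It says nothing about why, which is exactly where store layout, time of day, and other business context have to be investigated.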
In another example, lower-tenure (newer) customers suddenly appeared far more predisposed towards purchasing a certain organization's products. Upon further investigation, it was revealed that this company had acquired another company, along with all its customers, in the last year and then began promoting its products to the acquired company's customers. The tenure of these acquired customers was recorded in the database as one year. Using lower tenure as a characteristic for selecting customers for this company's products on a going-forward basis would therefore be flawed.
The art component of data mining also allows the analyst to better interpret results. With a solid understanding of the business, the data miner can better understand why certain results are occurring. At the very least, he or she can better investigate certain facets of the business that may illuminate the underlying reason for the results. Although data mining is number-intensive, knowledge of the business combined with the numbers is what allows the practitioner to deliver an optimal analysis.
Data mining is not solely about technology; it is about the use of technology to arrive at business solutions that optimize return on investment (ROI). This requires that organizations invest in intellectual capital as well as in technology such as software and new database systems. In fact, successful organizations will prioritize their investment on the intellectual side rather than on the technological front. The key to successful data mining is producing solutions that can truly optimize ROI at its most granular level; in most cases, this is at the customer level.
Data mining continues to evolve as a science and as an industry discipline, but to be truly successful as a practitioner of data mining, analysts must realize that there is an element of art involved. At the same time, data mining is now widely seen as contributing huge benefits to our society. For example, the ability to target any product to the right person at the right time lowers marketing costs while often enough yielding incremental revenue. These economics often translate into ROI increases of 100% or more. Ultimately, this increased ROI is about generating more revenue with lower costs. In the credit risk/fraud area, often considered the birthplace of data mining, as we will see in later chapters, analysts attempt to identify individuals who are likely to default on payments or, in the case of fraud, individuals who exhibit unusual patterns of spending. Even an improvement of only 1% in the areas of credit risk and fraud prevention due to data mining can translate into millions of reduced credit card losses that directly impact the bottom line.
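The ROI arithmetic behind targeting can be made concrete with a small worked example. All figures below are hypothetical and chosen only to show the mechanics; ROI is taken here as (revenue minus cost) divided by cost, which is an assumption about the definition rather than something stated in the text.

```python
def roi(revenue, cost):
    """Return on investment as a fraction: (revenue - cost) / cost."""
    return (revenue - cost) / cost


# Hypothetical untargeted campaign: mail everyone on the list.
pieces_all = 100_000
cost_per_piece = 1.00          # assumed cost per mail piece
orders_all = 2_000             # assumed 2% response
margin_per_order = 75.00       # assumed margin per order

cost_all = pieces_all * cost_per_piece
revenue_all = orders_all * margin_per_order

# Hypothetical targeted campaign: mail only the best-scoring half of the list,
# assuming that half contains 80% of the eventual buyers.
pieces_targeted = 50_000
orders_targeted = 1_600

cost_targeted = pieces_targeted * cost_per_piece
revenue_targeted = orders_targeted * margin_per_order

roi_all = roi(revenue_all, cost_all)
roi_targeted = roi(revenue_targeted, cost_targeted)

print(f"Untargeted ROI: {roi_all:.0%}")
print(f"Targeted ROI:   {roi_targeted:.0%}")
```

Under these assumed numbers, mailing half the list cuts cost in half while giving up only a fifth of the orders, so ROI rises from 50% to 140%. The size of the gain depends entirely on how well the scoring concentrates buyers in the mailed group.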
Industry Perspective
Health
In hospitals and the health sector in general, health professionals analyze vast amounts of data and information to identify patients’ problems or diseases. By having access to the patient’s history and also to perhaps hundreds of other patients’ histories, health care professionals are better equipped to identify the given challenge or problem even in relatively unique situations. Access to prior patient history that encompasses vast amounts of data can often lead to a unique treatment for a given patient. Thanks to data mining, today’s medical professionals can analyze and combine vast amounts of data pertaining to patient histories to more easily provide optimum care and treatment for a given patient.
Government and Law Enforcement
Government agencies use data mining largely as a law enforcement tool. If we think of data mining as a tool for discovering knowledge, we can understand that the key to knowledge discovery is the ability to detect unique patterns in the data. In law enforcement, we all understand the notion of “detective work” as detectives personally deal with their own repositories of data. These repositories of data consist of notes in a notebook, previous experiences, and hunches. The acquisition of this type of knowledge is very time and labor intensive. Based on this knowledge, detectives then attempt to identify the unique pattern that will lead to a potential arrest. With data mining, a higher level of automation can be utilized, and vast amounts of data and information from a variety of detectives can be collected from more than one crime scene. The data is then compiled into one analytical file or database and statistically or mathematically analyzed to uncover unique patterns of behavior or events that could help to solve the crime at hand.
Besides solving a specific crime, using this data may also help to determine which areas in a city are most prone to specific crimes. In data mining, we can architect the data by looking at the information prior to a particular event, which allows us to be predictive in terms of future events. With this type of information, the police can then better allocate resources. This smarter allocation of resources means the city can send both the appropriate number and type of officers based on the type and volume of crime that most often occurs in a given area. For example, New York City did exactly this under Mayor Rudolph Giuliani in an attempt to reduce the horrific crime rates, in particular the homicide rates, that characterized the city in the late 1970s and early 1980s. New York’s homicide rate is now down to less than a third of what it was in the late 1970s. The use of data mining technology is one of the key reasons for this significant decrease.
The notion of using data mining as a law enforcement tool has certainly gained even more prominence after 9/11. For example, in response to the terrorist attacks of September 11, the US government established a department to oversee the compilation of data on individuals gathered by various government organizations and the use of that data in combating terrorism. This massive database contains information pertaining to areas such as health, insurance, and banking. Essentially, the government could have a dossier on each individual’s activity and behavior over the course of that person’s life. Aside from the obvious privacy concerns, one of the arguments against such a project is that the development of tools and solutions for preventing terrorist activity requires a large number of observations. In the case of 9/11, and of perhaps preventing future terrorist attacks, the argument is that we would not be able to gather enough data to build profiles of Al-Qaeda terrorists, since we have only 19 data points (19 terrorists) to contrast with information about the remaining 300 million nonterrorists in North America.
Various debates and discussions have centered on the use of data to identify potential terrorists. Among the fallout from these discussions has been the notion of “racial profiling” and the obvious implications of this practice. With information gleaned from data mining, law enforcement organizations can focus on certain sectors of the population that are considered high-risk (i.e., racial profiling). Yet issues as sensitive as racial profiling will need to be addressed by examining how to balance the need for public security with the need for protection against discrimination by law enforcement. A number of key stakeholders will need to be involved in these thought-provoking debates. Ultimately, I hope that this process will result in a policy that clearly lays out the guidelines regarding the use of racial profiling in the context of law enforcement.
Real-World Experience of Data Mining
When I began my career as a direct marketer in 1983 after graduating with an MBA, I was fortunate enough to apply my academic knowledge in the area of statistics. Like most young graduates, I was confident that I would make a difference in my new company. Yet although I contributed to the organization in fulfilling the duties of my position, the organization contributed far more to me by providing a basis on which to build my career, namely, the application of statistics in business. I learned to follow the esoteric and arcane principles of statistics but not to adhere to them in the strictest sense. For example, in most cases, the traditional assumptions regarding the statistical analysis of sample groups are not followed because most analytical situations using statistics in the business environment deal with nonnormal samples. A normal sample is one in which half the sample lies below the mean of a certain sample characteristic (e.g., age or income) and the other half lies above the mean, with the distribution of the data looking like a bell curve. Yet businesses continue to apply these techniques, much to the pure mathematician’s horror, because they work and produce acceptable results that yield significant incremental benefits. The recognition that it was acceptable to bend the rules of statistics was perhaps my most important introduction to the application of statistics in the world of direct marketing. In the world of academia, we were used to regression results that produced an R² of .7 or more. In the world of direct marketing, it was not uncommon to observe R² results of .05 or less, depending on the data. Statistical results that would be totally unacceptable in the academic statistical arena were commonly applied in the business world. The reason for this divergence in opinion between academia and business is that each side views the success of a given solution in very different ways. We will discuss this in more detail in later chapters.
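A small sketch may help show how a model with a very low R² can still be commercially useful. The data below are invented: 10,000 customers ranked into ten score deciles, with response rates rising gently from 1% to 5%. Judging the model by top-decile lift (response rate in the best-scoring group divided by the overall response rate) is an assumption made here for illustration; the text does not name the business yardstick at this point.

```python
# Hypothetical campaign results: responders in each score decile
# (decile 1 = lowest model score, decile 10 = highest), 1,000 customers each.
responders_by_decile = [10, 14, 19, 23, 28, 32, 37, 41, 46, 50]
customers_per_decile = 1000

# Expand to one row per customer: (score decile, responded 1/0).
scores, responses = [], []
for decile, responders in enumerate(responders_by_decile, start=1):
    for i in range(customers_per_decile):
        scores.append(decile)
        responses.append(1 if i < responders else 0)


def r_squared(x, y):
    """R-squared of a simple linear regression of y on x (squared correlation)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)


r2 = r_squared(scores, responses)
overall_rate = sum(responses) / len(responses)
top_decile_rate = responders_by_decile[-1] / customers_per_decile
top_decile_lift = top_decile_rate / overall_rate

print(f"R-squared:       {r2:.4f}")
print(f"Overall rate:    {overall_rate:.1%}")
print(f"Top-decile rate: {top_decile_rate:.1%}")
print(f"Top-decile lift: {top_decile_lift:.2f}")
```

With these assumed figures, R² comes out well under .05, yet the top decile responds at 5% against 3% overall. Individual purchase behavior remains almost entirely unexplained, but the ranking is still good enough to decide who should receive a mailing.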
As with any book on a topic of great significance, it is important to differentiate this book from others. If you are looking for mathematical and technological nuances and insights regarding data mining, this book is not for you. However, if you want a more practical business perspective of data mining, you have found the right book. Using this perspective, I have decided to focus on data mining from a practitioner’s viewpoint to allow you to benefit from my more than 30 years of applying a number of tools and techniques to countless business situations. I hope my insights will shed light on what works and what doesn’t work under certain conditions. As with any book on data mining, there will be discussion of the technology and mathematics involved, but the focus is on how data mining impacts a business. I do not intend to turn readers into data mining specialists. That said, if you already have experience with data mining, this book can offer another point of view to consider. In addition, this book provides insights into data mining in a Canadian context. Most books on data mining are written in the United States from an American perspective. Adding a Canadian perspective to the discussion will enhance overall understanding of the topic. If you find this book serves you as a useful reference regarding specific data mining tactics and their impact on a particular business problem, then I will have achieved my goal.
Enjoy.
Chapter 2
Growth of Data Mining—An Historical Perspective
IN WRITING A BOOK ON ANY TOPIC, IT IS ALWAYS IMPORTANT TO understand its history. What were the challenges and developments of the past that catapulted data mining to the forefront of business today? To get a sense of data mining’s history, it is useful to speak to its early practitioners, namely, the direct marketers of the large major catalogue companies and publishing houses.
As a recent MBA graduate back in 1982, I was very fortunate to be hired by one of these pioneering direct marketing firms, Reader’s Digest. At that time, I was hired to work in the company’s list selection department as a regression analyst. Although we did develop predictive regression models for direct mail campaigns, we were responsible for all analyses or segmentations dealing with individual-level data on customers. My colleagues and I were the customer knowledge gurus at Reader’s Digest and saw the huge ROI payback of this business culture; naturally, I thought that this business process was typical of most organizations. Conversations with my MBA business colleagues who were working at leading institutions in various industries in the early eighties revealed that this type of work was not being done. Back then, we had no fancy titles, such as business intelligence specialist, knowledge discovery analyst, data scientist, or CRM business analyst, because those were the “pioneering” days of data mining and predictive analytics.
I realized the tremendous business potential of data mining in improving overall ROI and saw that Reader’s Digest was one of only a handful of organizations doing this kind of work. I realized the tremendous entrepreneurial opportunities in data mining. However, one barrier to becoming an entrepreneur in this field was technology. Computing equipment for capitalizing on these statistical techniques used in direct marketing campaigns came at tremendous costs because in 1982, there were virtually no PCs yet, and all our work was done on mainframes. Yet, the analytical environment at Reader’s Digest was very robust and efficient. We had an extensive campaign history for each customer. We knew when customers were promoted to, how often they received promotions, what types of promotion they received, and how often they had received promotions since their last purchase. With this history of promotion and purchase behavior, we were able to quickly develop robust models. We had multiple models for each product line based on custom...
