Data Science for Marketing Analytics

Achieve your marketing goals with the data analytics power of Python

Tommy Blanchard, Debasish Behera, Pranshu Bhatnagar

  1. 420 pages
  2. English

About This Book

Explore new and more sophisticated tools that reduce your marketing analytics efforts and give you precise results

Key Features

  • Study new techniques for marketing analytics
  • Explore uses of machine learning to power your marketing analyses
  • Work through each stage of data analytics with the help of multiple examples and exercises

Book Description

Data Science for Marketing Analytics covers every stage of data analytics, from working with a raw dataset to segmenting a population and modeling different parts of the population based on the segments.

The book starts by teaching you how to use Python libraries, such as pandas and Matplotlib, to read data into Python, manipulate it, and create plots, using both categorical and continuous variables. Then, you'll learn how to segment a population into groups and use different clustering techniques to evaluate customer segmentation. As you make your way through the chapters, you'll explore ways to evaluate and select the best segmentation approach, and go on to create a linear regression model on customer value data to predict lifetime value. In the concluding chapters, you'll gain an understanding of regression techniques and tools for evaluating regression models, and explore ways to predict customer choice using classification algorithms. Finally, you'll apply these techniques to create a churn model for modeling customer product choices.

By the end of this book, you will be able to build your own marketing reporting and interactive dashboard solutions.

What you will learn

  • Analyze and visualize data in Python using pandas and Matplotlib
  • Study clustering techniques, such as hierarchical and k-means clustering
  • Create customer segments based on manipulated data
  • Predict customer lifetime value using linear regression
  • Use classification algorithms to understand customer choice
  • Optimize classification algorithms to extract maximal information

Who this book is for

Data Science for Marketing Analytics is designed for developers and marketing analysts looking to use new, more sophisticated tools in their marketing analytics efforts. It'll help if you have prior experience of coding in Python and knowledge of high school level mathematics. Some experience with databases, Excel, statistics, or Tableau is useful but not necessary.

Information

Year
2019
ISBN
9781789952100
Edition
1

Chapter 1

Data Preparation and Cleaning

Learning Objectives

By the end of this chapter, you will be able to:
  • Create pandas DataFrames in Python
  • Read data from and write data to different file formats
  • Slice, aggregate, filter, and apply functions (built-in and custom) to DataFrames
  • Join DataFrames, handle missing values, and combine different data sources
This chapter covers basic data preparation and manipulation techniques in Python, which form the foundation of data science.

Introduction

The way we make decisions in today's world is changing. A very large proportion of our decisions, from choosing which movie to watch or which song to listen to, to deciding which item to buy or which restaurant to visit, rely upon recommendations and ratings generated by analytics. As decision makers use more of these analytics, they themselves become data points for further improvements, and as the models keep meeting their individual decision-making needs, they keep using those models more often.
The change in consumer behavior has also influenced the way companies develop strategies to target consumers. With the increased digitization of data, greater availability of data sources, and lower storage and processing costs, firms can now crunch large volumes of increasingly granular data with the help of various data science techniques and leverage it to create complex models, perform sophisticated tasks, and derive valuable consumer insights with higher accuracy. It is because of this dramatic increase in data and computing power, and the advancement in techniques to use this data through data science algorithms, that the McKinsey Global Institute calls our age the Age of Analytics.
Several industry leaders are already using data science to make better decisions and to improve their marketing analytics. Google and Amazon have been making targeted recommendations catering to the preferences of their users from their very early years. Predictive data science algorithms tasked with generating leads from marketing campaigns at Dell reportedly converted 50% of the final leads, whereas those generated through traditional methods had a conversion rate of only 17%. Price surges on Uber for non-pass holders during rush hour also reportedly had massive positive effects on the company's profits. In fact, it was recently discovered that price management initiatives based on an evaluation of customer lifetime value tended to increase business margins by 2%–7% over a 12-month period and resulted in a 200%–350% ROI in general.
Although using data science principles in marketing analytics is a proven, cost-effective, and efficient way for many companies to observe a customer's journey and provide a more customized experience, multiple reports suggest that it is not being used to its full potential. There is a wide gap between the possible and actual usage of these techniques by firms. This book aims to bridge that gap, and covers an array of data science techniques that are useful for marketing strategy and marketing decision-making. By the end of the book, you should be able to successfully create and manage an end-to-end marketing analytics pipeline in Python, segment customers based on the data provided, predict their lifetime value, and model their decision-making behavior on your own using data science techniques.
This chapter introduces you to cleaning and preparing data—the first step in any data-centric pipeline. Raw data coming from external sources cannot generally be used directly; it needs to be structured, filtered, combined, analyzed, and observed before it can be used for any further analyses. In this chapter, we will explore how to get the right data in the right attributes, manipulate rows and columns, and apply transformations to data. This is essential because, otherwise, we will be passing incorrect data to the pipeline, thereby making it a classic example of garbage in, garbage out.

Data Models and Structured Data

When we build an analytics pipeline, the first thing that we need to do is to build a data model. A data model is an overview of the data sources that we will be using, their relationships with other data sources, where exactly the data from a specific source is going to enter the pipeline, and in what form (such as an Excel file, a database, or a JSON file from an internet source). The data model for the pipeline evolves over time as data sources and processes change. A data model can contain data of the following three types:
  • Structured data: This is also known as completely structured or well-structured data. This is the simplest way to manage information. The data is arranged in a flat tabular form with the correct value corresponding to the correct attribute. There is a unique column, known as an index, for easy and quick access to the data, and there are no duplicate columns. Such data can be queried exactly with SQL; examples include data held in relational databases such as MySQL or Amazon Redshift.
  • Semi-structured data: This refers to data that may be of variable lengths and that may contain different data types (such as numerical or categorical) in the same column. Such data may be arranged in a nested or hierarchical tabular structure, but it still follows a fixed schema. There are no duplicate columns (attributes), but there may be duplicate rows (observations). Also, each row might not contain values for every attribute, that is, there may be missing values. Semi-structured data can be stored accurately in NoSQL databases, Apache Parquet files, JSON files, and so on.
  • Unstructured data: Data that is unstructured may not be tabular, and even if it is tabular, the number of attributes or columns per observation may be completely arbitrary. The same data could be represented in different ways, and the attributes might not match each other, with values leaking into other parts. Unstructured data can be stored as text files, CSV files,...
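
To make the first two categories more concrete, the following is a minimal sketch in pandas, not an example taken from the book. The column names (customer_id, country, total_spend, contact) and all sample values are hypothetical. It builds a small structured table with a unique index, and then flattens a list of semi-structured, JSON-like records whose nested fields and missing attributes show up as extra columns and missing values.

```python
import pandas as pd

# Structured data: a flat table in which every row has a value for every
# attribute, plus a unique index column (customer_id is a made-up name).
structured = pd.DataFrame(
    {
        "customer_id": [101, 102, 103],
        "country": ["US", "IN", "DE"],
        "total_spend": [250.0, 90.5, 130.0],
    }
).set_index("customer_id")

# Semi-structured data: JSON-like records with a nested "contact" field.
# Some records omit attributes entirely (hypothetical sample values).
semi_structured = [
    {"customer_id": 101, "contact": {"email": "a@example.com", "phone": "555-0100"}},
    {"customer_id": 102, "contact": {"email": "b@example.com"}},
    {"customer_id": 103},
]

# json_normalize flattens nested keys into dotted column names such as
# "contact.email"; attributes absent from a record become missing values.
flattened = pd.json_normalize(semi_structured).set_index("customer_id")

print(structured)
print(flattened)
print("Missing values after flattening:", int(flattened.isna().sum().sum()))
```

The structured table can be used as is, whereas the flattened records still contain gaps that would need to be handled before any further analysis.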
