eBook - ePub

Data Science for Marketing Analytics

Name: Data Science for Marketing Analytics
Author: Mirza Rahim Baig, Gururajan Govindan, Vishwesh Ravi Shrimali

A practical guide to forming a killer marketing strategy through data analysis with Python, 2nd Edition

Mirza Rahim Baig, Gururajan Govindan, Vishwesh Ravi Shrimali

Share book

636 pages
English
ePUB (mobile friendly)
Available on iOS & Android

eBook - ePub

Data Science for Marketing Analytics

A practical guide to forming a killer marketing strategy through data analysis with Python, 2nd Edition

Mirza Rahim Baig, Gururajan Govindan, Vishwesh Ravi Shrimali

Book details

Book preview

Table of contents

Citations

About This Book

Turbocharge your marketing plans by making the leap from simple descriptive statistics in Excel to sophisticated predictive analytics with the Python programming language

Key Features

Use data analytics and machine learning in a sales and marketing context
Gain insights from data to make better business decisions
Build your experience and confidence with realistic hands-on practice

Book Description

Unleash the power of data to reach your marketing goals with this practical guide to data science for business.

This book will help you get started on your journey to becoming a master of marketing analytics with Python. You'll work with relevant datasets and build your practical skills by tackling engaging exercises and activities that simulate real-world market analysis projects.

You'll learn to think like a data scientist, build your problem-solving skills, and discover how to look at data in new ways to deliver business insights and make intelligent data-driven decisions.

As well as learning how to clean, explore, and visualize data, you'll implement machine learning algorithms and build models to make predictions. As you work through the book, you'll use Python tools to analyze sales, visualize advertising data, predict revenue, address customer churn, and implement customer segmentation to understand behavior.

By the end of this book, you'll have the knowledge, skills, and confidence to implement data science and machine learning techniques to better understand your marketing data and improve your decision-making.

What you will learn

Load, clean, and explore sales and marketing data using pandas
Form and test hypotheses using real data sets and analytics tools
Visualize patterns in customer behavior using Matplotlib
Use advanced machine learning models like random forest and SVM
Use various unsupervised learning algorithms for customer segmentation
Use supervised learning techniques for sales prediction
Evaluate and compare different models to get the best outcomes
Optimize models with hyperparameter tuning and SMOTE

Who this book is for

This marketing book is for anyone who wants to learn how to use Python for cutting-edge marketing analytics. Whether you're a developer who wants to move into marketing, or a marketing analyst who wants to learn more sophisticated tools and techniques, this book will get you on the right path.

Basic prior knowledge of Python and experience working with data will help you access this book more easily.

Frequently asked questions

How do I cancel my subscription?

Simply head over to the account section in settings and click on “Cancel Subscription” - it’s as simple as that. After you cancel, your membership will stay active for the remainder of the time you’ve paid for. Learn more here.

Can/how do I download books?

At the moment all of our mobile-responsive ePub books are available to download via the app. Most of our PDFs are also available to download and we're working on making the final remaining ones downloadable now. Learn more here.

What is the difference between the pricing plans?

Both plans give you full access to the library and all of Perlego’s features. The only differences are the price and subscription period: With the annual plan you’ll save around 30% compared to 12 months on the monthly plan.

What is Perlego?

We are an online textbook subscription service, where you can get access to an entire online library for less than the price of a single book per month. With over 1 million books across 1000+ topics, we’ve got you covered! Learn more here.

Do you support text-to-speech?

Look out for the read-aloud symbol on your next book to see if you can listen to it. The read-aloud tool reads text aloud for you, highlighting the text as it is being read. You can pause it, speed it up and slow it down. Learn more here.

Is Data Science for Marketing Analytics an online PDF/ePUB?

Yes, you can access Data Science for Marketing Analytics by Mirza Rahim Baig, Gururajan Govindan, Vishwesh Ravi Shrimali in PDF and/or ePUB format, as well as other popular books in Computer Science & Programming in Python. We have over one million books available in our catalogue for you to explore.

Information

Publisher

Packt Publishing

Year

2021

ISBN

9781800563889

Edition

Topic

Computer Science

Subtopic

Programming in Python

Index

Computer Science

1. Data Preparation and Cleaning

Overview

In this chapter, you'll learn the skills required to process and clean data to effectively ready it for further analysis. Using the pandas library in Python, you will learn how to read and import data from various file formats, including JSON and CSV, into a DataFrame. You'll then learn how to perform slicing, aggregation, and filtering on DataFrames. By the end of the chapter, you will consolidate your data cleaning skills by learning how to join DataFrames, handle missing values, and even combine data from various sources.

Introduction

"Since you liked this artist, you'll also like their new album," "Customers who bought bread also bought butter," and "1,000 people near you have also ordered this item." Every day, recommendations like these influence customers' shopping decisions, helping them discover new products. Such recommendations are possible thanks to data science techniques that leverage data to create complex models, perform sophisticated tasks, and derive valuable customer insights with great precision. While the use of data science principles in marketing analytics is a proven, cost-effective, and efficient strategy, many companies are still not using these techniques to their full potential. There is a wide gap between the possible and actual usage of these techniques.

This book is designed to teach you skills that will help you contribute toward bridging that gap. It covers a wide range of useful techniques that will allow you to leverage everything data science can do in terms of strategies and decision-making in the marketing domain. By the end of the book, you should be able to successfully create and manage an end-to-end marketing analytics solution in Python, segment customers based on the data provided, predict their lifetime value, and model their decision-making behavior using data science techniques.

You will start your journey by first learning how to clean and prepare data. Raw data from external sources cannot be used directly; it needs to be analyzed, structured, and filtered before it can be used any further. In this chapter, you will learn how to manipulate rows and columns and apply transformations to data to ensure you have the right data with the right attributes. This is an essential skill in a data analyst's arsenal because, otherwise, the outcome of your analysis will be based on incorrect data, thereby making it a classic example of garbage in, garbage out. But before you start working with the data, it is important to understand its nature - in other words, the different types of data you'll be working with.

Data Models and Structured Data

When you build an analytical solution, the first thing that you need to do is to build a data model. A data model is an overview of the data sources that you will be using, their relationships with other data sources, where exactly the data from a specific source is going to be fetched, and in what form (such as an Excel file, a database, or a JSON from an internet source).

Note

Keep in mind that the data model evolves as data sources and processes change.

A data model can contain data of the following three types:

Structured Data: Also known as completely structured or well-structured data, this is the simplest way to manage information. The data is arranged in a flat tabular form with the correct value corresponding to the correct attribute. There is a unique column, known as an index, for easy and quick access to the data, and there are no duplicate columns. For example, in Figure 1.1, employee_id is the unique column. Using the data in this column, you can run SQL queries and quickly access data at a specific row and column in the dataset easily. Furthermore, there are no empty rows, missing entries, or duplicate columns, thereby making this dataset quite easy to work with. What makes structured data most ubiquitous and easy to analyze is that it is stored in a standardized tabular format that makes adding, updating, deleting, and updating entries easy and programmable. With structured data, you may not have to put in much effort during the data preparation and cleaning stage.
Data stored in relational databases such as MySQL, Amazon Redshift, and more are examples of structured data:

Figure 1.1: Data in a MySQL table

Semi-structured data: You will not find semi-structured data to be stored in a strict, tabular hierarchy as you saw in Figure 1.1. However, it will still have its own hierarchies that group its elements and establish a relationship between them. For example, metadata of a song may include information about the cover art, the artist, song length, and even the lyrics. You can search for the artist's name and find the song you want. Such data does not have a fixed hierarchy mapping the unique column with rows in an expected format, and yet you can find the information you need.
Another example of semi-structured data is a JSON file. JSON files are self-describing and can be understood easily. In Figure 1.2, you can see a JSON file that contains personally identifiable information of Jack Jones.
Semi-structured data can be stored accurately in NoSQL databases.

Figure 1.2: Data in a JSON file

Unstructured data: Unstructured data may not be tabular, and even if it is tabular, the number of attributes or columns per observation may be completely arbitrary. The same data could be represented in different ways, and the attributes might not match each other, with values leaking into other parts.
For example, think of reviews of various products stored in rows of an Excel sheet or a dump of the latest tweets of a company's Twitter profile. We can only search for specific keywords in that data, but we cannot store it in a relational database, nor will we be able to establish a concrete hierarchy between different elements or rows. Unstructured data can be stored as text files, CSV files, Excel files, images, and audio clips.

Marketing data, traditionally, comprises all three aforementioned data types. Initially, most data points originate from different data sources. This results in different implications, such as the values of a field could be of different lengths, the value for one field would not match that of other fields because of different field names, and some rows might have missing values for some of the fields.

You'll soon learn how to effectively tackle such problems with your data using Python. The following diagram illustrates what a data model for marketing analytics looks like. The data model comprises all kinds of data: structured data such as databases (top), semi-structured data such as JSON (middle), and unstructured data such as Excel files (bottom):

Figur...