Sport Analytics

A data-driven approach to sport business and management

Gil Fried, Ceyda Mumcu

256 pages

About This Book

The increasing availability of data has transformed the way sports are played, promoted and managed. This is the first textbook to explain how the big data revolution is having a profound influence across the sport industry, demonstrating how sport managers and business professionals can use analytical techniques to improve their professional practice.

While other sports analytics books have focused on player performance data, this book shows how analytics can be applied to every functional area of sport business, from marketing and event management to finance and legal services. Drawing on research that spans the entire sport industry, it explains how data is influencing the most important decisions, from ticket sales and human resources to risk management and facility operations. Each chapter contains real-world examples, industry profiles and extended case studies, which are complemented by a companion website full of useful learning resources.

Sport Analytics: A data-driven approach to sport business and management is an essential text for all sport management students and an invaluable reference for any sport management professional involved in operational research.




Data 101


Ceyda Mumcu


In the introduction to this textbook, you read about how data are used in different situations, what data might be, how we come across data, and many examples ranging from obesity to competitive eating to ticket sales and pricing. In this chapter, you will be introduced to more technical concepts, such as analytics, data, data types, some key statistical concepts and various statistical analyses, to develop a baseline before moving on to the chapters covering the application of analytics in the functional areas of sport. Please remember that this chapter is not intended to replace a statistics textbook. It is, rather, a brief summary of some relevant statistical concepts and key analyses. If needed, please refer to a statistics textbook for more detailed information.

Analytics and its importance in the sport industry

Every organization benefits from running its business efficiently, and sport organizations are no exception. Because the market is saturated, it is especially important for sport organizations to function with maximum efficiency and to make smart business decisions. Today, business decisions are not made on hunches; they are based on analytics. Davenport and Harris (2007) defined analytics as “the extensive use of data, statistical and quantitative analysis, explanatory and predictive models, and fact-based management to drive decisions and actions” (p. 7).
Sport organizations with an analytical mindset generate and collect data from various internal and external sources, and analyze business performance to derive insights and make fact-based decisions that create competitive advantage and increase the effectiveness and efficiency of the organization. Sport organizations can use analytics in selected functional areas or organization-wide. Most commonly, sport organizations use analytics to:
analyze athlete performance to make decisions on the starting line-up, game plans and which players to sign/draft/trade;
predict and prevent player injuries;
assess value of athletes to their brand;
examine effectiveness of various marketing activities;
segment existing fans and estimate their value;
predict retention of fans; and
develop an incident prevention model based on past incidents.
While the benefit of analytics to a sport organization is obvious, what “data” is might not be so clear to all. Let’s turn our focus to understanding what data is and some important concepts about it.

What is data?

Data is information in a variety of forms such as numbers, words, pictures, video, measurements, observations, and so on. It can be raw and unorganized, or it can be transformed into a usable format. Today, sport organizations have access to a vast amount of data including transactional data (e.g. sales, cost and inventory), non-operational data (e.g. industry sales, macroeconomic data) and meta-data (e.g. data definitions) (Frand, n.d.). To benefit from analytics and make good decisions, organizations should ask the following questions about data before jumping into data collection, and should use systematically assembled data (Davenport & Harris, 2007):

1 Data relevance – what data is needed?

This depends on the objectives of the organization and the questions it wants to answer in pursuit of those objectives. Every organization sets business objectives to gain a competitive advantage in the marketplace, and these objectives are based on the current status of the sport property in the market, what the organization wants to accomplish, and its resources and competencies. Based on these components, organizations set functional objectives and identify operational metrics to measure their performance in achieving the objectives. For example, if a fitness facility aims to have a set number of active memberships each month, it can simply count the number of memberships. While the number of memberships shows whether the facility met its goal, this might not provide enough information to the administrators, especially if the facility did not meet the goal. Looking at the retention rate for current members and the number of new memberships acquired would provide more detailed insight into why the goal was missed. As you can see, a variety of data is relevant in answering one question, and data will provide insights only if you start with a question and collect relevant data.
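
As a minimal sketch of the fitness facility example, the snippet below computes the active membership count together with the retention rate and the number of new memberships. The member IDs, the monthly target and the function name `membership_metrics` are hypothetical, and retention rate is assumed here to mean the share of last month's members who are still active this month.

```python
def membership_metrics(last_month, this_month, target):
    """Compare two monthly sets of active member IDs against a target count."""
    retained = last_month & this_month      # members active in both months
    new_members = this_month - last_month   # members who joined this month
    retention = len(retained) / len(last_month) if last_month else 0.0
    return {
        "active": len(this_month),
        "met_target": len(this_month) >= target,
        "retention_rate": retention,
        "new_memberships": len(new_members),
    }


# Hypothetical member IDs for two consecutive months
last_month = {101, 102, 103, 104, 105}
this_month = {101, 102, 103, 106}

metrics = membership_metrics(last_month, this_month, target=5)
print(metrics)
```

The count alone only says the target was missed; the retention rate and the new membership count indicate whether the shortfall came from losing existing members, from weak acquisition, or from both.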

2 Data source – where can this data be obtained?

Based on the type of data needed, the source of data will change. Sport organizations can obtain data both from internal sources and external sources. Internal data could be gathered from finance, manufacturing, research and development, and human resources departments. Marketing departments can also provide internal data such as Return on Investment (ROI) metrics of advertisements (more details are provided in Chapter 4). External data could be gathered from suppliers and customers, and also could be purchased from a third party such as Nielsen TV ratings or Scarborough customer data.

3 Data quantity – how much data is needed?

Once the type of data needed and how to acquire it are decided, the next question to tackle is how much data is needed. The answer to this question is “it depends.” In some cases, a sport organization might have data on an entire population, whereas in other cases it can only access a sample. The amount of data matters most when the analysis is based on a sample, because a power analysis is then required to identify an adequate sample size. In addition to sample size, the representativeness of the sample is also an important concern for the accuracy of findings. These concepts will be covered in more detail in the “Some key statistical concepts” section of this chapter.
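
To make the idea of a power analysis concrete, the sketch below estimates the sample size per group needed to compare the means of two groups. It is an illustration under stated assumptions, not a method taken from this chapter: it uses the common normal-approximation formula, a two-sided test, and a standardized effect size (Cohen's d). The function name `sample_size_per_group` and the default values for `alpha` and `power` are chosen here for illustration.

```python
import math
from statistics import NormalDist


def sample_size_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate n per group for a two-sided, two-sample comparison of means.

    Normal approximation: n = 2 * ((z_(1 - alpha/2) + z_power) / d) ** 2,
    where d is the standardized effect size (Cohen's d).
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value of the test
    z_power = NormalDist().inv_cdf(power)          # quantile for the desired power
    n = 2 * ((z_alpha + z_power) / effect_size) ** 2
    return math.ceil(n)                            # round up to whole observations


n_medium = sample_size_per_group(0.5)  # medium effect
n_small = sample_size_per_group(0.2)   # small effect
print(n_medium, n_small)
```

Smaller effects require much larger samples, which is one reason the answer to “how much data is needed?” depends on the question being asked.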

4 Data quality – how can the data be made more accurate and valuable for analysis?

The next step that requires attention is the quality of the data. A large data set is not always the answer. Quality data is needed to achieve valid and reliable results. Some of the important aspects of data quality are completeness, accuracy, consistency, and currency.
Completeness: Availability of all necessary and relevant data.
Accuracy: Reflecting real-life situations and being precise.
Consistency: Being consistent between systems with common definitions and standardization, and avoiding duplicate records in data.
Currency: Being updated periodically – daily, weekly, monthly.
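
Some of these aspects can be checked automatically. The sketch below counts records that fail simple completeness, consistency (duplicate IDs) and currency checks. The record layout, the field names `id`, `email` and `updated`, and the 30-day threshold are all hypothetical; accuracy is left out because it requires comparison against a trusted source.

```python
from datetime import date


def quality_report(records, required_fields, as_of, max_age_days):
    """Count records failing completeness, duplicate-ID, and currency checks."""
    incomplete = sum(
        1 for r in records
        if any(r.get(f) in (None, "") for f in required_fields)
    )
    ids = [r["id"] for r in records]
    duplicates = len(ids) - len(set(ids))  # extra copies of an existing ID
    stale = sum(
        1 for r in records
        if (as_of - r["updated"]).days > max_age_days
    )
    return {"incomplete": incomplete, "duplicates": duplicates, "stale": stale}


# Hypothetical customer records
records = [
    {"id": 1, "email": "a@example.com", "updated": date(2016, 3, 1)},
    {"id": 2, "email": "", "updated": date(2016, 3, 2)},               # missing email
    {"id": 2, "email": "", "updated": date(2016, 3, 2)},               # duplicate record
    {"id": 3, "email": "c@example.com", "updated": date(2015, 1, 15)}, # out of date
]

report = quality_report(
    records, ["id", "email"], as_of=date(2016, 3, 31), max_age_days=30
)
print(report)
```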

Types of data

Most often, data is extracted from its source in raw format and needs to be cleaned by removing incorrect, incomplete and duplicate information, and then transformed to be usable. Once data has gone through the cleansing and transformation processes and is stored in a database, it is ready for analysis. Here, understanding the type of data becomes important, because the type of data and the level of measurement dictate the type of analyses one can perform. Data can be qualitative (descriptive information) or quantitative (numeric information), and quantitative data can be further classified as discrete or continuous. Discrete data can take only certain distinct values, usually whole-number counts; decimals are not possible, and counts cannot be negative. Continuous data, on the other hand, can take any value within a range, with no gaps (e.g. 1.1, 1.135, 1.2, and 2.367) (Lomax, 2007). For example, the number of tickets sold is discrete data because ticket sales are whole numbers, whereas the height of athletes and the time in a race are continuous data.
Another important concept to understand about quantitative data is the level of measurement, of which there are four:
Nominal: At the nominal level of measurement, numbers are used to classify data. Most of you are familiar with this type of data in the classification of genders, such as assigning 1 to males and 2 to females in your data set. In this type of classification, numbers do not mean anything other than showing a classification and do not have an order. In the gender example, 2 is not higher or better than 1 in any way, and the numbers are used solely to classify groups.
Ordinal: This type of scale displays an order between the numbers with respect to the characteristic being measured. For example, at a road race, the runner who completes the course in the shortest time is ranked first, and the others are ranked second, third and so on based on their times. Although the ranks 1, 2 and 3 appear to be equally spaced, the differences they represent are generally unequal. In the race example, the difference between the times of the first and second runners is not expected to be the same as the difference between the times of the second and third runners. Therefore, an ordinal scale communicates an order but does not claim equal distance between the points on the scale.
Interval: Similar to the ordinal scale, the interval scale orders the measurements, but it also provides equal distances between the points on the scale. One common example of an interval scale is IQ scores. The average IQ score is 100, and the difference between IQ scores of 80 and 90 is equal to the difference between scores of 100 and 110. In addition, lower scores show lower IQ levels and higher scores show higher IQ levels. One important aspect of an interval scale is that it has no true zero point, which means a zero on an interval scale does not indicate an absence of the property being measured. Therefore, it cannot be said that an individual with an IQ score of 140 is twice as smart as an individual with an IQ score of 70.
Ratio: The ratio scale carries all the characteristics of the interval scale and also has a true zero, which indicates the absence of the quality being measured. Going back to the runner example, the clock is set to zero minutes and seconds at the beginning of the race, and if the winner finished a 5-kilometer road race in 18 minutes, he could be said to be twice as fast as a runner who finished the race in 36 minutes.
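
The sketch below illustrates how the level of measurement limits which summaries are meaningful, reusing the examples above. The specific data values are made up for illustration, and the pairing of each level with one typical summary (mode, median, mean, ratio) is a common rule of thumb rather than a rule stated in this chapter.

```python
from statistics import mean, median, mode

# Nominal: codes only label groups, so the most frequent category is meaningful
gender_codes = [1, 2, 2, 1, 2]
most_common = mode(gender_codes)

# Ordinal: finishing places carry order, so the middle rank is meaningful
places = [1, 2, 3, 4, 5]
middle_place = median(places)

# Interval: equal distances make differences and averages meaningful
iq_scores = [80, 90, 100, 110]
average_iq = mean(iq_scores)
equal_gaps = (90 - 80) == (110 - 100)

# Ratio: a true zero makes ratios meaningful (36 minutes is twice 18 minutes)
winner_minutes, other_minutes = 18, 36
time_ratio = other_minutes / winner_minutes

print(most_common, middle_place, average_iq, equal_gaps, time_ratio)
```

Averaging the nominal gender codes would also run without error, but the result would have no interpretation, which is why the level of measurement has to be checked before choosing an analysis.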

Some key statistical concepts

Before moving into various analyses, it is important to remember some key statistical concepts. Statistics in general is divided into two types: descriptive statistics and inferential statistics. Descriptive statistics summarize and describe data via frequencies, central tendency, measures of dispersion and distribution characteristics. Some examples from the sport world are batting average in baseball, number of turnovers or steals in basketball, demographic characteristics of a team’s fan base in percentages or counts, and so on. When such values are calculated from a sample, they are called statistics; when they are calculated for an entire population, they are called parameters. A sample is “a subset of a population,” and a population is defined as “consisting of all members of a well-defined group” (Lomax, 2007, p. 6). Traditionally, analyses often rely on a sample, and inferences are made about a population from the sample data via inductive reasoning, which is called inferential statistics. In this process, how the sample is acquired is extremely important, as inferential statistics are based on the assumption that sampling is done randomly. Simple random sampling is selecting a sample from a population with a process that gives each observation an equal and independent chance of being selected (Lomax, 2007). The importance of simple random sampling lies in the idea that the sample will be representative of the population and the results of inferential statistics will be generalizable to the population. For example, if we were to ask our season ticket holders about their experience at our games, we could reach out to all season ticket holders or survey a sample of them. For the sake of this example, let’s assume that we decided to collect data from a sample of season ticket holders who were randomly selected from the entire season ticket holder pool.
If our sample were large enough, the results derived from the sample would be generalizable to all of our season ticket holders. This brings us to the topic of adequate sample size and the limitations of small sample sizes in inferential statistics. The main idea is that, as sample size increases, we are sampling a larger portion of the population and therefore the sample becomes more representative of the population (Lomax, 2007).
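
The season ticket holder example can be sketched as follows. The population of 5,000 satisfaction scores is simulated and entirely hypothetical, as are the sample sizes; `random.sample` draws without replacement, which gives every ticket holder the same chance of selection.

```python
import random
from statistics import mean

random.seed(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: satisfaction scores (1-10) for 5,000 season ticket holders
population = [random.randint(1, 10) for _ in range(5000)]
parameter = mean(population)  # population parameter


def sample_mean(n):
    """Mean of a simple random sample of size n, drawn without replacement."""
    return mean(random.sample(population, n))


# Compare the sample statistic with the population parameter at several sizes
for n in (25, 100, 1000):
    statistic = sample_mean(n)
    print(n, round(statistic, 2), round(abs(statistic - parameter), 2))
```

Across repeated draws, the gap between the sample statistic and the population parameter tends to shrink as the sample size grows, although any single small sample can still land close to the parameter by chance.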
Hypothesis testing is another concept to cover before moving into types of analyses. Hypothesis testing is a decision-making method where two competing decisions, which are known as null hypothe...


Citation for Sport Analytics
Fried, G., & Mumcu, C. (2016). Sport Analytics (1st ed.). Taylor and Francis.