Artificial Intelligence in Sport Performance Analysis

Duarte Araújo, Micael Couceiro, Ludovic Seifert, Hugo Sarmento, Keith Davids
About This Book

Understanding the dynamic patterns of behaviours and interactions between athletes that characterize successful performance in different sports is an important challenge for all sport practitioners. This book guides the reader in understanding how an ecological dynamics framework for the use of artificial intelligence (AI) can be implemented to interpret sport performance and inform the design of practice contexts.

By examining how AI methodologies are utilized in team games, such as football, as well as in individual sports, such as golf and climbing, this book surveys current state-of-the-art AI approaches to provide a better understanding of the kinematic and physiological indicators that might best capture athletic performance.

Artificial Intelligence in Sport Performance Analysis offers an innovative, all-encompassing perspective with practical applications for both academics and practitioners in the fields of coaching, sports analysis, and sport science, as well as related subjects such as engineering, computer and data science, and statistics.


Information

Publisher: Routledge
Year: 2021
ISBN: 9781000380156

1
EMPOWERING HUMAN INTELLIGENCE
The Ecological Dynamics Approach to Big Data and Artificial Intelligence in Sport Performance Preparation

Big Data in Sport

Digital technology has had a profound impact on sport (Miah, 2017). Athletes and coaches rely on digital data to monitor and enhance performance. Officials use tracking systems to augment their judgement. Audiences draw on collectively shared data that expands the places in which sports can be watched and experienced.
Nowadays, technology enables practitioners, performers, and spectators to collect and store massive amounts of data faster, more abundantly, and in more diverse forms than ever. Data can be collected from various sensors and devices in different formats, from independent or connected applications. This data avalanche has outpaced human capability to process, analyse, store, and understand the information contained in these datasets. Moreover, people and devices are becoming increasingly interconnected, and the growing number of connected components generates massive datasets from which valuable information needs to be discovered, in patterns within the data, to help improve performance, safety, health, and well-being. Technological advancements have not only produced an abundance of new data streams, repositories, and computational power; they have also driven advances in statistical and computational techniques, such as artificial intelligence, that have enabled widespread analysis of such datasets in many domains, including sport, improving our ability to plan, prepare, and predict performance outcomes. It is therefore unsurprising that big data is also entering research programmes in the sport sciences (Goes et al., 2020; Rein & Memmert, 2016; Chapter 2). Big data broadly refers to multiplying multiform data (e.g., structured, unstructured) and their supporting technological infrastructure (i.e., capture, storage, processing) and analytic techniques that can enhance research (Woo, Tay, & Proctor, 2020).
Big data, a term probably coined by John Mashey in the mid-1990s (Gandomi & Haider, 2015), is used to identify datasets that, due to their large size and complexity, cannot be managed with traditional methodologies to obtain meaning for a particular problem domain (Proctor & Xiong, 2020). Consequently, Volume, Variety, and Velocity (the three Vs) have emerged as a common framework to describe big data. It is relevant to understand the meaning of the three Vs (Gandomi & Haider, 2015): (i) Volume relates to the size of data (many terabytes and even exabytes); (ii) Variety refers to the types of data (e.g., text, physical sensor data, audio, video, graphs) and their structure (e.g., structured or unstructured); (iii) Velocity indicates the continuous generation of streams of data and the speed at which those data should be analysed. Additional Vs are being discussed nowadays (Proctor & Xiong, 2020), such as Variability (variation in the data flow), Veracity (imprecision of the data), and Value (the meaning obtained to inform decisions in ways only possible with big data). Relatedly, big data mining is the capability of obtaining useful information from these large datasets (Fan & Bifet, 2014). One way of mining big data is by means of artificial intelligence, as described in the remaining chapters of this book.
For sport scientists and practitioners, the challenges start with understanding how to obtain and access data, followed by how to process and clean big data into formats usable for research and athlete support goals (Endel & Piringer, 2015). At the same time, the collected data may be incomplete, which requires methods to transform the data and to detect and deal with missing values. Also, the traditional statistical method of null hypothesis testing at the 0.05 alpha level loses its meaning because, with the very large sample sizes involved in big datasets, even very small differences become statistically significant. Thus, one obvious accompanying challenge is to understand how to obtain meaningful information and predictions from big data. One solution is to place more emphasis on statistics and computational modelling (Proctor & Xiong, 2020), such as machine learning (e.g., Couceiro, Dias, Mendes, & Araújo, 2013; see Chapter 2 for a review). Another, complementary, solution, discussed at the end of this chapter, is to become theoretically informed about what data to obtain, how to process it, and how to interpret it, instead of simply relying on computational brute force.
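The large-sample problem mentioned above can be made concrete with a quick back-of-the-envelope calculation (a sketch using invented numbers, not data from any study cited here): with a million observations per group, a practically negligible mean difference becomes overwhelmingly "significant".

```python
import math

# Hypothetical example: two groups differ in some performance metric
# by 0.1 units (SD = 5) -- a negligible difference in practice.
mean_a, mean_b, sd, n = 50.0, 50.1, 5.0, 1_000_000

# Two-sample z-test (equal, known variances assumed for simplicity).
se = sd * math.sqrt(2 / n)        # standard error of the mean difference
z = (mean_b - mean_a) / se
p = math.erfc(z / math.sqrt(2))   # two-sided p-value

# Cohen's d: the standardized effect size does not grow with n.
d = (mean_b - mean_a) / sd

print(f"z = {z:.1f}, p = {p:.2e}, Cohen's d = {d:.3f}")
```

Here the p-value falls below any conventional alpha level, while Cohen's d stays at 0.02, far below even a "small" effect. This is why, at big data scale, effect sizes and computational modelling carry more meaning than significance tests.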

Sources of Big Data

Woo, Tay, Jebb, Ford, and Kern (2020) identified three major sources of big data as most frequently mentioned or used in current behavioural research: social media (e.g., Twitter, Facebook), wearable sensors (e.g., Garmin, Fitbit), and Internet activities (e.g., Internet searches, page views). Woo, Tay, Jebb, and colleagues (2020) also identified two other emergent data sources that provide an increasing amount of accessible data: public network cameras and smartphones (see Chapter 3 for the main sources used in sport sciences). These big data sources differ from more 'traditional' data sources (e.g., surveys, interviews) because the latter are typically much smaller in size, more slowly generated, and less technological. Importantly, 'small' and 'big' data lie on a continuum rather than representing two distinct categories.
Recent advances in such technology have improved the ability to study sport performance (see Chapter 3) and the expression of expertise processes as they naturally occur (e.g., Baker & Farrow, 2015; Ericsson, Hoffman, Kozbelt, & Williams, 2018; Ward, Schraagen, Gore, & Roth, 2020). Ecological momentary assessment (EMA), which collects moment-by-moment behavioural data using electronic handheld devices, requires participants to answer questionnaires several times a day after being prompted by those devices. Although these devices can capture behavioural processes in situ, limitations such as survey fatigue, difficulty verbalizing behaviour, and response bias may influence participants' self-report survey responses (Blake, Lee, Rosa, & Sherman, 2020). Mobile sensing, using smartphones, is a more recent approach to collecting data on naturalistic behaviour (e.g., Araújo, Brymer, Brito, Withagen, & Davids, 2019). Mobile sensing unobtrusively tracks a participant's physical location, physical activity, and physiological information (Júdice et al., 2020; Ram & Diehl, 2015). Sensors with Internet connectivity can provide a continuous stream of data. Many types of sensors can be embedded in wearables to collect specific data, such as pressure sensors, accelerometers, and positional sensors (e.g., GPS – the Global Positioning System – satellite receivers, whose collected positional information is typically handled with GIS – Geographical Information System – software) (Woo, Tay, Jebb, et al., 2020). Each of these sensors can capture variables of physical behaviour, including location, posture, and movement (e.g., sitting, standing, walking, running), and physical proximity to other sensors (Chaffin et al., 2017).
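One concrete use of positional sensor data is estimating the distance an athlete covers by summing great-circle distances between consecutive GPS fixes. The following is a minimal sketch with invented coordinates, not any particular device's output format:

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two GPS fixes."""
    r = 6_371_000  # mean Earth radius in metres
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))

# Invented (lat, lon) fixes sampled during a training drill.
fixes = [
    (38.7500, -9.2300),
    (38.7502, -9.2298),
    (38.7505, -9.2295),
    (38.7506, -9.2290),
]

# Total distance covered: sum over consecutive pairs of fixes.
distance = sum(haversine_m(*a, *b) for a, b in zip(fixes, fixes[1:]))
print(f"Distance covered: {distance:.1f} m")
```

In practice raw fixes are noisy, so analysis pipelines typically filter or smooth positions before summing, and the sampling rate limits how faithfully a sprint's true path is captured.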
These technologies are ever-expanding and can capture types of behavioural data not previously accessible due to privacy and confidentiality concerns, costs, and practical limitations (Woo, Tay, Jebb, et al., 2020). A unique advantage of smartphones may be that they are highly portable, enabling researchers to cross-reference survey data with behavioural (and often social media) observations recorded and shared through smartphone sensors and plug-ins, to capture a holistic account of participants' behaviours and experiences (Harari et al., 2016; but see Fortes et al., 2019). Discussions of the reliability and validity of behavioural measurements using smartphone data have recently been offered (Harari et al., 2016; Júdice et al., 2020; Woo, Tay, Jebb, et al., 2020).
Although advances in wearables capture important aspects of a person's environment, they provide no visual sense of what the person encountered during their actions or of how a setting may have visually changed. A wearable camera can provide raw information about an individual's visual environment and can fill this gap in the collection of naturalistic behaviour (Omodei & McLennan, 1994). Another possibility, for performances in limited spaces, is video recording of single-person performance (e.g., golf) or of interpersonal interactions, such as those between athletes, teams, and a performance environment. In the 1980s and 1990s, these video recordings were often coded and summarized; more recent studies have implemented more intensive data collection strategies to examine group behaviours in sport (Araújo & Davids, 2016; Rein & Memmert, 2016). Video data are also available through public cameras, which usually have insufficient image resolution for fine-grained, individual-level behavioural analyses. However, they can still be used to capture some essential observations and meaningful patterns of individual, interpersonal, and group behaviours in public spaces, such as physical activity in public locations or crowd behaviours at sport events (Woo, Tay, Jebb, et al., 2020), and these methods are rapidly improving to become a valid and useful approach (Adolph, 2016). Moreover, the metric properties of video recordings require careful a priori decisions on the appropriate unit of analysis for assessing the constructs of interest, and on boundaries around the time frame and which aspects of the image are included or excluded (e.g., Sanchez-Algarra & Anguera, 2013).

Validity and Reliability of Big Data Measurements

Big data sources raise practical and ethical challenges, such as privacy, data security and storage, data sharing and validity, and replicability issues. Adjerid and Kelley (2018) alerted us to the fact that measurement quality needs to be improved for rigorous scientific work with big data. It is important to note that 'more' data does not necessarily improve the quality of measurement in research. Woo, Tay, Jebb, and colleagues (2020) discussed three ways in which the validity of big data measurements can be evaluated:
  1. Response processes (i.e., the congruence between the construct and the nature of response engaged in by participants). This is an area where close collaboration between computer scientists (with skills to analyse sensor data) and sport scientists (with theories and understandings of human performance and expertise) will be useful for making sense of what information is valuable, where technology has to go to create valuable information, and what observed patterns actually mean.
  2. Internal (factorial) structure. Studies with sensors have typically used single indicators, or the indicators simply provide an operational measure (Chaffin et al., 2017). In these cases, metrics about the internal structure are not relevant. However, combining multiple indicators can capture more complex constructs (Woo, Tay, Jebb et al., 2020). In these cases, exploratory or confirmatory factor models can be specified and tested for goodness of fit. It is also important to consider whether indicators are reflective of underlying constructs or formative (Edwards & Bagozzi, 2000). Given that sensor data provide multivariate time series, longitudinal or dynamic factor models have to be developed and implemented to validate the use of indicators over time, at time scales that are both manageable (in terms of handling and analysing the data) and appropriate for capturing the construct under consideration (Davids et al., 2014).
  3. Nomological net relations to other variables. For a given construct of interest, it is important to examine both convergent and divergent associations across a range of potentially relevant constructs (Woo, Tay, Jebb et al., 2020).
Reliability estimates can be calculated at various stages of the data analysis process. If measurements are based on observers' notation (e.g., registering the number of passes in a team ball game), reliability can be calculated using typical interobserver reliability statistics (e.g., kappa, intraclass correlation coefficient). For data-driven features (e.g., types of passes in a team ball game), sets of data can be randomly sampled and the extent to which results replicate (e.g., the same words are classified into the same category) can be assessed. Alternatively, continuous data can be split across several time points, and correlations across time sets indicate cross-time consistency (Woo, Tay, Jebb, et al., 2020). Chaffin et al. (2017) recently examined the reliability of wearable sensors for capturing behavioural variables. They discriminated between two sources of measurement error: sensitivity differences between sensors (e.g., sensor A may be slightly more sensitive than sensor B) and differences within the same sensor across time. Their study suggests that researchers have to be careful in making inferences about differences between individuals, because these differences may arise from differences in the sensors themselves.
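For the notation-based case described above, interobserver agreement can be computed directly. A minimal sketch of Cohen's kappa for two observers coding the same ten events (the codings are invented for illustration):

```python
from collections import Counter

def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(coder_a)
    # Proportion of events on which the two coders agree.
    p_observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Agreement expected by chance, from each coder's category frequencies.
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    categories = set(coder_a) | set(coder_b)
    p_chance = sum((freq_a[c] / n) * (freq_b[c] / n) for c in categories)
    return (p_observed - p_chance) / (1 - p_chance)

# Two observers coding the same ten events in a team ball game.
obs1 = ["pass", "pass", "shot", "pass", "shot",
        "pass", "pass", "shot", "pass", "pass"]
obs2 = ["pass", "pass", "shot", "pass", "pass",
        "pass", "pass", "shot", "shot", "pass"]

print(f"kappa = {cohens_kappa(obs1, obs2):.3f}")
```

Kappa corrects raw agreement for agreement expected by chance: the observers here agree on 8 of 10 events, but kappa comes out noticeably below 0.8 because both code 'pass' most of the time, so much of that agreement could occur by chance.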

Grasping Big Data with Visual Analytics

Big data analytics leaves out relevant contextual information when statistically modelling a world that is complex, multidimensional, and intricate. Decision-makers contextualize analytical results within the broader context of society, and may even go against the analytical results of software to seek more favourable broader impacts when considering aspects that are not reflected in the analysis (Karimzadeh, Zhao, Wang, Snyder, & Ebert, 2020). Given the abundance of statistically significant relationships in big datasets, the meaningful relationships must be identified by the analyst. Visual analytics provides interactive access, organization, detail on demand, and contextual information to help decision-makers find relevant patterns. It enhances computational algorithms by incorporating humans' extensive experience and domain knowledge that cannot be collected in the data (Karimzadeh et al., 2020). The contextual framework that lets humans identify what is relevant or meaningful for a given domain is based on cultural and social background practices.
Graphic displays of data, when properly designed, are efficient at communicating information and can make accessible valuable information that is not otherwise evident in big data. Data visualization is a set of methods for displaying data graphically, quickly, and accurately, in a way that is easy to apprehend, and it has two main functions: exploration and explanation (Sinar, 2015). The exploratory function of data visualization can help decision-makers identify underlying relationships hidden in raw data. The explanatory function can help them compose, analyse, and investigate research questions (Song, Liu, Tang, & Long, 2020). The analytical understanding of data is the basis of data visualization, but equally important is the perceived aesthetics of data visualizations. Advancements in both hardware and software have enabled computers to store and analyse massive amounts of data that often have high dimensionality. Some complex algorithms, such as machine learning models, are designed to reduce massive amounts of complex data to manageable sizes and dimensions, and to predict future states. However, the complexity and, at times, lack of transparency of these algorithms mean that humans may be unable to understand and trust the results (Burrell, 2016). Visual analytics combines the experience, contextual information, and expertise of the human user with the power of human-guided computational analysis, which, in turn, enhances the human decision-making process. Visual analytics incorporates principles of design and cognitive science to identify appropriate visual analogies for data or analytical results, with a strong emphasis on creating perceptually effective representations for each analytical task (Karimzadeh et al., 2020).
Big data visualization techniques are not inherently different from small data visualization techniques. Capitalizing on what is traditionally used with small data, it is common to remodel familiar visualizations and integrate them within broader visual analytics systems with more interactivity and linked views (Robinson, 2011). For instance, the same visualizations used for summarization (e.g., bar graphs, pie charts, line graphs) are used for big data, but with more interactive features that can render additional elements and details on demand. Innovative data visualizations are more common with unstructured, novel data sources. Visual analytics enables the integration of various computational models with interactive user interfaces, generating simulation results that facilitate testing various what-if scenarios for decision-making.
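The overview-plus-details-on-demand pattern described here can be sketched in a few lines (a toy aggregation in plain Python, not any particular visual analytics toolkit): a long sensor stream is reduced to per-window summaries for an overview chart, while the raw samples behind any window remain retrievable when the user drills down.

```python
from statistics import mean

# Toy sensor stream: (seconds_elapsed, heart_rate) samples, one per second.
stream = [(t, 120 + (t % 60) / 2) for t in range(0, 600)]

def summarize(samples, window_s=60):
    """Reduce raw samples to per-window means for an overview chart."""
    windows = {}
    for t, value in samples:
        windows.setdefault(t // window_s, []).append(value)
    return {w: mean(vals) for w, vals in sorted(windows.items())}

def detail(samples, window, window_s=60):
    """Details on demand: fetch the raw samples behind one window."""
    return [(t, v) for t, v in samples if t // window_s == window]

overview = summarize(stream)    # 10 summary points instead of 600 samples
raw = detail(stream, window=3)  # the 60 raw samples behind window 3
print(len(overview), len(raw))
```

An interactive front end would plot `overview` and call `detail` when the user selects a window; the point is that summarization and drill-down are separate queries over the same underlying data.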
