Growth Curve Analysis and Visualization Using R

Daniel Mirman

  1. 188 pages
  2. English

About This Book

Learn How to Use Growth Curve Analysis with Your Time Course Data

An increasingly prominent statistical tool in the behavioral sciences, multilevel regression offers a statistical framework for analyzing longitudinal or time course data. It also provides a way to quantify and analyze individual differences, such as developmental and neuropsychological differences, in the context of a model of the overall group effects. To harness the practical aspects of this useful tool, behavioral science researchers need a concise, accessible resource that explains how to implement these analysis methods.

Growth Curve Analysis and Visualization Using R provides a practical, easy-to-understand guide to carrying out multilevel regression/growth curve analysis (GCA) of time course or longitudinal data in the behavioral sciences, particularly cognitive science, cognitive neuroscience, and psychology. With a minimum of statistical theory and technical jargon, the author focuses on the concrete issue of applying GCA to behavioral science data and individual differences.

The book begins by discussing problems encountered when analyzing time course data, how to visualize time course data using the ggplot2 package, and how to format data for GCA and plotting. It then presents a conceptual overview of GCA and the core analysis syntax using the lme4 package and demonstrates how to plot model fits. The book describes how to deal with change over time that is not linear, how to structure random effects, how GCA and regression use categorical predictors, and how to conduct multiple simultaneous comparisons among different levels of a factor. It also compares the advantages and disadvantages of approaches to implementing logistic and quasi-logistic GCA and discusses how to use GCA to analyze individual differences as both fixed and random effects. The final chapter presents the code for all of the key examples along with samples demonstrating how to report GCA results.

Throughout the book, R code illustrates how to implement the analyses and generate the graphs. Each chapter ends with exercises to test your understanding. The example datasets, code for solutions to the exercises, and supplemental code and examples are available on the author's website.

Information

Year
2017
ISBN
9781315360331

1 Time course data

CONTENTS

1.1 Chapter overview
1.2 What are “time course data”?
1.3 Key challenges in analyzing time course data
1.3.1 Trade-off between power and resolution
1.3.2 Possibility of experimenter bias
1.3.3 Statistical thresholding
1.3.4 Individual differences
1.4 Visualizing time course data
1.5 Formatting data for analysis and plotting
1.5.1 A note on data aggregation
1.6 Chapter recap
1.7 Exercises

1.1 Chapter overview

This chapter will describe the main problems that growth curve analysis is meant to address. First, it will define a particular kind of data, called time course data or longitudinal data, which involve systematic relationships between observations at different time points. These relationships pose problems for simple traditional analysis methods like t-tests.
Section 1.3 will discuss four kinds of problems and illustrate them with concrete examples. First, using separate analyses for individual time bins or time windows creates a trade-off between power (more data in each bin) and temporal resolution (smaller time bins). Second, flexibility in selection of time bins or windows for analysis introduces experimenter bias. Third, statistical thresholding (p < 0.05 is significant but p > 0.05 is not) makes gradual change look abrupt and creates the illusion that continuous processes are discrete. Fourth, there is no clear way to quantify individual differences, which are an important source of constraints for theories in the behavioral sciences.
Section 1.4 will provide a brief introduction to ggplot2, a powerful and flexible package for graphing data in R. Section 1.5 will distinguish between wide and long data formats and describe how to use the melt function to convert data from the wide to the long format, which is the right format for growth curve analysis and for plotting with ggplot2. The rest of this book will describe growth curve analysis, a multilevel regression method that addresses the challenges discussed in this chapter, provide a guide to applying growth curve analysis to time course data, and demonstrate how to use ggplot2 to visualize time course data and growth curve model fits.
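To make the wide/long distinction concrete, here is a minimal sketch using made-up accuracy scores for three hypothetical participants. It uses base R's reshape() rather than the melt function that Section 1.5 describes, so that it runs without extra packages. The column names (Subject, Block1 to Block3, Block, Accuracy) are assumptions chosen for illustration only.

```r
# Hypothetical wide-format data: one row per participant,
# one column per training block (all values are made up).
wide <- data.frame(
  Subject = c("s1", "s2", "s3"),
  Block1  = c(0.50, 0.55, 0.45),
  Block2  = c(0.65, 0.70, 0.60),
  Block3  = c(0.80, 0.85, 0.75)
)

# Convert to long format: one row per participant per block.
# Base R stand-in for melt(); not the book's own code.
long <- reshape(
  wide,
  direction = "long",
  varying   = c("Block1", "Block2", "Block3"),
  v.names   = "Accuracy",
  timevar   = "Block",
  times     = 1:3,
  idvar     = "Subject"
)
rownames(long) <- NULL  # drop the automatic "s1.1"-style row names
print(long)
```

In the long table every row is a single observation, identified by who it came from (Subject) and when it was made (Block). With the reshape2 package installed, melt(wide, id.vars = "Subject") should produce an equivalent table, with the block labels in a column named variable and the scores in a column named value.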

1.2 What are “time course data”?

Time course data are the result of making repeated observations or measurements at multiple time points. These sorts of data are also called longitudinal or, more generally, repeated measures data. Imagine that you measured a child’s height annually from birth to 18 years old. You would have a series of 19 data points that describe how that child’s height changed over time during those 18 years. In other words, you would have the growth (height) time course for that child.
Two key properties distinguish time course data from other kinds of data. The first is that groups of observations all come from one source, which is called nested data. In the height example, the source was a particular child. If you repeated this procedure for another child, you would now have two nested series of data points corresponding to the two children in your study. The heights of two randomly selected children may be uncorrelated, but the height of a child at time t is strongly correlated with that child’s height at time t – 1. Nested observations are not independent and this non-independence needs to be taken into account during data analysis. Capturing this nested structure allows quantifying the particular pattern of correlation among data points for an individual, which can reveal potentially interesting individual differences – a taller child compared to a shorter child, whether the child had an earlier or later growth spurt, etc.
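The nesting and non-independence described above can be illustrated with simulated data. The sketch below is not from the book: it invents 50 hypothetical children, gives each one a starting length and a growth rate, and assumes linear growth purely to keep the code short. All parameter values are arbitrary assumptions.

```r
set.seed(1)  # reproducible simulated data

n_children <- 50
ages <- 0:18  # annual measurements from birth to 18: 19 time points

# Each hypothetical child gets their own starting length and growth rate.
# Linear growth is a deliberate simplification; real growth is not linear.
birth_length <- rnorm(n_children, mean = 50, sd = 2)     # cm
growth_rate  <- rnorm(n_children, mean = 6.5, sd = 0.5)  # cm per year

# Long format: one row per child per age (Age varies fastest).
heights <- expand.grid(Age = ages, Child = seq_len(n_children))
heights$Height <- birth_length[heights$Child] +
  growth_rate[heights$Child] * heights$Age +
  rnorm(nrow(heights), sd = 1)  # measurement noise

# Nested structure: every child contributes 19 observations.
obs_per_child <- table(heights$Child)

# Across children, height at age 10 closely tracks height at age 9.
h9  <- heights$Height[heights$Age == 9]
h10 <- heights$Height[heights$Age == 10]
r_adjacent <- cor(h9, h10)
print(r_adjacent)
```

Because each child's measurements share that child's starting length and growth rate, the 19 rows for a child are not independent draws, which is the property an analysis of nested data has to respect.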
In this example, the data were nested or grouped at the individual participant level. The grouping can also be at a higher level. For example, if you measured the weights of newborns at different hospitals every month for a year, you would have data grouped by hospital, rather than by individual child (each child was only weighed once, but each hospital’s newborns were weighed every month). Groupings can also be at multiple levels; for example, if you followed those children as they grew, you would have measurements grouped by child and children grouped by hospital.
The second key property of longitudinal data is that the repeated measurements are related by a continuous variable. Usually that variable is time, as in the child growth example, but it can be any continuous variable. For example, if you asked participants to name letters printed in different sizes, you could examine the outcome (letter recognition accuracy) as a function of the continuous predictor size. On the other hand, if you had presented letters from different alphabets (Latin, Cyrillic, Hebrew, etc.), that would be a categorical predictor. For categorical predictors, one can only assess whether the outcome was different between different categories (for example, if recognition of Latin letters was better or worse than recognition of Cyrillic letters). For continuous predictors, one can do that kind of simple comparison, but it is also possible to assess the shape of the change – whether the relationship between letter recognition accuracy and letter size follows a straight line, or accuracy improves rapidly for smaller sizes and then reaches a plateau, or follows a U-shape. Because time is so frequently that critical continuous variable, this book will typically refer to these sorts of data as “time course data” even though the same issues apply to other continuous predictors.
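As a rough illustration of what "assessing the shape of the change" can mean, the sketch below simulates letter recognition accuracy that rises quickly with letter size and then levels off, and compares a straight-line model with one that allows curvature. The data, the variable names, and the choice of a second-order polynomial are all assumptions made for this example. It uses ordinary regression with lm() and ignores any nesting, so it is not the growth curve analysis the book goes on to describe.

```r
set.seed(2)  # reproducible simulated data

# Hypothetical letter sizes (arbitrary units), 20 observations per size.
size <- rep(1:10, each = 20)

# Simulated accuracy: rapid improvement for small sizes, then a plateau.
# Accuracy is treated as a plain continuous outcome for simplicity.
accuracy <- 0.9 - 0.45 * exp(-0.5 * (size - 1)) +
  rnorm(length(size), sd = 0.03)
dat <- data.frame(size, accuracy)

# Straight-line model versus a model that allows curvature.
fit_linear    <- lm(accuracy ~ size, data = dat)
fit_quadratic <- lm(accuracy ~ poly(size, 2), data = dat)

# Model comparison: does adding the curvature term improve the fit?
comparison <- anova(fit_linear, fit_quadratic)
print(comparison)
```

This kind of comparison is only possible because size is continuous. If size were recoded as an unordered factor, the model could say which sizes differ but not whether the trend is straight or curved.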
As we will see, growth curve analysis (GCA) is a way to analyze nested data that takes the grouping into account and provides a way to quantify and assess the shapes of time course curves. Before getting into GCA, it will help to understand the challenges of analyzing time course data in a little more detail, that is, to understand why traditional methods like t-tests and analysis of variance (ANOVA) are not well suited to these sorts of data. To do that, the next section goes over some examples of the kinds of problems that come up when analyzing time course data.

1.3 Key challenges in analyzing time course data

How should time course data be analyzed? A simple approach is to apply traditional data analysis techniques like t-tests or ANOVAs. For example, we could independently compare conditions at each time bin or time window. This approach has a number of problems, which are easiest to demonstrate with concrete examples.
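The bin-by-bin approach can be sketched in a few lines of R. The example below simulates two hypothetical conditions measured over 10 time bins and runs a separate t-test in each bin. The group sizes, learning rates, and noise level are invented for illustration and are not taken from any study.

```r
set.seed(3)  # reproducible simulated data

n_per_group <- 15
blocks <- 1:10

# Hypothetical learning curves: accuracy climbs from about 0.5 toward 0.9,
# somewhat faster in the "high" condition. All values are simulated.
simulate_group <- function(rate) {
  # Returns a matrix with rows = participants, columns = blocks.
  t(replicate(n_per_group,
              0.9 - 0.4 * exp(-rate * (blocks - 1)) +
                rnorm(length(blocks), sd = 0.08)))
}
high <- simulate_group(rate = 0.45)
low  <- simulate_group(rate = 0.30)

# One independent t-test per block.
p_values <- sapply(blocks, function(b) t.test(high[, b], low[, b])$p.value)

# Thresholding turns a gradual difference into a yes/no verdict per block.
results <- data.frame(Block = blocks,
                      p = round(p_values, 3),
                      significant = p_values < 0.05)
print(results)
```

Each test sees only one block's data and ignores the neighbouring blocks, so the output is a list of separate verdicts rather than a description of how the two curves differ over time.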

1.3.1 Trade-off between power and resolution

The data in Figure 1.1 are based on an experiment that examined whether words with high “transitional probability” (TP) would be learned faster than words with low TP (Mirman, Magnuson, Graf Estes, & Dixon, 2008). Word learning was predicted to be faster in the high TP condition than the low TP condition. The training trials were grouped into blocks to examine the gradual learning. The data in Figure 1.1 are the word “learning curves”: the participants started out near chance (50% correct, because there are two response choices on each trial) and gradually got better, reaching about 90% correct at the end of 10 blocks of training trials. Importantly, it looks like this learning was faster for high TP words.
What kind of statistical test would provide the quantitative test of the effect of TP on word learning? Faster word learning means that participants in the High TP condition generally have higher accuracy, so we could do a t-test comparing the High and Low TP conditions on...
