Statistical Methods in Psychiatry and Related Fields
eBook - ePub

Statistical Methods in Psychiatry and Related Fields

Longitudinal, Clustered, and Other Repeated Measures Data

  1. 352 pages
  2. English

About this book

Data collected in psychiatry and related fields are complex because outcomes are rarely directly observed, there are multiple correlated repeated measures within individuals, and there is natural heterogeneity in treatment responses and in other characteristics in the populations. Simple statistical methods do not work well with such data. More advanced statistical methods capture the data complexity better, but are difficult for investigators without advanced training in statistics to apply appropriately and correctly.

This book presents, at a non-technical level, several approaches for the analysis of correlated data: mixed models for continuous and categorical outcomes, nonparametric methods for repeated measures, and growth mixture models for heterogeneous trajectories over time. Separate chapters are devoted to techniques for multiple comparison correction, analysis in the presence of missing data, adjustment for covariates, assessment of mediator and moderator effects, and study design and sample size considerations. The focus is on the assumptions of each method, applicability, and interpretation rather than on technical details.

Features

  • Provides an overview of intermediate to advanced statistical methods applied to psychiatry.
  • Takes a non-technical approach with mathematical details kept to a minimum.
  • Includes lots of detailed examples from published studies in psychiatry and related fields.
  • Software programs, data sets and output are available on a supplementary website.

The intended audience is applied researchers with minimal knowledge of statistics, although the book could also benefit collaborating statisticians. The book, together with the online materials, is a valuable resource aimed at promoting the use of appropriate statistical methods for the analysis of repeated measures data.

Ralitza Gueorguieva is a Senior Research Scientist at the Department of Biostatistics, Yale School of Public Health. She has more than 20 years of experience in statistical methodology development and collaborations with psychiatrists and other researchers, and is the author of over 130 peer-reviewed publications.

Information

Publisher: CRC Press
Year: 2017
Print ISBN: 9780367830861
eBook ISBN: 9781351647564
1 Introduction
Repeatedly measured data are paramount in medicine, epidemiology, public health, psychology, sociology, and many other fields. The simplest case of such data is when a single measurement is collected repeatedly on the same individual or experimental unit. Each repeated measurement is called an observation and observations can be obtained over time, over a spatial map, or can be unordered temporally or spatially but nested (clustered) within larger experimental units.
Clustered data occur when repeated measures are not ordered and can be considered symmetrical within the larger experimental unit (cluster). For example, members of the same household can be interviewed, and in this case, their responses are repeated measures within the family. The family, rather than the individual, is the experimental unit and serves as the cluster. Observations on different individuals within the cluster are likely to be related to one another because individuals share the same environment and/or genetic predisposition. Similarly, patients may be clustered (or nested) within the same therapy group or clinic. Their treatment responses are also expected to be related because of the common influence of group or clinic, and can be considered repeated measures within the group or clinic. Several layers of clustering can be present in a data set. For example, the individual can be nested within family and the family can be nested within the neighborhood.
Longitudinal data occur when repeated measures are collected over time. In clinical trials in psychiatry and related fields, often the same rating instrument is administered to each individual at baseline, at intermediate time points, at the end of the randomized phase, and at follow-up. For example, depression severity can be measured weekly, biweekly, or monthly, in order to assess treatment effects over time. Similarly, in observational studies, the natural progression of a disease or other measures is ascertained repeatedly over time. In animal or human laboratory experiments, often responses from the same individual to different randomly ordered experimental conditions are recorded and compared.
Spatial data occur when repeated measures are spatially related. In imaging data sets, voxels are arranged in three-dimensional space, where an observed value in a particular voxel is likely related to the observed values in neighboring voxels. In functional imaging studies, brain activation maps are created, and averaged region-of-interest signals are often analyzed in order to measure and compare responses to different stimuli. In epidemiological studies, disease maps over geographical areas are created and analyzed. Methods for voxel-based data analysis of imaging studies and for geographic information systems are beyond the scope of this book, but we consider region-of-interest analyses of imaging data.
In all these situations, repeated observations within the same individual or cluster are related. Failure to take this interrelationship into account in statistical analyses can lead to flawed conclusions. In this chapter, we review some terminology relevant to repeated measures data, such as mean response and measures of variability and correlation, present types of studies with longitudinal and clustered data, discuss advantages and challenges of collecting and analyzing repeatedly measured data, describe data sets that are used for illustration throughout the book, and provide a brief historical overview of approaches for the analysis of correlated data. We focus on continuous (quantitative, dimensional) measures. Later chapters deal specifically with categorical measures, which can be dichotomous, ordinal with few ordered levels, or nominal (unordered). Some statistical terminology and basic notation are presented in Section 1.7 and can be skipped by readers who are confident in their knowledge of basic statistical concepts. Statistical Analysis System (SAS) code for the graphs in this chapter and for all models considered in further chapters, together with actual output and available data sets, is available on the book website.
1.1 Aspects of Repeated Measures Data
1.1.1 Average (Mean) Response
The goals of many studies with repeatedly measured data are to estimate the average response in a population of interest and to see whether it changes significantly as a result of treatment, exposure, covariates, and/or time. Herein, response is used in the sense of an outcome (outcome variable, dependent variable) that measures the main characteristic in the population of interest. Population is the target group of individuals to whom statistical inference should be generalizable and from which the study sample is obtained. For example, in depression studies, the response can be depression severity measured by a standard depression rating scale, such as the Hamilton Depression Rating Scale (HDRS), or a dichotomous measure of improvement defined as a decrease of at least 50% from baseline on the HDRS, and the population can be all individuals with major depression. In substance abuse studies, the response can be the percentage of days without substance use in a particular time period, and the population can be all individuals with alcohol dependence. In functional imaging studies, the response can be activation change in a brain region and the population can be all individuals, healthy or otherwise. The sample should be randomly obtained from the population if it is to be representative of the population of interest.
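As an illustration of the dichotomous improvement measure described above, the following Python sketch classifies individuals as improved when their score drops by at least 50% from baseline. The function names, the threshold argument, and the scores are hypothetical and chosen only for illustration; they are not taken from the book's data sets or its SAS programs.

```python
def percent_decrease(baseline, endpoint):
    """Percent decrease from baseline; positive values mean improvement."""
    if baseline <= 0:
        raise ValueError("baseline score must be positive")
    return 100.0 * (baseline - endpoint) / baseline


def is_responder(baseline, endpoint, threshold=50.0):
    """True when the decrease from baseline is at least `threshold` percent."""
    return percent_decrease(baseline, endpoint) >= threshold


# Hypothetical (baseline, endpoint) totals on a depression rating scale.
scores = [(24, 10), (20, 12), (30, 15)]

# One dichotomous response per individual.
responders = [is_responder(b, e) for b, e in scores]
```

The continuous response (the score itself) and the dichotomous response (improved or not) come from the same measurements, but they generally call for different models, which is why categorical outcomes are treated separately from continuous ones.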
Average response refers to the mean of the individual responses in the sample or the population. In the simplest case of a single random sample from a population without repeated measures, the sample average response is just the arithmetic mean of all response values for the individuals in the sample (see Section 1.7 for the exact formula). The population-average response is the mean response of all individuals in the population and, since it is usually not possible to measure, we use the sample mean to make inferences about the population mean. In longitudinal or clustered data, response is measured repeatedly within the individual over time or within the cluster, and the average response is usually a sequence or collection of numbers that correspond to each repeated measurement occasion. For example, the average response in a depression clinical trial that takes 8 weeks may be a sequence of eight averages of the individual responses (one for each week of the study). The average response in an imaging study may be a collection of several average responses, each corresponding to a different brain region.
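The idea that the average response in a longitudinal study is a sequence of means, one per occasion, can be sketched in a few lines of Python. The three individuals and their weekly scores below are invented for illustration and assume complete data (no missed visits); the variable names are hypothetical.

```python
from statistics import fmean

# Hypothetical weekly severity scores for three individuals over 8 weeks.
weekly_scores = {
    "s1": [24, 22, 20, 18, 16, 14, 12, 10],
    "s2": [20, 20, 18, 17, 15, 15, 13, 12],
    "s3": [28, 24, 22, 19, 17, 13, 11, 8],
}

# zip(*...) regroups the data by week, so each tuple holds every
# individual's score at one measurement occasion.
weekly_means = [fmean(week) for week in zip(*weekly_scores.values())]
```

Here `weekly_means` has eight entries, one sample mean for each week of the study, rather than a single number.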
Average response usually depends on a number of predictors. In clinical trials, we always have treatment as the main predictor of interest while participant characteristics such as age, gender, and disease severity are additional predictors that can also affect the response. Such additional predictors are usually called covariates. In observational studies, we might be interested in the effect of exposures, such as smoking or drinking, on the response. In imaging studies, we may want to measure brain activation while individuals perform different tasks. In all these situations, estimation of the average response and how it depends on different predictors is of primary interest.
1.1.2 Variance and Correlation
The variability and interdependence of repeated measures within the individual or cluster are usually of secondary interest, although there are situations where they may be of equal or even higher interest than the estimation of the average response. For example, in clinical trials, the main goal may be to test whether an experimental treatment is on average better than a standard treatment or a placebo in terms of improvement in response over time. The variability in the responses of individuals needs to be taken into account but it is usually not of primary interest. However, it is possible that the experimental treatment may have a very similar average response to the standard treatment, but inter-individual variability in response may be lower (i.e., individuals may respond to treatment more consistently and similarly to one another). In this situation, the new treatment may be preferable and estimation of the variability of response is of interest too.
Variability of observations around the mean from a simple random sample is described by the variance or standard deviation of the observations (see Section 1.7). The sample standard deviation is often preferable as it provides a measure of variability that is evaluated in the same units as the mean. In repeated measures situations with longitudinal data, often the variability of the response at one particular time point differs from the variability at another, in which case it makes sense to estimate separate variances in order to assess data spread at individual time points. However, in some situations it may be reasonable to assume that the variances on all repeated occasions are the same. In this case, a better statistical estimate of the common variance can be obtained by pooling information from all occasions. Examples of both scenarios are considered in Chapter 2.
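The two scenarios above (separate variances per occasion versus one common variance) can be sketched as follows. The data are hypothetical, and the pooling rule used here, a weighted average of the per-occasion sample variances with degrees of freedom as weights, is one common choice assumed for illustration; it is not necessarily the estimator used later in the book.

```python
from statistics import variance

# Hypothetical responses grouped by measurement occasion (illustrative only).
responses_by_week = {
    1: [10, 12, 14],
    2: [8, 12, 16],
    3: [9, 12, 15],
}

# Separate sample variance at each occasion (n - 1 in the denominator).
variance_by_week = {
    week: variance(values) for week, values in responses_by_week.items()
}


def pooled_variance(groups):
    """Average the per-group sample variances, weighted by degrees of freedom."""
    groups = list(groups)
    weighted_sum = sum((len(g) - 1) * variance(g) for g in groups)
    total_df = sum(len(g) - 1 for g in groups)
    return weighted_sum / total_df


# Single estimate, appropriate only if a common variance is a reasonable assumption.
common_variance = pooled_variance(responses_by_week.values())
```

In this toy example the per-occasion variances differ noticeably, so in practice one would question the common-variance assumption before relying on the pooled value.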
Repeated measures within individuals or clusters are often correlated. Correlation reflects the degree of linear dependence between two variables and varies between –1 and 1. It is important to emphasize that the definition includes the word “linear.” Two variables may be perfectly related in a curvilinear fashion and have a correlation of zero. Correlation values of 1 or –1 correspond to perfect linear dependence between two variables. In these cases, knowing the values of one of the variables exactly predicts the values of the other variable, but does not imply that the two variables take the same value. Correlations are positive when larger values on one of the variables correspond to larger values on the other variable. Correlations are negative when larger values on one of the variables correspond to smaller values on the other variable. Please note that the proper statistical term for the latter case is “negative cor...
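A small Python sketch can make the role of the word "linear" concrete. It computes the Pearson correlation from its definition for three invented pairs of variables: one perfectly linear with positive slope, one perfectly linear with negative slope, and one perfectly related through a curve. The function name and the data are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation computed directly from its definition."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


x = [-2, -1, 0, 1, 2]
increasing = [2 * v + 3 for v in x]   # exact linear function of x, positive slope
decreasing = [10 - 3 * v for v in x]  # exact linear function of x, negative slope
curved = [v ** 2 for v in x]          # exact function of x, but not linear

r_increasing = pearson(x, increasing)
r_decreasing = pearson(x, decreasing)
r_curved = pearson(x, curved)
```

The first two correlations are 1 and -1 even though the paired values are never equal, and the third is 0 even though `curved` is completely determined by `x`.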

Table of contents

  1. Cover
  2. Half-Title
  3. Series
  4. Title
  5. Copyright
  6. Dedication
  7. Contents
  8. Preface
  9. 1 Introduction
  10. 2 Traditional Methods for Analysis of Longitudinal and Clustered Data
  11. 3 Linear Mixed Models for Longitudinal and Clustered Data
  12. 4 Linear Models for Non-Normal Outcomes
  13. 5 Non-Parametric Methods for the Analysis of Repeatedly Measured Data
  14. 6 Post Hoc Analysis and Adjustments for Multiple Comparisons
  15. 7 Handling of Missing Data and Dropout in Longitudinal Studies
  16. 8 Controlling for Covariates in Studies with Repeated Measures
  17. 9 Assessment of Moderator and Mediator Effects
  18. 10 Mixture Models for Trajectory Analyses
  19. 11 Study Design and Sample Size Calculations
  20. 12 Summary and Further Readings
  21. References
  22. Index