This volume is a compilation of authoritative reviews, written by internationally renowned experts, on the many uses of statistics in epidemiology and medical statistics. It is addressed to statisticians working in biomedical and epidemiological fields who use statistical and quantitative methods in their work. The use of statistics in these fields has a long and rich history, but with the explosive growth of science in general, the clinical and epidemiological sciences in particular have gone through a sea change, spawning the development of new methods and innovative adaptations of standard methods. Since the literature is highly scattered, the Editors have undertaken this humble exercise to document a representative collection of topics of broad interest to diverse users. The volume spans a cross section of standard topics oriented toward users in the current evolving field, as well as much-needed special topics of more recent origin. It was prepared with applied statisticians especially in mind, emphasizing applications-oriented methods and techniques, including references to appropriate software when relevant.
· Contributors are internationally renowned experts in their respective areas
· Addresses emerging statistical challenges in epidemiological, biomedical, and pharmaceutical research
· Methods for assessing biomarkers; analysis of competing risks
· Clinical trials, including sequential and group sequential, crossover, cluster randomized, and adaptive designs
· Structural equation modelling and longitudinal data analysis
Handbook of Statistics, Vol. 27, No. suppl (C), 2008
ISSN: 0169-7161
doi: 10.1016/S0169-7161(07)27001-4
1 Statistical Methods and Challenges in Epidemiology and Biomedical Research
Ross L. Prentice
Abstract
This chapter provides an introduction to the role, and use, of statistics in epidemiology and in biomedical research. The presentation focuses on the assessment and understanding of health-related associations in a study cohort. The principal context considered is estimation of the risk of health events in relation to individual study subject characteristics, exposures, or treatments, generically referred to as “covariates”. Descriptive models that focus on relative and absolute risks in relation to preceding covariate histories will be described, along with potential sources of bias in estimation and testing. The role, design, and conduct of randomized controlled trials will also be described in this prevention research context, as well as in therapeutic research. Some aspects of the sources and initial evaluation of ideas and concepts for preventive and therapeutic interventions will be discussed. This leads naturally to a discussion of the role and potential of biomarkers in biomedical research, for such purposes as exposure assessment, early disease diagnosis, or for the evaluation of preventive or therapeutic interventions. Recently available biomarkers, including high-dimensional genomic and proteomic markers, have potential to add much knowledge about disease processes and to add specificity to intervention development and evaluation. These data sources are attended by many interesting statistical design and analysis challenges. A brief discussion of ongoing analytic and explanatory analyses in the Women’s Health Initiative concludes the presentation.
1 Introduction
The topic of this chapter is too broad to allow an in-depth coverage of its many important aspects. The goal, rather, will be to provide an introduction to some specific topics, many of which will be covered in later chapters, while attempting to provide a unifying framework to motivate statistical issues that arise in biomedical research, and to motivate some of the models and methods used to address these issues.
Much of epidemiology, and biomedical research more generally, involves following a set of study “subjects”, often referred to as the study cohort. Much valuable basic biological research involves the study of lower life forms. Such studies are often attended by substantial homogeneity among study subjects, and relatively short life spans. Here, instead, the presentation will focus on a cohort of humans, in spite of the attendant greater heterogeneity and statistical challenges. For research purposes the individuals in a cohort are of interest through their ability to yield health-related information pertinent to a larger population. Such a larger population may, for example, include persons residing in the geographic areas from which cohort members are drawn, who meet certain eligibility and exclusionary criteria. The ability to infer health-related information about the larger population involves assumptions about the representativeness of the cohort for the “target” population. This typically requires a careful characterization of the cohort so that the generalizability of study findings can be defined. The target population is often somewhat conceptual, and is usually taken to be practically infinite in size. The major long-term goal of biomedical research is to decrease the burden of premature disease morbidity and mortality, and to extend the period of time that members of target populations live without major health-related restrictions.
The principal focus of epidemiologic research is understanding the determinants of disease risk among healthy persons, with a particular interest in modifiable risk factors, such as dietary or physical activity patterns, or environmental exposures. There is a long history of epidemiologic methods development, much of which is highly statistical, whose aim is to enhance the likelihood that associations between study subject characteristics or exposures and disease risk are causal, thereby providing reliable concepts for disease prevention.
The availability of disease screening programs or services, and the health care-seeking behavior of cohort members, have potential to affect the timing of disease diagnosis. Early disease detection may allow the disease course to be interrupted or altered in a manner that is beneficial to the individual. Disease screening research has its own set of methodologic challenges, and is currently the target of intensive efforts to discover and validate early detection “biomarkers”.
Much biomedical research is directed to the study of cohorts of persons having a defined disease diagnosis, with emphasis on the characterization of prognosis and, especially, on the development of treatments that can eradicate the disease or can facilitate disease management, while avoiding undue adverse effects.
The ultimate products of biomedical research are interventions, biomarkers, or treatments that can be used to prevent, diagnose, or treat disease. Additionally, knowledge of the biology of various life forms, and the methodologic knowledge that underlies the requisite research agenda, constitute important and durable contributions from biomedical research. These developments are necessarily highly interdisciplinary, and involve a wide spectrum of disciplines. Participating scientists may include, for example, molecular geneticists studying biological processes in yeast; technologists developing ways to assess a person’s genome or proteome in a rapid and reliable fashion; population scientists studying disease-occurrence patterns in large human cohorts; and expert panels and government regulators synthesizing research developments and providing recommendations and regulations for consumption by the general population.
Statisticians and other quantitative scientists have important roles to fulfill throughout this research spectrum. Issues of study design, quality control, data analysis, and reporting are important in each biomedical research sector, and resolving methodologic challenges is crucial to progress in some areas. The biomedical research enterprise includes natural tensions, for example, basic versus applied research; in-depth mechanistic research versus testing of current concepts; and independent versus collaborative research. Statisticians can have a unifying role across related cultural research norms, through the opportunity to bring ideas and motivations from one component of this research community to another in a non-threatening manner, while simultaneously applying critical statistical thinking and methods to the research at hand.
2 Characterizing the study cohort
A general regression notation can be used to represent a set of exposures and characteristics to be ascertained in a cohort under study. Let $z(u)' = \{z_1(u), z_2(u), \ldots\}$ be a set of numerically coded variables that describe an individual’s exposures and characteristics at “time” $u$, where, to be specific, $u$ can be defined as time from selection into the cohort, and a prime ($'$) denotes vector transpose. Let $Z(t) = \{z(u), u < t\}$ denote the history of each covariate at times less than $t$. The “baseline” covariate history $Z(0)$ may include information that pertains to time periods prior to selection into the cohort.
Denote by $\lambda\{t; Z(t)\}$ the occurrence rate for a health event of interest in the targeted population at cohort follow-up time $t$, among persons having a preceding covariate history $Z(t)$. A typical cohort study goal is to assess the relationship between aspects of $Z(t)$ and the corresponding disease rate $\lambda\{t; Z(t)\}$. Doing so involves recording over time the pertinent covariate histories and health event histories for cohort members, whether the cohort is composed of healthy individuals as in an epidemiologic cohort study or disease prevention trial, or persons having a defined disease in a therapeutic context. The notation $Z(t)$ is intended to encompass evolving, time-varying covariates, but also to include more restrictive specifications in which, for example, only baseline covariate information is included.
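As a concrete reading of this notation, the following minimal Python sketch stores time-stamped covariate vectors for one cohort member and returns the history Z(t) = {z(u), u < t}. The class name `CovariateHistory`, its methods, and the `smoker` covariate are hypothetical choices made for illustration only; they do not come from the chapter.

```python
from bisect import bisect_left
from dataclasses import dataclass, field


@dataclass
class CovariateHistory:
    """Time-stamped covariate vectors z(u) for one cohort member (illustrative)."""
    times: list = field(default_factory=list)   # measurement times u, kept sorted
    values: list = field(default_factory=list)  # z(u) recorded at each time

    def record(self, u, z):
        """Store the covariate vector z observed at time u."""
        i = bisect_left(self.times, u)
        self.times.insert(i, u)
        self.values.insert(i, z)

    def history(self, t):
        """Return Z(t) = {z(u), u < t} as a list of (u, z(u)) pairs."""
        i = bisect_left(self.times, t)  # first index with times[i] >= t
        return list(zip(self.times[:i], self.values[:i]))


# Hypothetical subject: time is measured from selection into the cohort,
# so a negative time is information from before enrollment.
subject = CovariateHistory()
subject.record(-2.0, {"smoker": 1})
subject.record(0.5, {"smoker": 0})
subject.record(3.0, {"smoker": 0})

baseline = subject.history(0.0)  # Z(0): pre-enrollment information only
```

Here times are measured from selection into the cohort, so the baseline history Z(0) holds only measurements that pertain to earlier periods, and the strict inequality u < t keeps a measurement taken exactly at t out of Z(t).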
A cohort available for study will typically have features that distinguish it from the target population to which study results may be applied. For example, an epidemiologic cohort study may enroll persons who are expected to continue living in the same geographic area for some years, or who are expected to be able and willing to participate in research project activities. A therapeutic cohort may have characteristics that depend on institutional referral patterns and clinical investigator experience and expertise. Hence, absolute health event (hereafter “disease”) occurrence rates may be less pertinent and transferable to the target population than are relative rates that contrast disease rates among persons receiving different treatments or having different exposures.
The hazard ratio regression model of Cox (1972) captures this relative risk notion, without imposing further restrictions on corresponding absolute rates. It can be written
$$\lambda\{t; Z(t)\} = \lambda_0(t)\exp\{x(t)'\beta\}, \qquad (1)$$
where $x(t)' = \{x_1(t), \ldots, x_p(t)\}$ is a modeled regression $p$-vector formed from $Z(t)$ and product (interaction) terms with $t$, $\beta' = (\beta_1, \ldots, \beta_p)$ is a corresponding hazard ratio, or relative risk, parameter to be estimated, and $\lambda_0(\cdot)$ is an unrestricted “baseline” hazard function corresponding to $x(t) \equiv 0$. For example, $x(t) \equiv x_1$ may be an indicator variable for active versus placebo treatment in a prevention trial, or an indicator for test versus the standard treatment in a therapeutic trial, in which case $\exp(\beta_1)$ is the ratio of hazard rates for the test versus the control group, and there may be special interest in testing $\beta_1 = 0$ ($\exp(\beta_1) = 1$). Such a constant hazard ratio model can be relaxed, for example, to $x(t) = \{x_1, x_1 \log t\}$, in which case the “treatment” hazard ratio function becomes $\exp(\beta_1 + \beta_2 \log t) = e^{\beta_1} t^{\beta_2}$, which varies in a smooth manner with “follow-up time” $t$. Alternatively, one may define $x(t)$ to include a quantitative summary of a study subject’s prior exposure to an environmental or lifestyle factor in an epidemiologic context.
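To make the relaxed specification concrete: with x(t) = {x1, x1 log t} and x1 a treatment indicator, the treated-versus-control hazard ratio at follow-up time t is exp(β1 + β2 log t) = exp(β1) t^β2, and the constant hazard ratio model is the special case β2 = 0. The short Python sketch below evaluates this function. The function name and the coefficient values are hypothetical, chosen only to illustrate the shape of the curve and not taken from any study.

```python
import math


def hazard_ratio(t, beta1, beta2=0.0):
    """Treatment hazard ratio at follow-up time t > 0 for x(t) = (x1, x1 * log t).

    Equals exp(beta1 + beta2 * log t) = exp(beta1) * t ** beta2.
    With beta2 = 0 this is the constant hazard ratio exp(beta1).
    """
    if t <= 0:
        raise ValueError("follow-up time t must be positive")
    return math.exp(beta1 + beta2 * math.log(t))


# Hypothetical coefficients: a halving of risk at t = 1 that attenuates over time.
beta1 = math.log(0.5)
beta2 = 0.2

for t in (0.5, 1.0, 2.0, 5.0):
    print(f"t = {t:>4}: hazard ratio = {hazard_ratio(t, beta1, beta2):.3f}")
```

Because log 1 = 0, exp(β1) is the hazard ratio at t = 1 in whatever time unit is used, so the interpretation of β1 in this relaxed model depends on the choice of time scale.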
Let $T$ be the time to occurrence of a disease under study in a cohort. Typically some, and perhaps most, cohort members will not have experienced the disease at the time of data analysis. Such a cohort member yields a “censored disease event time” that is known to exceed the follow-up time for the individual. Let $Y$ be a process that takes value $Y(t) = 1$ if a subject is “at risk” (i.e., without prior censoring or disease occurrence) for a disease event at follow-up time $t$, and $Y(t) = 0$ otherwise. Then a basic independent censoring assumption requires
$$\lambda\{t; Z(t), Y(u) = 1, u < t\} = \lambda\{t; Z(t)\},$$
so that the set of individuals under active follow-up is assumed to have a disease rate that is representative for the cohort given $Z(t)$, at each follow-up time $t$. The hazard ratio parameter $\beta$ in (1) is readily estimated by maximizing the so-called partial likelihood function (Cox, 1975)
$$L(\beta) = \prod_{i=1}^{k} \frac{\exp\{x_i(t_i)'\beta\}}{\sum_{l \in R(t_i)} \exp\{x_l(t_i)'\beta\}}, \qquad (2)$$
where $t_1, \ldots, t_k$ are the distinct disease occurrence times in the cohort, $x_i(t_i)$ is the modeled covariate vector of the cohort member experiencing disease at $t_i$, and $R(t)$ denotes the set of cohort members at risk (having $Y(t) = 1$) at follow-up time $t$. Standard likelihood procedures apply to (2) for testing and estimation on $\beta$, and convenient semiparametric estimators of the cumulative baseline hazard function
$$\Lambda_0(t) = \int_0^t \lambda_0(u)\,du$$
are al...
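The partial likelihood calculation can be sketched in a few lines of Python for the simplest case: a single time-fixed covariate, no tied event times, and right censoring. At each disease occurrence time the contribution is the failing subject's exp(xβ) divided by the sum of exp(xβ) over the risk set R(t). The function names, the Newton-Raphson maximizer, and the eight-subject data set are all assumptions made for illustration; production analyses would normally rely on an established survival analysis package.

```python
import math


def log_partial_likelihood(beta, times, events, x):
    """Log partial likelihood for a scalar, time-fixed covariate (no ties)."""
    loglik = 0.0
    for i, t_i in enumerate(times):
        if not events[i]:
            continue  # censored subjects contribute only through risk sets
        # Risk set R(t_i): subjects still under follow-up at time t_i
        denom = sum(math.exp(beta * x[l])
                    for l, t_l in enumerate(times) if t_l >= t_i)
        loglik += beta * x[i] - math.log(denom)
    return loglik


def fit_beta(times, events, x, n_iter=25):
    """Maximize the log partial likelihood by Newton-Raphson, starting at 0."""
    beta = 0.0
    for _ in range(n_iter):
        score = 0.0
        info = 0.0
        for i, t_i in enumerate(times):
            if not events[i]:
                continue
            risk = [x[l] for l, t_l in enumerate(times) if t_l >= t_i]
            w = [math.exp(beta * xl) for xl in risk]
            s0 = sum(w)
            mean = sum(wl * xl for wl, xl in zip(w, risk)) / s0
            mean_sq = sum(wl * xl * xl for wl, xl in zip(w, risk)) / s0
            score += x[i] - mean          # observed minus risk-set weighted mean
            info += mean_sq - mean ** 2   # risk-set weighted variance
        if info <= 0.0:
            break
        beta += score / info
    return beta


# Hypothetical data: follow-up times, event indicators (1 = disease, 0 = censored),
# and a treatment indicator x. These values are made up for illustration.
times = [1, 2, 3, 4, 5, 6, 7, 8]
events = [1, 1, 1, 0, 1, 1, 0, 1]
x = [1, 0, 1, 1, 0, 1, 0, 0]

beta_hat = fit_beta(times, events, x)
print("estimated log hazard ratio:", round(beta_hat, 3))
print("estimated hazard ratio:", round(math.exp(beta_hat), 3))
```

The baseline hazard never appears in the calculation, which reflects the fact that (2) depends on the data only through the covariates in each risk set. This sketch omits tied event times, time-varying covariates, and variance estimation.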
Table of contents
Cover image
Title page
Table of Contents
Preface
Contributors
Chapter 1 Statistical Methods and Challenges in Epidemiology and Biomedical Research
Chapter 2 Statistical Inference for Causal Effects, With Emphasis on Applications in Epidemiology and Medical Statistics
Chapter 3 Epidemiologic Study Designs
Chapter 4 Statistical Methods for Assessing Biomarkers and Analyzing Biomarker Data
Chapter 5 Linear and Non-Linear Regression Methods in Epidemiology and Biostatistics
Chapter 6 Logistic Regression
Chapter 7 Count Response Regression Models
Chapter 8 Mixed Models
Chapter 9 Survival Analysis
Chapter 10 A Review of Statistical Analyses for Competing Risks
Chapter 11 Cluster Analysis
Chapter 12 Factor Analysis and Related Methods
Chapter 13 Structural Equation Modeling
Chapter 14 Statistical Modeling in Biomedical Research: Longitudinal Data Analysis
Chapter 15 Design and Analysis of Cross-Over Trials
Chapter 16 Sequential and Group Sequential Designs in Clinical Trials: Guidelines for Practitioners
Chapter 17 Early Phase Clinical Trials: Phases I and II
Chapter 18 Definitive Phase III and Phase IV Clinical Trials
Chapter 19 Incomplete Data in Epidemiology and Medical Statistics
Chapter 20 Meta-Analysis
Chapter 21 The Multiple Comparison Issue in Health Care Research
Chapter 22 Power: Establishing the Optimum Sample Size
Chapter 23 Statistical Learning in Medical Data Analysis
Chapter 24 Evidence Based Medicine and Medical Decision Making
Chapter 25 Estimation of Marginal Regression Models with Multiple Source Predictors
Chapter 26 Difference Equations with Public Health Applications
Chapter 27 The Bayesian Approach to Experimental Data Analysis
Subject Index
Handbook of Statistics Contents of Previous Volumes