Introduction
As people age worldwide, preserving and improving cognition in later life is becoming more urgent. Most of us share the intuition that physical exercise is a good way to ensure a healthier aging mind. After all, many of the older people who are cognitively successful are also physically active. This observation has generated a multitude of studies seeking a causal link between physical exercise and cognitive health. Indeed, the hundreds of studies that have already been conducted in older adults generally suggest a positive correlation between physical exercise and cognition (for reviews, see Angevaren, Aufdemkampe, Verhaar, Aleman, & Vanhees, 2008; Blondell, Hammersley-Mather, & Veerman, 2014; Colcombe & Kramer, 2003; Kelly et al., 2014b). We must be mindful, however, that a positive correlation between exercise and healthy cognition indicates only that they are found in the same people. Based merely on correlational evidence, we cannot know whether exercise directly improves cognitive aging or, conversely, whether healthy cognitive aging supports a more active lifestyle. Exercise and cognitive health may be inter-related in more complex ways, and third variables such as social interaction might be keeping people active and cognitively intact in aging.
Most of us prefer the idea that we can influence the course of our cognitive aging through physical and/or mental exercise over the thought of being subject to the uncontrollable fate of cognitive decline. Scientists often share this perspective, leading them to frequently adopt the optimistic attitude that age-related cognitive decline can be attenuated significantly by physical exercise. However, a closer look at the existing data reveals significant methodological limitations that make it surprisingly difficult to establish a direct, causal link between physical exercise and the preservation of cognition in aging. Likewise, although cognitive exercise (i.e., “brain training”) is becoming an increasingly popular and lucrative intervention to preserve cognitive function in aging, brain-training studies have suffered from many of the same challenges as those of physical exercise.
In a previous review, we delved into this literature in detail, noting methodological issues that make it difficult to confidently infer a direct, causal relationship between physical exercise and cognition in older adults (Miller, Taler, Davidson, & Messier, 2012). Here, we summarize the evidence on the effects of physical and cognitive exercise on cognitive aging, highlight the main methodological challenges in the area, and suggest key questions to consider when undertaking research on this topic. We point out various factors that deserve serious consideration when evaluating existing reports or undertaking new research. We conclude with suggestions to help researchers and practitioners in the design, implementation, and evaluation of research on physical and cognitive exercise. Throughout, we adopt a cautious attitude about causal relationships between exercise and cognitive aging. Of course, as the field accumulates more good data, our approach in this review may prove to be overly cautious. However, as we will outline, we think that at this stage care is warranted in claiming which factors contribute to healthy cognitive aging and in recommending what people do to increase their odds of successful cognitive aging.
A general consideration: choosing the right design for the right question
The overall design of a study determines the questions it can answer. Observational studies can be either retrospective or prospective, including data from one point in time (i.e., cross-sectional) or several (i.e., longitudinal). Their advantages include the possibility of being relatively inexpensive, large, and easy to run. Moreover, observational studies can provide insights into cross-sectional and longitudinal differences in cognition and physical exercise. Observational designs do not allow us, however, to establish causality. Experimental (or intervention) designs, in contrast, attempt to systematically control a variable of interest (i.e., exercise), while reducing the influence of potentially confounding factors. In this regard, experiments are better able to uncover causal relationships. This design is no panacea, however. Interventions are not immune to confounds, are often costly (which usually leads to smaller sample sizes evaluated over a shorter period of time than ideal), and suffer from selection biases and dropout.
Considerations in designing and interpreting studies of physical exercise and cognition
What, exactly, is physical exercise?
At first glance, this question might seem pedantic, but it is both important and difficult to answer. What is the operational definition of exercise in the study? Aside from questions about the duration and intensity of activity (see below for a discussion of these), there is little agreement across the literature on exactly which activities count as “exercise.” Most researchers (and participants) would endorse swimming laps in the pool as an obvious example, but what about gardening? Housework? Sex? Each of these has been included in some previous studies but not others, which, at the very least, hinders the comparison of results across the literature. Although universal agreement on the definition of “exercise” seems unlikely, greater consistency and a common vocabulary are needed urgently (Warren et al., 2010).
Another consideration in defining exercise regards the target physiological energy systems. An implicit consensus appears to exist across the literature that cardiorespiratory-focused exercise (e.g., swimming, running) contributes the most to effects on cognition, whereas strength-focused exercise (e.g., weight training) is less effective, followed by balance, toning, and flexibility (e.g., Tai Chi, yoga), which are least effective. This distinction often leads researchers to use activities that ostensibly belong to the last of these three categories as a control for activities from one of the first two categories. Nevertheless, the evidence for the superiority of cardiovascular exercise is mixed, particularly when one looks at interventions rather than observational studies (Colcombe & Kramer, 2003; Snowden et al., 2011). This may stem from the difficulty in assigning certain physical activities to only one of these three categories. For example, many aerobic exercises (e.g., running) cannot help but also yield improvements in strength as well as in balance, toning, and flexibility. Conversely, strength training can benefit cardiovascular function, either directly (e.g., by changing arterial stiffness; Li et al., 2014) or indirectly (for example, by increasing a jogger’s core and leg muscle strength to allow her to run further and/or faster, driving her heart rate that much more). Ambiguity surrounding the specific cognitive benefits of particular kinds of exercise may also stem from the possibility that any activity that is physically, cognitively, or socially stimulating can boost cognition (Hayes, Hayes, Cadden, & Verfaellie, 2013; for more, see “What other factors must be ruled out?”, below).
How should physical exercise be measured?
After deciding which behaviours are defined as exercise, the next challenge is to measure them. Researchers usually choose between subjective and objective measures. Subjective reports (i.e., self-report) can be formalized as questionnaires, diaries, logs, and so forth. Although they are inexpensive and easy to administer, their value can be diminished by participants changing their behaviour or their report to fit what is socially desirable (i.e., impression management) or what they believe the experimenters expect or want (i.e., demand characteristics) and by memory failures and biases, which may be especially relevant if participants are asked to recollect details about exercise from long ago. The importance of memory may help explain why test–retest reliability of exercise self-reports is notoriously low (Geda et al., 2010).
Objective methods of measuring and monitoring exercise vary. Laboratory measures include pulse, blood pressure, and the volumes of oxygen and carbon dioxide inhaled and exhaled when breathing under controlled maximal physical exertion, from which we derive the maximum oxygen consumption: VO2 max. This last test is costly, takes significant time, and requires a specialized facility. Because the VO2 max test typically requires participants to reach their maximal cardiorespiratory capacity, it can be contraindicated for people with low fitness or significant health problems. As a real-world alternative, the first generation of personal devices for continuous monitoring of activity (e.g., stand-alone motion sensors, pedometers, accelerometers, and so forth) has now given way to a second generation of more accessible, easier to use, and less expensive measures. These small, discreet wearable sensors detect variables such as heart, pulse, and respiration rates, and footsteps (which, depending on the system, can be categorized into walking, running, and stair-climbing). Typically, these allow continuous collection of data using a smartphone. New devices are coming onto the market every year, and prices are falling accordingly, increasing the likelihood that such devices will be used in larger studies. Of course, these devices are useful only if participants use them properly and continuously.
The rapid pace of product development for exercise monitors means that the scientific literature is lagging behind industry by several years. For instance, in July 2016, only 100 PubMed entries existed for “FitBit,” one of the most popular wearable monitor companies; more crucially, only one entry existed for “Fitbit and cognition.” The bulk of the current literature on objective monitoring of physical activity has used the previous generation of devices, which often had problems with comfort and compliance, especially over the long term. Although these problems might be mitigated by newer technology, caution is warranted in their use: These devices can fail or deliver inaccurate data (Lee, Kim, & Welk, 2014; Sasaki et al., 2014),¹ and the data they do yield can be difficult to interpret. It is possible to estimate VO2 max from resting and maximum heart rates (Uth, Sorensen, Overgaard, & Pedersen, 2005), but this estimate must be acquired during controlled exercise intensity, which may be difficult for all participants to attain without supervision, notwithstanding the risk for older adults exercising at peak intensity (Noakes, Myburgh, & Schall, 1990). Furthermore, some older adults may be uncomfortable using these wearable technologies.
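As a rough illustration of the heart-rate-based VO2 max estimate mentioned above, the following minimal sketch multiplies the ratio of maximum to resting heart rate by a proportionality factor. The factor of about 15.3 mL/kg/min is an assumption here (it is the value commonly quoted for this method and is not stated in the text), and the function name and the example heart rates are hypothetical. The result should be read as a coarse estimate, not a substitute for laboratory testing.

```python
def estimate_vo2max(hr_max: float, hr_rest: float, factor: float = 15.3) -> float:
    """Estimate VO2 max (mL/kg/min) from maximum and resting heart rates.

    `factor` is an assumed proportionality constant (about 15.3 mL/kg/min
    is commonly quoted for this method); it is not given in the text.
    """
    if hr_rest <= 0 or hr_max <= 0:
        raise ValueError("heart rates must be positive")
    if hr_max < hr_rest:
        raise ValueError("maximum heart rate should not be below resting heart rate")
    return factor * hr_max / hr_rest


# Hypothetical participant: maximum 160 bpm, resting 64 bpm.
print(round(estimate_vo2max(160, 64), 2))  # 38.25
```

Because the estimate depends directly on the measured maximum heart rate, any failure to reach true maximal exertion lowers the result, which is one reason unsupervised measurement is problematic.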
Choices about measurement are important, because these different ways of measuring exercise are not interchangeable: Although within-subjects studies assessing subjective and objective estimates of exercise generally find positive correlations between subjective and objective measures, these correlations are usually weak (Jurca et al., 2005; Mailey et al., 2010; Moy, Scragg, McLean, & Carr, 2008; Zlatar et al., 2015).
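The agreement between subjective and objective measures is typically summarized with a correlation coefficient. The sketch below shows one way such a within-subjects comparison could be computed, using a plain Pearson correlation. The function name and the paired values (self-reported versus device-recorded weekly minutes) are hypothetical and serve only to show the calculation; they are not data from the studies cited.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences of numbers."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical weekly exercise minutes for the same five participants.
self_reported = [150, 200, 90, 300, 120]    # questionnaire (illustrative values)
device_recorded = [80, 160, 100, 140, 130]  # wearable sensor (illustrative values)
print(round(pearson_r(self_reported, device_recorded), 2))
```

A coefficient near 1 would suggest the two measures rank participants similarly; the weak correlations reported in the literature imply that they often do not.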
How much physical exercise is required for maximal effectiveness?
On this topic, several questions are intermingled: How often should exercise sessions occur, how long should sessions be, over how long a term, and at what intensity, to produce benefits for cognition in aging? Getting these questions straight is paramount: Exercise programs that are too low in intensity or too brief may fall short of showing any cognitive benefits, whereas programs that are too arduous or too long-lasting may increase drop-out, especially of participants with poorer initial physical and/or cognitive functioning.
Is more frequent exercise better? The old adage that “more is better” might fit with our intuitions, but might not actually be true (or, the answer might depend on whether by “more” you mean session schedule [how often], session length [how long], program length [over how long a term], or intensity of activity). For example, in their classic meta-analysis of exercise interventions for cognitive aging, Colcombe and Kramer (2003) found that programs that lasted more than six months were more effective, implying that more sessions are better. However, the “more is better” maxim did not apply to the length of each session: The ideal session length was between 30 and 45 minutes, with longer sessions not conferring as great a benefit to cognition.
Is more intense exercise better than less intense? Perhaps going against one’s intuitions, many interventions and longitudinal studies have suggested that moderate-intensity exercise is often as good as, and sometimes even better than, high-intensity exercise (Blondell et al., 2014; Colcombe & Kramer, 2003; Etnier et al., 1997; Gates, Singh, Sachdev, & Valenzuela, 2013; Hindin & Zelinski, 2012; Kelly et al., 2014b; Lindwall, Rennemark, & Berggren, 2008; Lindwall, Rennemark, Halling, Berglund, & Hassmen, 2007; Smith et al., 2010; Snowden et al., 2011; Sofi et al., 2011; Yaffe, Barnes, Nevitt, Lui, & Covinsky, 2001). Two points seem pertinent, however.
First, the debate over different effects of low-, moderate-, and high-intensity exercise is complicated by the fact that intensity is particularly difficult to measure and agree on. This is especially the case in longitudinal and epidemiological studies, which often rely on self-report. Some researchers have chosen to categorize different activities a priori as being high intensity (e.g., jogging, basketball) versus low intensity (e.g., walking, gardening), but such categorization risks confounding type of activity with intensity. Other researchers (e.g., Hillman et al., 2006) have relied on people reporting how often they break into a sweat as a proxy for intensity. This approach is hindered, however, by the weak relationship between sweating and physical exertion (Buono & Sjoholm, 1988). Moreover, sweating tends to decrease in aging (Foster, Ellis, Dore, Exton-Smith, & Weiner, 1976). Alternative measures of the intensity of physical activity include multiplying the estimated amount of time taking part in each activity by the amount of energy presumably expended during that activity (e.g., van Gelder et al., 2004), but in many cases these still rely on participants’ self-reports.
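The time-by-energy approach described above can be sketched as a weighted sum. This is a minimal illustration only: expressing energy cost in metabolic equivalents (METs) is one common convention and is assumed here rather than taken from the studies cited, and the MET values, activity names, and function name are hypothetical placeholders.

```python
# Assumed energy cost per activity, in metabolic equivalents (METs).
# Illustrative placeholder values, not taken from the studies cited.
ASSUMED_METS = {"walking": 3.5, "gardening": 4.0, "jogging": 7.0}


def weekly_energy_score(minutes_per_week, mets=ASSUMED_METS):
    """Sum of (reported minutes x assumed MET value) across activities."""
    total = 0.0
    for activity, minutes in minutes_per_week.items():
        if activity not in mets:
            raise KeyError(f"no energy value assumed for {activity!r}")
        if minutes < 0:
            raise ValueError("minutes cannot be negative")
        total += minutes * mets[activity]
    return total


# Hypothetical self-report: minutes spent on each activity in one week.
report = {"walking": 120, "gardening": 60, "jogging": 30}
print(weekly_energy_score(report))  # 870.0 MET-minutes
```

Both inputs carry uncertainty: the minutes come from self-report, and the energy value assigned to each activity is a population-level assumption that may not fit a given older adult.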
Second, an interesting recent development is the introduction of very high intensity exercise for very brief periods (e.g., 90% maximum heart rate for only 1 minute). Referred to as high intensity training (HIT) or high intensity interval training (HIIT), this usually occurs under the supervision of medical personnel using objective physiological exertion measures. Virtually nothing has been published on the c...