1 Introduction to quantitative longitudinal data
Tempus et maris aestus neminem expectant (Time and tide wait for no man)
The earliest known record is possibly from St. Marher, 1225
Introduction
The subject of this book is quantitative longitudinal data analysis, and it focuses on data that are collected in large-scale social surveys. The most general definition of longitudinal social science data is any data that have a temporal (i.e. time) dimension. A rudimentary distinction is routinely made between cross-sectional designs, where data are collected at only one point in time, and longitudinal designs, where information is collected from the same units on multiple occasions. Temporal data can, however, be collected in cross-sectional designs, and it is misleading to assume that longitudinal analyses can never be undertaken with cross-sectional data.
Longitudinal social science data are critical to understanding social change over time. They are also critical to understanding social stability, but this is often overlooked. The analysis of quantitative longitudinal data has made a number of important intellectual contributions. Here are three examples. First, the analysis of longitudinal data on British children revealed a link between smoking during pregnancy and subsequent child development (see Butler and Goldstein, 1973). Second, the Whitehall Study revealed that inequalities in health were not limited to the health consequences of poverty and demonstrated that occupational hierarchy was intimately related to health and life chances (Marmot and Brunner, 2005; Marmot, 2004). Third, in the early 1990s much of the analysis of income in Britain focused on poverty and social inequality. Inequality and poverty rates then flattened off, and it appeared that there was little or no change in the income distribution from one year to the next. The analysis of longitudinal household data revealed that underneath the apparent stability there was a hidden flux. Household incomes fluctuated between one year and the next, and there was substantial turnover in the membership of the low-income population. This was already well known in countries like the United States and Germany, which had longitudinal household surveys. These new findings made a contribution to the understanding of poverty and to the development of new economic theory. They also influenced the Labour government’s welfare reforms. The concept that household poverty is dynamic now influences the way living standards are measured and monitored in Britain (see Jenkins, 2000).
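The third example can be made concrete with a small sketch. The household identifiers, incomes and poverty line below are invented purely for illustration and are not drawn from any survey. The sketch assumes a two-wave panel in which the same households are observed in both years, and it shows how an unchanged poverty rate can conceal turnover in the low-income population.

```python
# Hypothetical two-wave panel: household id -> (income in year 1, income in year 2).
# All identifiers and figures are invented for illustration.
panel = {
    "h1": (90, 160),
    "h2": (100, 95),
    "h3": (150, 85),
    "h4": (200, 210),
    "h5": (250, 240),
    "h6": (300, 310),
}
POVERTY_LINE = 120  # assumed fixed threshold, in the same units as income


def poverty_rate(incomes):
    """Share of households whose income is below the poverty line."""
    incomes = list(incomes)
    return sum(income < POVERTY_LINE for income in incomes) / len(incomes)


# Cross-sectional view: one rate per year, with no linkage between years.
rate_year1 = poverty_rate(y1 for y1, _ in panel.values())
rate_year2 = poverty_rate(y2 for _, y2 in panel.values())

# Longitudinal view: follow each household from year 1 to year 2.
exited = sorted(h for h, (y1, y2) in panel.items() if y1 < POVERTY_LINE <= y2)
entered = sorted(h for h, (y1, y2) in panel.items() if y2 < POVERTY_LINE <= y1)
stayed = sorted(
    h for h, (y1, y2) in panel.items() if y1 < POVERTY_LINE and y2 < POVERTY_LINE
)

print(rate_year1, rate_year2)   # the two annual rates are identical
print(exited, entered, stayed)  # yet membership of the poor group has changed
```

In this invented example the two annual rates are the same, while one household leaves poverty, one enters it and one remains poor. Only the linked (panel) view makes the entries and exits visible.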
Many social science research questions can be adequately answered using cross-sectional data. Most social science research projects can be improved by incorporating suitable longitudinal data. Some social science research questions can only be sensibly answered using longitudinal data. The following chapter introduces concepts associated with analysing large-scale quantitative social science data and highlights the rich variety of data resources that are available to researchers. In this book we focus upon observational social survey designs, but most of the issues can be directly extended to other forms of data, for example administrative data or experimental designs.
Cross-sectional social surveys
Cross-sectional social surveys are often overlooked as a source of longitudinal information. The majority of the data in a cross-sectional social survey will relate to only one point in time, but such surveys sometimes collect a small amount of information that has a temporal dimension. For example, a cross-sectional survey might collect information on a respondent’s current occupation and also on their father’s occupation when they themselves were aged 14. While the information is collected at only one point in time, the measures still contain a temporal dimension and can inform analyses of social change (i.e. intergenerational occupational mobility). Similarly, a respondent in a cross-sectional survey might be asked about their current job and also about their first job. These data could be used to analyse intragenerational occupational change. The point here is that while the survey is cross-sectional, the measures collected contain some temporal information.
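As a minimal sketch of how such retrospective measures might be used, the following cross-tabulates father's occupation against the respondent's current occupation. The records and the two-category occupational scheme ("manual" and "non-manual") are hypothetical simplifications introduced only for this illustration.

```python
from collections import Counter

# Hypothetical cross-sectional records, one per respondent:
# (father's occupation when the respondent was aged 14, respondent's current occupation).
respondents = [
    ("manual", "manual"),
    ("manual", "non-manual"),
    ("manual", "non-manual"),
    ("non-manual", "non-manual"),
    ("non-manual", "manual"),
    ("non-manual", "non-manual"),
]

# Mobility table: counts for each (origin, destination) combination.
mobility_table = Counter(respondents)

# Total mobility rate: share of respondents whose category differs from their father's.
mobile = sum(
    count
    for (origin, destination), count in mobility_table.items()
    if origin != destination
)
total_mobility_rate = mobile / len(respondents)

for (origin, destination), count in sorted(mobility_table.items()):
    print(f"{origin:>10} -> {destination:<10} {count}")
print(total_mobility_rate)
```

Although every record here is collected at a single interview, the table compares two points in time, which is what allows a simple measure of intergenerational mobility to be calculated.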
Many cross-sectional surveys are carried out on multiple occasions. These surveys are not based on repeated contacts with the same individuals (or households), but they offer the possibility of comparing similar data for different points in time. Combining data from repeated cross-sectional surveys is a particularly effective way to study trends over time. There are numerous large-scale repeated cross-sectional surveys. A notable example is the Labour Force Survey, which has been carried out in the UK since 1979 (and for which a similar survey is undertaken in many other countries). Further examples are the General Household Survey, which commenced in the UK in 1971, and the Family Expenditure Survey, which dates back to the 1950s. The US Current Population Survey is another example of a repeated cross-sectional survey.
For many social science analyses, cross-sectional surveys provide highly appropriate temporal data on social patterns. Repeated cross-sectional surveys offer opportunities to analyse macro-level trends, sometimes extending over lengthy periods of time. They do not, however, offer any insights into micro-level (i.e. individual-level) changes over time. This is because cross-sectional surveys generally do not include information that links individuals in the surveys at different time points. We will return to the analysis of multiple cross-sectional surveys for quantitative longitudinal social science data analysis in Chapter 3.
Longitudinal social surveys with repeated contacts
Central to the collection of longitudinal social survey data is the concept of a panel. A panel is a sample of respondents who are contacted and surveyed on multiple occasions. The idea of researching a panel of respondents was pioneered by sociologist Paul Lazarsfeld for opinion research in the 1930s (see Lazarsfeld and Fiske, 1938). The sample unit will usually consist of individuals, but could also consist of households, firms, farms, schools or hospitals, or any other unit of social science research interest.
A cohort study is a special type of panel study. It is principally concerned with charting the development, or progress, of a particular group from a certain point in time. A notable example is a birth cohort. This form of study tracks or charts the development of a group of babies born in a particular year and follows them through childhood and into adulthood. In Britain we are richly furnished with birth cohort datasets and we will outline the main studies later.
Not all cohort datasets track individuals from birth. For example, the Youth Cohort Study of England and Wales (YCS) monitored the behaviour and decisions of representative samples of young people aged 16 and upwards as they made the transition from compulsory education to further or higher education, or into the labour marke...