Part I
Setting the scene for randomized controlled trials in education
Part I sets the context for considering RCTs in education, locating them within the evidence-based movement and the 'what works' agenda that has been sweeping across the world for over two decades, i.e. sufficient time for its wrinkles, conditions and problems to be identified and addressed. The argument presented is that, when considering the value of RCTs in education, we should rid ourselves of the belief that 'what works' is straightforward and unproblematic. Rather, the opposite is the case. 'What works' is not only a matter of empirical demonstration; it is a deliberative, value-laden, value-rich, value-saturated matter, and it is unlikely that all parties involved will agree as to whether something 'works' or does not 'work'. Indeed, the very definitions of 'works' and 'evidence' are contested and unclear.
Chapter 1 opens up the field by raising an initial set of questions in considering what constitutes acceptable evidence and what constitutes 'works' in the 'what works' debate. These indicate that 'what works' contains important sub-questions and sub-issues, and that, even if satisfactory answers can be provided to these questions, this is still insufficient, as predictability, transferability, generalizability and trustworthiness remain questionable.
Chapter 2 identifies definitional problems in 'what works': what is 'evidence', what exactly is the 'what' in 'what works', how do we 'know' or judge whether something 'works' or does not 'work', and what constitutes acceptable evidence. The chapter argues that evidence is not neutral; rather, it is that which is brought forward to make a case beyond a reasonable doubt. The chapter draws on legal analogies in presenting evidence, and sets out demanding criteria that can be applied to 'evidence' in education. This, it is argued, enables rigour to be demonstrated in judging 'what works' and what the appropriate indicators of it are. In turn, this raises challenges for educators trying to disentangle 'what works' in a complex, multivariate, multidimensional world, and to assess and evaluate 'what works' in a way that is faithful to such complexity. The problem is compounded because almost anything has been shown by research evidence to 'work', so the user of 'evidence' has to be able to discriminate between different qualities in, and uses of, research findings. This returns to issues raised in Chapter 1: the need to address many questions in deciding whether something 'works', as there is no single, universal yardstick of objective measurement. Whether something 'works' depends on who is judging, on whose evidence, in what circumstances and conditions, and so on; it is a human activity, not a mechanistic formula. Whilst the power and integrity of evidence are important, the chapter opens up a vast array of questions and concerns about a wide range of topics and elements concerning research evidence.
Chapter 3 sets the context for discussions of causality in RCTs and, in doing so, makes a case for a much more nuanced, complex, cautious and less naïve approach to understanding causality than RCTs provide. To the argument that the strength of RCTs lies precisely in their demonstration of causality, other factors having been controlled out or simply overridden, the chapter responds that this claim is not as simple as proponents of RCTs would have it, and that causality as espoused in RCTs is far more problematic and uncertain, not least in mistaking as 'noise' what is, in fact, part of the 'signal'.
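To make the 'noise' versus 'signal' point concrete, consider a deliberately simple, hypothetical two-arm trial in which an intervention helps one subgroup of pupils and harms another. The Python sketch below, with all numbers invented purely for illustration, shows how the headline average treatment effect can sit near zero while systematic subgroup effects, which a pooled analysis treats as mere error variance, do the real work:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 1000  # pupils per arm; every figure here is invented for illustration

# Suppose half of all pupils belong to a subgroup that gains about
# 5 points from the intervention, while the other half loses about 5.
subgroup_t = rng.integers(0, 2, size=n)  # subgroup labels, treated arm
subgroup_c = rng.integers(0, 2, size=n)  # subgroup labels, control arm

control = rng.normal(50, 10, size=n)  # baseline test scores
treated = rng.normal(50, 10, size=n) + np.where(subgroup_t == 1, 5.0, -5.0)

# The headline RCT estimate: an average treatment effect close to zero.
print(f"Average treatment effect: {treated.mean() - control.mean():+.2f}")

# What the average conceals: systematic, opposite effects by subgroup.
# An analysis that ignores the subgroup writes this structure off as 'noise'.
for g in (0, 1):
    diff = treated[subgroup_t == g].mean() - control[subgroup_c == g].mean()
    print(f"Effect for subgroup {g}: {diff:+.2f}")
```

The sketch is not an argument against RCTs as such; it simply illustrates the chapter's point that what a pooled analysis labels 'noise' may be structured variation that matters for judging 'what works', and for whom.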
Having set the scene for considerations of 'evidence' and its limitations, 'what works' and questions against its apparent simplicity, and the importance of attending to causality, Part II moves specifically to RCTs in education.
Chapter 1
Questioning evidence of 'what works' in educational research
Introduction and overview
This chapter argues that:
- Findings concerning 'what works' are equivocal rather than certain, often lacking predictability and generalizability.
- RCTs are only one of a vast range of types, methodologies and methods of research in education, and to elevate them above the others is misconceived when one applies the 'fitness for purpose' criterion.
- Defining 'evidence', 'what works', what the 'what' in 'what works' is, and what 'works' means is open to very different interpretations.
- Understanding these different interpretations raises many questions that call out all-too-easy definitions and answers.
- A sober, critical, sceptical view of 'evidence' and 'what works' is a caution against over-simplistic assertions of what research shows and what can be taken from research in education.
This lays the ground for interrogating the appeal of RCTs in Part II.
The limits of what 'research shows'
'The research said that this would work. I tried it. It didn't.'
It would be hard to justify not having evidence inform what we do in education. We may think that what we do is the best way, but relying on intuition and experience may be insufficient; we may be recycling poor practice whilst earnestly believing that it is good practice because we have been doing it for years. As Cain (2019) remarks, reliable research is better than alternatives such as trial and error or, indeed, personal hunches (p. 10). In a climate of accountability, practices should be informed by evidence rather than by its absence. Indeed, Didau (2015), Gorard et al. (2018) and Major and Higgins (2019) note that many practices are not informed by evidence and continue in spite of evidence of their ineffectiveness and harm.
The move towards evidence-based education in judging 'what works' appears unstoppable. However, what constitutes evidence, and evidence of 'what', is not always clear. There is no 'one size fits all' in considering what kind of evidence is important, nor how it is gained and used. It may come from a survey, a test, an observation, an RCT, from the views and wisdom of acknowledged experts, and so on. Fitness for purpose is paramount, and evidence must be actionable and useful. This book, whilst applauding the moves made towards promoting evidence-informed education in principle, argues that, in the 'what works' agenda, considerable caution must be exercised in considering the nature and trustworthiness of the evidence, in making the connection between 'evidence' and 'what works', and in moving from evidence to practice. Evidence is only one element in the 'what works' agenda, and it does not provide conclusive, eternal and incontestable truths, but it helps. Research-informed decision making, practice and policy are surely better than their non-informed counterparts.
We have to give the lie to the emphasis placed on large-scale, putatively disinterested and objective 'evidence', and to the privileging of RCTs as fitting 'evidence' for 'what works', as the sole or main path to salvation in improving education. This book calls out those whose preoccupation with certain kinds of 'evidence' renders them partially sighted or blind to the benefits of other kinds of evidence in yielding truths in a complex world, and those whose all-too-easy dismissal of values-based teaching and the professional wisdom of experienced practitioners is accompanied by a reliance on 'evidence' whose basis is shaky, non-generalizable and subject to personal, selective preference. For sure, RCTs have their place, but there is no reason why they should sit alone on the throne of what is considered to be suitable evidence.
The move towards a 'what works' agenda in education, intended to be evidence-based and/or evidence-informed, has been sweeping across the world almost without hindrance. It is the new orthodoxy, not least because it serves so many agendas. It purports to have a benevolent intent, improving education and avoiding reliance on untried interventions; it furthers outcomes-based approaches, accountability and performance metrics; and it seeks to draw on 'best evidence' and research. Evidence-informed practice uses the best evidence to achieve desired goals or outcomes and, indeed, to prevent undesirable outcomes.
This immediately opens up the debate, as the term 'best' is not an empirical matter; it is a normative, moral and ethical matter, requiring judgement, statements of values and deliberation. It moves beyond pragmatics.
That education should be informed by cumulative and progressive research is surely beyond question. As in medicine, the ongoing accumulation of research evidence can make great strides in improving practice. On the other hand, evidence-based practice has received criticism for: misrepresenting the nature of the debate on what schools and educational institutions should be doing; neglecting the inclusion of values in, purposes of, and justifications for education and its decision making; narrowing the curriculum to that which is tested; making contestable assumptions about transferability; and over-simplifying what is, essentially, a highly complex, variable-dense, multi-faceted, multi-layered and multifactorial situation in classrooms and schools. Evidence-based practice has been criticized for its amoral, pragmatist approach to education, for over-simplifying educational discourses on how to improve education, for adopting too narrow, even singular, an approach to what constitutes 'evidence', for accepting all too easily what count as research findings, for overstating the generalizability of research findings, and for excluding a multitude of factors in addressing the 'what works' agenda. It has done little to close the gap between research and practice, and between research and policy making.
Whilst it would be difficult to support the view that educational practice should not be evidence-informed, and whilst 'what works' should be a worthy goal of education, it is the brand or type of 'what works' and of 'evidence' that is often considered to be important. High quality research and evidence are essential, of the essence, if we are to ensure that the often once-in-a-lifetime experience of education is maximized. But 'evidence' is a slippery term: it includes more than empirical data and Shakespeare's 'ocular proof' of observed phenomena; rather, it engages issues of worth, values, morals, purposes, justifications, opinions, judgements, contestation and questioning, emanating from many sources. It requires the status, credibility, rigour, scope, worth and applicability of 'evidence' and 'what works' to be interrogated. 'What works' is as much a matter of values and judgement as it is of empirical outcomes; it is a normative, not simply an empirical, matter (Sanderson, 2010).
Even though the agendas and time frames of researchers and policy makers often collide rather than coincide, and research often bears little direct relation or relevance to policy formation and decision making, policy should nevertheless be expertly informed. Governments and policy makers are charged with the responsibility of examining issues in depth. In an age of evidence-based everything, educationists have a right to have policy decisions informed by more than ideology. Answers to questions such as 'what evidence?', 'evidence of what?' and 'whose evidence?' are essential. The claim that evidence shows such-and-such is almost always questionable, as it is not always clear what, exactly, the evidence is 'evidence of', and this raises issues of validity and fairness. Solution-focused and strategic policy making, in all areas of educational policy making, should be informed by the best evidence available. Ask yourselves: 'is it happening?'
Best practice in education should be informed by evidence. Research evidence is a key means of updating, benchmarking and improving practice, for practitioners of all types and persuasions, not simply for a coterie of academics or like-minded educationists. High quality research should make a difference; it should open minds. As with policy making, educational practice should be informed by the best evidence available. Again, ask yourselves: 'is it happening?'
The reader wishing to find out 'what works' is all too easily swamped by materials from a range of organizations with a benevolent intent in helping educationists and practitioners in many spheres of education to use 'evidence' in promoting best practice; a worthy intention. For example: the What Works Clearinghouse; the Education Endowment Foundation together with its Teaching and Learning Toolkit; the Campbell Collaboration; the Evidence for Policy and Practice Information and Co-ordinating Centre (EPPI-Centre); the Comprehensive School Reform Quality Center; the Best Evidence Encyclopedia; the Coalition for Evidence-Based Policy; the What Works initiative and What Works Centres; the Evidence Based Education organization; the York University Centre for Reviews and Dissemination; the Alliance for Useful Evidence (Nesta); the Centre for Evidence Informed Policy and Practice; and countless systematic reviews, research syntheses and meta-analyses, some of which date back well before the advent of the 'what works' movement (e.g. the journal Review of Educational Research).
However, as Nesta (2016) remarks, evidence 'rarely speaks for itself' (p. 4); it is mediated and aggregated through a host of sources, parties and affiliations. Hence this book is cautionary. We risk all too easily slipping into simplistic, if attractive, conclusions about 'what works' and what constitutes usable 'evidence'. Rather than rushing headlong into accepting 'evidence' as indicating 'what works', it is important, for safeguarding high quality education, to address a range of questions, and the list is long, for example:
Definitions
- What does 'works' mean?
- What is the 'what' in 'what works'?
- What constitutes 'evidence'?
- Whose evidence?
- Is an opinion 'evidence' and, if so, whose opinions count?
- What is 'good evidence' for 'what works'?
- Compared with what do we judge whether something 'works'? Is it any better or worse than other 'treatments'/methods?
Validity and reliability
- Evidence of what, exactly?
- Given that 'what works' should be judged in terms of the stated purposes of a project or intervention, how sensible or possible is it to separate those purposes from the whole gamut of purposes of education, intended or not, that are served by a particular intervention?
- How to address the complexity, multi-dimensionality and multi-valency of the constituents that define 'what works'?
- How can we be sure that something works every time or most of the time?
- When is evidence enough to be deemed secure?
- Is evidence per se enough on which to base decisions about what to do?
- What to do with research that shows that something 'works' sometimes but not always?
- How secure are the findings? Have they been corroborated?
- Over how many occasions and contexts must something 'work' for it to be claimed that it 'works' (e.g. a joke works well once but dies if repeated; a student may obtain a fluke high or low score once)? When is 'enough' really enough?
- When does something 'work', and for how long must it work before it is deemed to be successful?
- How long after an intervention must something 'work'?
- When should we assess whether something 'works' (assessing over too short or too long a period can bring unreliability or invalidity)?
- What variables and factors were not included in the research, hence were not controlled (e.g. teacher enthusiasm and expertise)?
- What do the terms used in research, which are often very general in nature (e.g. 'direct instruction', 'collaborative learning', 'cognitive demand'), mean to different participants and readers?
- What significance is accorded to the concepts and practices in question, given that this varies from culture to culture and context to context ('significance' is not the same as 'presence', e.g. the 'Big Five' personality traits may be important in one culture but unimportant in another, even if they are present in both)?
- How acceptable, useful or naïve is it to reduce the dynamic interpersonal complexity of teaching and learning to a single number (e.g. an effect size; see the sketch after this list) and to invest so much in a single figure or a 'yes' or 'no' ('yes, it works'; 'no, it doesn't')?
- How valid and reliable are proxy variables for matters which are not directly observable (e.g. intelligence; understanding; learning)?
- What kinds of data and methodologies are required to understand 'what works'?
- What are the limits and possibilities of different methodologies and methods in providing useful research evidence of 'what works'?
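To make concrete the effect-size question raised in the list above, consider Cohen's d, the standardized mean difference commonly reported in 'what works' syntheses. A minimal Python sketch, using invented scores and group sizes, shows both how the single figure is computed and how little of the distributional picture it retains: a modest gain for every pupil and a sharp split into winners and losers can yield similar-looking values of d:

```python
import numpy as np

def cohens_d(treated, control):
    """Standardized mean difference using the pooled standard deviation."""
    nt, nc = len(treated), len(control)
    pooled_var = ((nt - 1) * np.var(treated, ddof=1) +
                  (nc - 1) * np.var(control, ddof=1)) / (nt + nc - 2)
    return (np.mean(treated) - np.mean(control)) / np.sqrt(pooled_var)

rng = np.random.default_rng(0)
control = rng.normal(50, 10, 500)  # invented test scores

# Intervention A: every pupil gains a little.
uniform_gain = rng.normal(54, 10, 500)

# Intervention B: half the pupils gain a lot, half fall behind.
split_gain = np.concatenate([rng.normal(62, 10, 250),
                             rng.normal(46, 10, 250)])

print(f"d, uniform gain: {cohens_d(uniform_gain, control):.2f}")
print(f"d, split gain:   {cohens_d(split_gain, control):.2f}")
# Two very different classroom realities compress into similar-looking
# single figures, which is precisely the reduction questioned above.
```

Whether both of these hypothetical interventions 'work' is exactly the kind of judgement that a single figure cannot settle.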
Judgements and conditionality
- Under what conditions does something 'work', 'not work', etc.?
- In whose terms is 'what works' being judged ('what works' in one person's eyes does not work in another person's eyes)? There is no one version or judgement of 'what works'.
- How to take account of the point that 'what works' is a matter of judgement rather than data, and that this judgement is imbued with moral, values-related and ethical co...