The concept of evidence-based policy (EBP) and the ethical challenges it poses are explored. First a 'standard' model of EBP is outlined and the implicit assumptions it makes about the nature of evidence and the policy-making process are examined. Debates about how best to implement EBP in the criminal justice sector are investigated before various critiques of EBP are discussed. These start with methodological debates relating to the generation of empirical evidence and how to review evidence, but a set of ethical concerns soon emerges. These are not abstract, academic debates: the practical dimensions of such ethical concerns are demonstrated using offender rehabilitation as an example.
Key points
1. EBP is a well-developed concept in the UK criminal justice system.
2. In its 'traditional' form EBP gives precedence to experimental evaluation designs and the use of systematic reviews to interrogate the evidence base.
3. EBP draws heavily on the concept of evidence-based medicine but this raises questions about the transfer of ideas and methodologies to questions of social policy.
4. Ethical challenges to EBP often focus on the use of experimental research designs and whether 'scientific' approaches to social research are appropriate when addressing social issues.
5. A discussion of the relationship between evidence and practice in offender rehabilitation illustrates some of the questions that an overly 'scientific' approach to EBP raises.
What is evidence-based policy?
Defining evidence-based policy is not straightforward and definitions are contested. The UK government has defined evidence-based policy as being:
based upon the best available evidence from a wide range of sources; all key stakeholders are involved at an early stage and throughout the policy's development. All relevant evidence, including that from specialists, is available in an accessible and meaningful form to policy makers.
(Cabinet Office 1999: 73)
This approach is seen as integral to modern, professional policy making (Bullock et al. 2001). Davies suggests that it:
helps people make well informed decisions about policies, programmes and projects by putting the best available evidence from research at the heart of policy development and implementation.
(Davies 1999 quoted in Davies 2004: 4)
and contrasts it with opinion-based policy, which
relies heavily on either the selective use of evidence (e.g. on single studies irrespective of quality) or on the untested views of individuals or groups, often inspired by ideological standpoints, prejudices, or speculative conjecture.
(Davies 2004)
Straight away this presents us with a challenge because, inevitably, not all research is of the same quality and therefore we may not want to treat all research equally when using it to inform policy decisions (Davies 2004). If EBP is to be a reality it will require more systematic approaches to searching for and assessing the methodological quality of evidence so that policy-makers can achieve a balanced understanding of the research evidence and of its strengths and weaknesses (Davies 2004).
Thus, policy-makers with an interest in evidence-based decision-making have turned increasingly to systematic reviews of the results of previous inquiries in the relevant policy domain (Pawson 2002). Systematic reviews consider existing research literature on a topic based upon (Government Social Research Unit 2007a):
• comprehensive searching of print, electronic and unpublished sources;
• explicit search procedures for identifying the available literature;
• explicit criteria for distinguishing the quality of research studies; and
• transparent presentation of the available evidence, so that the quality of the evidence upon which the review is based is clear.
This systematic approach to reviewing the existing evidence is intended to overcome some of the problems inherent in traditional, 'narrative' literature reviews, which typically suffer from limitations such as 'selection bias' and 'publication bias', provide few details of the procedures by which the reviewed literature has been identified and appraised, and often leave unclear how their conclusions follow from the evidence presented (ibid.).
Systematic reviews often include a meta-analysis. This is a statistical method for combining and summarising the results of the studies in a systematic review. Once relevant studies have been identified, grouped together according to similar intervention characteristics and screened for methodological rigour, the study outcomes are extracted and effect sizes computed. Corrections are applied for small sample sizes and the mean effect of each class of intervention is calculated (Government Social Research Unit 2007a).
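To make this calculation concrete, one common approach (a sketch only; the cited guidance does not prescribe a specific estimator) uses the standardised mean difference with Hedges' small-sample correction, and then averages the corrected effects weighted by their precision:

```latex
% Standardised mean difference for study i (treatment vs control)
d_i = \frac{\bar{X}_{T,i} - \bar{X}_{C,i}}{s_{p,i}}

% Hedges' small-sample correction factor
g_i = J_i \, d_i, \qquad J_i \approx 1 - \frac{3}{4(n_{T,i} + n_{C,i} - 2) - 1}

% Precision-weighted mean effect for a class of interventions
\bar{g} = \frac{\sum_i w_i g_i}{\sum_i w_i}, \qquad w_i = \frac{1}{v_i}
```

Here $s_{p,i}$ is the pooled standard deviation of study $i$, $n_{T,i}$ and $n_{C,i}$ are the treatment and control sample sizes, and $v_i$ is the estimated variance of $g_i$, so that smaller, noisier studies contribute less to the mean effect.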
Evidence-based medicine
To some extent the 'present fascination' with evidence-based policy was a response to contemporary interest in evidence-based medicine (Young 2011: 20), and the influence of evidence-based medicine is clear within experimental criminology (Sherman 2009). Over a number of decades there has been a growing awareness in the medical profession that medical professionals, despite acting on the best of intentions, do not always make good decisions (Evans et al. 2011). The history of medicine contains many examples of interventions that were thought at the time to be effective but turned out not to be, or that were effective but were used inappropriately. Well-known examples include the drug Thalidomide (Evans et al. 2011), the over-prescription of antibiotics and the advice to sleep babies on their fronts (Chalmers 2003).
In medicine the solution to these problems has been two-fold: first, a 'fair test' of the treatment is needed; second, systematic reviews, normally including meta-analysis, are required. In evidence-based medicine the 'fair test' is usually a randomised controlled trial (RCT). The RCT is not just the 'gold standard' for evaluating the effect of a treatment, it is the default design, and systematic reviews in medicine are heavily skewed towards evidence generated through RCTs. The design of an RCT is relatively straightforward. The simplest randomised experiment involves random allocation of units (these may be people, classrooms, neighbourhoods, etc.) to two different conditions and a post-test assessment of the units. In the simplest experimental design the control group gets nothing; in a clinical trial this might involve giving one group of patients a treatment and the other group a 'placebo'. Often the outcome variable is measured before and after the intervention for both the intervention and control groups. Shadish et al. (2002: 260) go as far as to suggest that in most cases the absence of a pre-test measure is 'risky' because it limits the opportunity for statistical analysis of pre-existing differences (Rossi et al. 2004). A pre-test can also help determine how much gain an intervention has produced (Rossi et al. 2004).
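The simplest design described above, random allocation to two conditions followed by a post-test comparison, can be sketched in a few lines. The units, intervention and effect size below are invented for illustration:

```python
import random
import statistics

def run_simple_rct(units, treat, post_test, seed=0):
    """Randomly allocate units to treatment and control conditions,
    apply the treatment to one group only, and return the
    difference in post-test means (a difference-in-means estimate)."""
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)            # random allocation
    half = len(shuffled) // 2
    treatment_group = shuffled[:half]
    control_group = shuffled[half:]
    treated_outcomes = [post_test(treat(u)) for u in treatment_group]
    control_outcomes = [post_test(u) for u in control_group]
    return statistics.mean(treated_outcomes) - statistics.mean(control_outcomes)

# Illustrative use: units are baseline scores; the (hypothetical)
# intervention adds 5 points to each treated unit.
baseline = [10, 12, 9, 11, 13, 10, 12, 11]
effect = run_simple_rct(baseline, treat=lambda u: u + 5, post_test=lambda u: u)
```

With real data the two groups would differ by chance at baseline, which is why the pre-test measures discussed above matter: they let the analyst check, and adjust for, pre-existing differences between the randomly formed groups.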
In the pursuit of evidence-based medicine the systematic review has become a vital tool for policy-makers and practitioners and the creation of systematic reviews an important element of medical research, policy and practice, perhaps 'one of the most important innovations in medicine over the past 30 years' (Goldacre 2011: xi).
The scale and reach of this endeavour is perhaps clearest when one considers the Cochrane Collaboration, an international collaboration of more than 31,000 people across over 100 countries which has so far published over 6,000 systematic reviews (Cochrane Reviews) in an open-access, on-line library.1
(Quasi) experimental-based policy
The model of evidence-based policy promoted by government tends to reflect the approach implied in evidence-based medicine. In this model experiments are the gold standard of evaluation and to some extent, the term 'evidence-based policy' has become understood to mean experimental-based policy (Sampson 2010). Thus, the Cabinet Office has consistently promoted the greater use of experiments in social policy making (see for instance, Government Social Research Unit 2007b) and in recent guidance on 'Developing Policy with Randomised Controlled Trials (RCT)' argued that:
RCTs are the best way of determining whether a policy is working (Haynes et al. 2012: 4).
Arguments for the superiority of RCTs are often found in the social and behavioural sciences (Weisburd et al. 2001; Lum and Yang 2005). This is because, if implemented properly, they have the highest possible level of internal validity. Internal validity refers to 'the correctness of the key question about whether the intervention really did cause a change in the outcome' (Farrington 2003: 52, and described in detail in Shadish et al. 2002). As in medicine, the preference, however, is not to rely on a single RCT, but instead to conduct systematic reviews and, where possible, meta-analyses. As Sherman (2009) notes, a meta-analysis can sometimes be particularly illuminating in a social policy context: RCTs there are often based on small samples, and a meta-analysis can reveal the effect of an intervention that was not apparent from the results of individual, small studies. However, far fewer randomised field trials take place than in healthcare.
It is not always possible or desirable to implement an RCT and sometimes quasi-experiments provide a better solution. These have treatments, outcome measures and experimental units, but not random assignment, so comparisons are between non-equivalent groups (Cook and Campbell 1979). In essence, the further the design moves from the 'gold standard' of a social experiment, the weaker the internal validity assured by the design and the less certainty there is that any effects observed can be attributed to the intervention being studied.
There are a large number of quasi-experimental designs, too numerous, and, in some cases too complex to describe in detail here. Probably the most common quasi-experiment is the 'untreated control group design with dependent pre-test and post-test samples', often called the 'non-equivalent comparison group design' (Shadish et al. 2002). The design is based on an intervention and control group that are not created through random assignment, hence they are non-equivalent. Data is collected on the outcome measure (dependent variable) both before and after treatment for both groups. This design allows some of the threats to internal validity to be avoided. The Ministry of Justice currently runs a Justice Data Lab which uses a form of non-equivalent comparison group design. Specifically, the Lab favours a matched pairs design using Propensity Score Matching to match cases in the intervention and control groups.
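The matching step behind such a design can be sketched as follows. This is a simplified illustration, not the Justice Data Lab's actual implementation: the propensity scores are supplied directly (in practice they would be estimated from covariates, e.g. by logistic regression), and each treated case is greedily paired with the unmatched control whose score is closest:

```python
def match_on_propensity(treated, controls):
    """Greedy 1:1 nearest-neighbour matching on propensity scores.

    `treated` and `controls` are lists of (case_id, propensity_score)
    pairs. Each treated case is matched, without replacement, to the
    remaining control whose score is closest. Returns a list of
    (treated_id, control_id) pairs."""
    available = list(controls)
    pairs = []
    for case_id, score in treated:
        if not available:
            break  # more treated cases than controls
        best = min(available, key=lambda c: abs(c[1] - score))
        available.remove(best)  # matching without replacement
        pairs.append((case_id, best[0]))
    return pairs

# Hypothetical case identifiers and scores, for illustration only.
treated = [("t1", 0.80), ("t2", 0.35)]
controls = [("c1", 0.30), ("c2", 0.78), ("c3", 0.55)]
pairs = match_on_propensity(treated, controls)
# pairs -> [("t1", "c2"), ("t2", "c1")]
```

The outcome comparison is then made between the matched pairs, so that treated and control cases are similar on the observed characteristics that feed into the propensity score, which is the sense in which the design mitigates, without eliminating, the non-equivalence of the two groups.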
Evidence-based policy in criminal justice
Within the criminal justice system the concept of evidence-based policy is now well-embedded. When New Labour came to power in 1997 it made a clear commitment to evidence-based policy making across government, summed up by Tony Blair when he declared 'what matters is what works'.2 This was particularly apparent in crime and criminal justice through such initiatives as the £400 million Crime Reduction Programme, set up by the Home Office to fund crime reduction initiatives and in which evaluation and then the dissemination and use of research-based knowledge were intended to be central (Maguire 2004).
Since 1997, and mirroring developments in the US, systematic reviews have become increasingly important in crime and criminal justice policy-making. A key milestone was the publication, in the US, of ...