Introduction
Evaluation is a broad concept and one that is sometimes difficult to distinguish both from other types of research and from related practices such as monitoring, performance management and audit. We start this chapter by looking at the various definitions of evaluation, distinguishing those that concentrate on the purpose of evaluation, the methods used in evaluation, and the significance of judgements of value in evaluation. We consider each of these briefly before adopting a definition of evaluation based on judgements of value. This is a definition that emphasises the political dimension of evaluation, a key distinguishing feature that will help us differentiate it from other types of research activity.
Evaluation covers a range of activity types. People commonly talk and write about formative and summative evaluation, process (implementation) and impact (outcome) evaluation, economic evaluation, and theory-led evaluation. All of these terms raise questions about the nature and role of evaluation which we will address at various points in the book. In this chapter we will look at types and models of evaluation before discussing some key trends that have shaped the world of evaluation over recent decades.
The final section of this chapter looks briefly at the history of evaluation. It aims to introduce readers new to evaluation to some of the ‘big’ ideas that have helped shape the sector and are still debated today. In particular the flux between ‘scientific’ and ‘naturalistic’ approaches to evaluation is discussed. Many of the ideas raised in this chapter will be developed in detail in later chapters.
Defining evaluation
Mark et al. (2006) distinguish everyday informal evaluation (How good was breakfast at the restaurant? How did the meeting with the client go?) from systematic evaluation, which they define as:
a social and politicized practice that nonetheless aspires to some position of impartiality or fairness, so that evaluation can contribute meaningfully to the well-being of people in that specific context and beyond. (Mark et al. 2006: 5–6)
They identify three groups of evaluation definitions which concentrate on the purpose of evaluation, the methods used, and the importance of judgements of value in evaluation. We will consider each of these in turn.
Defining evaluation according to purpose
Mark et al. (2006) identify a group of definitions that concentrate on the purpose of evaluation, typically providing information for policymaking or programme improvement. Their example is Patton’s definition:
Program evaluation is the systematic collection of information about the activities, characteristics, and outcomes of programs to make judgments about the program, improve program effectiveness, and/or inform decisions about future programming. (Patton 1997: 23, emphasis added)
A note on terminology: policy, programme or project?
At this point it is useful to note that in the literature on evaluation we often come across references to projects, programmes and policies. Building on Eggers’ work, Palfrey et al. (2012) argue that it is necessary to distinguish between these three evaluation subjects because they might offer the evaluator different opportunities to contribute to decision making. They suggest that:
- a project is a planned activity aimed at achieving specified goals within a prescribed period
- a programme is a set of separate planned activities unified into a coherent group
- a policy is a statement of how an organisation or government would respond to particular eventualities or situations according to its agreed values or principles
In this book, while we accept that projects, programmes and policies present different evaluation opportunities and challenges, for the sake of brevity our default position will be to refer to ‘programme’ evaluation unless there is a need to distinguish one from another.
Defining evaluation according to method
Another group of evaluation definitions identified by Mark et al. (2006) outline evaluation in terms of methods. An example comes from Rossi et al.:
Program evaluation is the use of social research methods to systematically investigate the effectiveness of social intervention programs in ways that are adapted to their political and organizational environments and are designed to inform social action to improve social conditions. (Rossi et al. 2004: 16, emphasis added)
Defining evaluation in terms of methods can help to distinguish it from similar practices such as monitoring, performance management, auditing and accreditation. However, this in turn raises questions about what distinguishes evaluation from research (Palfrey et al. 2012).
Evaluation distinguished from monitoring, performance management, audit and accreditation
Monitoring, performance indicators (PIs) and broader performance management processes have proliferated across the public, private and, increasingly, the not-for-profit sectors in the last few decades. We can link this proliferation to the development of ‘New Public Management’. Often associated with reforms to the public sector that were introduced during the administrations of Prime Minister Margaret Thatcher in the UK and President Ronald Reagan in the US, New Public Management (NPM) involved structural changes to the public sector and the introduction of business methods into government (Hill and Hupe 2014). It also involved practices such as shrinking the size of the state so that government reduced its service delivery capacity, contracting out government services to the private and not-for-profit sectors, a greater emphasis on the choice available to service users, and the creation of new public–private vehicles to deliver services (Hill and Hupe 2014). The adoption of business practices, a greater focus on managing by outputs and the increased ‘marketisation’ associated with NPM contributed to the proliferation of performance management measures.
Palfrey et al. (2012: 19, citing Carter et al. 1992) suggest that PIs are very useful as ‘tin openers’ because they help us clarify questions about performance. In this sense, they are a valuable starting point for evaluation, but are not a substitute for evaluation that incorporates the use of research methods (Palfrey et al. 2012).
Audit has also proliferated in the UK and US. Power (1997) charts a move from traditional audits that focus on financial probity to audits that ask broader questions about organisational performance and ‘Value for Money’ (VFM) (Palfrey et al. 2012). However, deciding on what matters in VFM involves value judgements (ibid.) and, by imposing values, audits can have unintended and dysfunctional consequences for the audited organisation. Evaluation does not avoid such value judgements, but social scientists recognise their importance and have developed a number of strategies to avoid or incorporate them, depending upon the social science tradition they come from.
Accreditation has been used widely in the UK public sector as a strategy for setting standards for the performance of organisations and often starts with self-evaluation (Palfrey et al. 2012). Well-known examples in the UK include the use of ‘Trust’ status in sectors such as health and ‘Investors in People’, a government agency that accredits organisations that demonstrate good practice in workforce management (ibid.).
If the ‘research’ component is what distinguishes evaluation from practices such as PIs, audits and accreditation, what is it then that differentiates ‘evaluation’ from ‘research’?
Distinguishing evaluation from research
The distinction between evaluation and research is discussed by Lincoln and Guba (1986), who note that both are forms of ‘disciplined inquiry’ and use many of the same tools or methods. However, having such shared methods does not make them one and the same thing. Lincoln and Guba argue that ‘to assert identity or similarity on the basis of common methods would be analogous to saying that carpenters, electricians, and plumbers do the same thing because their tool kits all contain hammers, saws, wrenches, and screwdrivers’ (1986: 547). The key distinction between evaluation and other types of research is the importance of judgements of value in evaluation.
Defining evaluation according to judgements of value
This brings us to the third group of definitions identified by Mark et al. (2006), which concentrate on the function evaluation serves and assume that evaluation involves judgements of value. As an example of a definition of evaluation based on judgements of value, Mark et al. (2006) cite Scriven’s definition:
Evaluation refers to the process of determining the merit, worth, or value of something, or the product of that process ... The evaluation process normally involves some identification of relevant standards of merit, worth, or value; some investigation of the performance of the evaluands on these standards; and some integration or synthesis of the results to achieve an overall evaluation or set of associated evaluations. (Scriven 1991: 139; emphasis added)
Lincoln and Guba (1986), when considering what makes evaluation different from research, argue that the latter is undertaken to resolve a problem while evaluation is undertaken to establish value. They define research as ‘a type of disciplined inquiry undertaken to resolve some problem in order to achieve understanding or to facilitate action’ (1986: 549), whereas evaluation is defined as:
a type of disciplined inquiry undertaken to determine the value (merit and/or worth) of some entity – the evaluand – such as a treatment, program, facility, performance, and the like – in order to improve or refine the evaluand (formative evaluation) or to assess its impact (summative evaluation). (1986: 550)
This difference, which Lincoln and Guba describe as ‘monumental’, also leads to what they see as a key distinction in the products that result. Whereas research is typically adequately served by a technical report, this by itself is rarely sufficient for an evaluation if it has to meet the needs of, and communicate with, its various audiences (Lincoln and Guba 1986).1
Our preferred definition of evaluation
The many definitions of evaluation suggest that it is not easy to pin down the concept. Indeed, some observers have argued that this is a pointless task. For example, some evaluators reject objective ‘scientific’ approaches to evaluation, arguing instead that because the human world is socially constructed evaluation is itself a social construct. There are multiple social constructs and therefore from this relativist point of view there is no right way to define evaluation. Thus, by the end of the decade, Guba and Lincoln were arguing: ‘There is no answer to the question, “But what is evaluation really?” and there is no point in asking it’ (1989: 21).
We recognise the importance of purpose and methods in defining evaluation, but also take the view that what is crucial for defining evaluation is the emphasis on a process of determining the merit, worth or value of something, along the lines suggested by Scriven.
Distinguishing evaluation from research as a practice designed to establish the value of an entity has notable implications that will resurface throughout this book. If we accept that the purpose of evaluation is to determine the value of the entity being evaluated, and that the products of an evaluation are designed to improve the thing being evaluated or to assess its impact, this has important repercussions for evaluation and for evaluators. If we return to the very first definition of evaluation that we considered, i.e. Mark et al.’s (2006) view of evaluation as a ‘politicized practice that nonetheless aspires to some position of impartiality or fairness’, we can start to see the potential tensions in a practice that is at once politicised but also aspires to impartiality or fairness.
Some would ...