Classroom Assessment and Educational Measurement

  1. 286 pages
  2. English

About this book

Classroom Assessment and Educational Measurement explores the ways in which the theory and practice of both educational measurement and the assessment of student learning in classroom settings mutually inform one another. Chapters by assessment and measurement experts consider the nature of classroom assessment information, from student achievement to affective and socio-emotional attributes; how teachers interpret and work with assessment results; and emerging issues in assessment such as digital technologies and diversity/inclusion.

This book uniquely considers the limitations of applying large-scale educational measurement theory to classroom assessment and the adaptations necessary to make this transfer useful. Researchers, graduate students, industry professionals, and policymakers will come away with a clear understanding of how the classroom assessment context is essential to broadening contemporary educational measurement perspectives.

The Open Access version of this book, available at http://www.taylorfrancis.com, has been made available under a Creative Commons Attribution-Non Commercial-No Derivatives 4.0 license.

Information

Publisher: Routledge
Year: 2019
Print ISBN: 9781138580046
eBook ISBN: 9780429017605
Edition: 1

Part I
Classroom Assessment Information

1

Perspectives on the Validity of Classroom Assessments

Michael T. Kane and Saskia Wools

DOI: 10.4324/9780429507533-2

This chapter examines how some general principles of validity theory might apply to classroom assessment. In particular, we consider two perspectives on the evaluation of classroom assessments, a functional perspective and a measurement perspective, and we consider how these two perspectives play out in classroom assessments. We suggest that the functional perspective does and should play a larger role in classroom assessment than the measurement perspective.
For all assessments, validity is an important concern (American Educational Research Association [AERA], American Psychological Association [APA], & National Council on Measurement in Education [NCME], 2014). The concept of validity has been developed mainly in the context of summative high-stakes testing, but we will discuss validity for classroom assessment and emphasize the evidence needed for the validation of assessments in this context.
We define validity in terms of the plausibility and appropriateness of the interpretations and uses of assessment results, and therefore validity depends on the requirements inherent in these interpretations and uses. A systematic and effective approach to validation involves three activities: the development of a clear sense of the proposed interpretation and uses of the assessment results; the development (or identification) of an assessment that would be expected to support the intended interpretation and uses; and an evaluation of how well the assessment supports the interpretation and uses.
Cronbach (1988) described two perspectives on the validity of assessments, a measurement perspective and a functional perspective, and we make use of both of these perspectives in evaluating the validity of classroom assessments. The measurement perspective focuses on the accuracy and precision of scores as measures of some construct, and the functional perspective focuses on how well the assessment serves its intended purposes. The measurement perspective and the functional perspective are both relevant to the validation of all assessments, but they focus on different evaluative criteria. We will argue that for classroom assessment, the functional perspective is of central concern, and the measurement perspective plays a supporting role.
We define classroom assessment broadly as involving the collection of information from a variety of sources, with the intention of promoting effective teaching and learning. Classroom assessments take a variety of forms, such as teacher observations of the students in various contexts, interactions with students, quizzes, tests, assignments, and projects. As a result, classroom assessments differ widely in their levels of standardization and formality, but this variety also provides very rich sources of information on student performance, skills, and achievement. Classroom assessments also serve a variety of purposes (e.g., monitoring student progress, diagnosing gaps and problems in learning, motivating students, and informing parents and others about student performance and progress). The main users of these assessments are teachers and students.
The validity of classroom assessments will depend mainly on how well they support the intended uses of the assessment results by teachers and students. Although all potential uses of classroom assessments might be informative to discuss, in this chapter we will focus on the use of the results by teachers for providing feedback to students, evaluating student competencies on particular tasks and over content domains, and diagnosing students’ strengths and weaknesses.
When validity is studied in the context of large-scale high-stakes tests, the technical, or psychometric, characteristics of the tests play a central role. In these high-stakes contexts, those characteristics include, for example, standardization, consistency, and fairness (Cronbach, 1988). Since the results from these standardized tests are used for high-stakes decisions that extend well beyond the context in which the assessment took place, standardization and empirical evidence for consistency over contexts serve an important function in supporting trust in the processes being employed and in the trustworthiness of the results (Porter, 2003).
In a classroom, assessment-based decisions generally involve less far-reaching inferences. Rather, the results are interpreted and used locally. The results need to be practical and useful in fulfilling the main goal of classroom assessment: promoting effective teaching and learning. These decisions are generally less high-stakes than those based on standardized test results, but this does not imply that technical characteristics become irrelevant. An inaccurate conclusion about a student’s ability might not be catastrophic, but it is not likely to be helpful in planning future instruction, and therefore in supporting learning. For classroom assessments, a functional perspective that focuses on how well the assessment promotes learning by improving the quality of instruction is the central concern, and measurement characteristics are of concern mainly in terms of their impact on the effectiveness of the assessment in supporting teaching and learning.
The bottom line in validating classroom assessments (as in all assessments) is to identify the qualities that the assessment results need to have, given their particular interpretations and uses in the context at hand, and then to examine whether the assessment results meet these requirements.
The next section outlines an argument-based approach to validation, and the following section describes the functional and measurement perspectives on validation. The two perspectives are complementary in that each focuses on characteristics that are necessary for an effective assessment, but the relative importance of the two perspectives in evaluating an assessment will vary depending on the goals and contexts of the assessment. In the third section, we describe some uses of classroom assessments and examine how these assessments might be evaluated in terms of interpretations and uses and the two perspectives. We conclude that the functional perspective should be primary in classroom assessment, with the measurement perspective playing a supporting role in this context.

Argument-Based Approach to Validation

As indicated earlier, the validity of assessment interpretations and uses depends on the plausibility of the interpretation and the appropriateness of the uses. A natural approach to validation is to specify the interpretation and use, develop (or identify) an assessment program that would be expected to meet the specified requirements, and then evaluate how well the interpretations and uses are justified. Validation is most often associated with the last of these three steps, but in fact it depends critically on all three steps.
The argument-based approach to validation (Cronbach, 1988; Crooks, Kane, & Cohen, 1996; House, 1980; Kane, 2006, 2013; Shepard, 1993) provides a general framework for specifying and validating interpretations and uses of assessment results. If we are going to make claims and base decisions on assessment results, these claims and decisions should be well founded (AERA et al., 2014; Messick, 1989).
A relatively simple and effective way to specify the proposed interpretation and uses of the assessment results is to develop an interpretation/use argument (IUA) that lays out the reasoning leading from observed assessment performances to the claims being made. The general idea is to identify the inferences and assumptions inherent in the interpretations and uses of the assessment results.
The argument-based approach is contingent in the sense that the structure of the validity argument and the conclusions reached about validity depend on the structure and content of the IUA. For modest interpretations that do not go much beyond the observed performances, the IUA will be modest, including few inferences and assumptions; for ambitious interpretations (involving broad generalizations, constructs, or predictions), the IUA will require strong inferences and supporting assumptions. If the IUA is found wanting, because it lacks coherence and completeness or because the evidence does not support some of its inferences and assumptions, the interpretation and use would not be accepted as valid. If the IUA is coherent and complete, and its inferences and assumptions are adequately supported, the proposed interpretation and uses can be considered valid. The inferences based on classroom assessments tend to be local and limited, and therefore do not require strong assumptions.

Interpretation/Use Arguments (IUAs)

The IUA is intended to provide an explicit statement of the sequence or network of inferences and supporting assumptions that gets us from the observed performances to the claims based on these performances. The inferences are supported by warrants, which are general rules for making claims of a certain kind based on certain kinds of data. Warrants are based on assumptions and generally require backing, or support. For example, in drawing conclusions about a student’s level of competence in a domain on the basis of a sample of performances, we rely on a warrant that says that such generalizations are reasonable, and this warrant can be backed by evidence indicating that the sample is large enough and representative enough to support the generalization. The IUA would consist of a sequence or network of such inferences leading from the assessment results to the conclusions and decisions based on these performances.
The IUA provides a general framework for drawing inferences based on assessment results, and thereby for interpreting and using the assessment results for individual students. Although they may not be explicitly mentioned in discussing the results, the warrants for various inferences are integral parts of the IUA. Assuming that the warrants employed in the IUA are supported by appropriate evidence, the IUA provides justification for claims and decisions based on assessment results.
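The structure described in this section (inferences, each resting on a warrant that in turn needs backing) can be sketched as a small data structure. The following Python sketch is illustrative only and is not taken from the chapter: the class names, the example quiz, and the rule that any recorded backing counts as adequate support are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Inference:
    """One step in an IUA: a claim licensed by a warrant that needs backing."""
    claim: str
    warrant: str
    backing: list[str] = field(default_factory=list)  # evidence for the warrant

    def is_supported(self) -> bool:
        # Simplifying assumption: any recorded backing counts as adequate.
        return len(self.backing) > 0


@dataclass
class InterpretationUseArgument:
    """A sequence of inferences from observed performances to claims."""
    observed_performance: str
    inferences: list[Inference] = field(default_factory=list)

    def unsupported(self) -> list[Inference]:
        # Inferences whose warrants have no backing deserve the most scrutiny.
        return [inf for inf in self.inferences if not inf.is_supported()]

    def is_plausible(self) -> bool:
        # The argument holds only if every inference in the chain is supported.
        return len(self.inferences) > 0 and not self.unsupported()


# Hypothetical example: a short quiz used for two successive inferences.
iua = InterpretationUseArgument(
    observed_performance="Scores on a ten-item quiz of two-digit addition",
    inferences=[
        Inference(
            claim="The student can solve two-digit addition problems",
            warrant="A sample of performances generalizes to the domain",
            backing=["Sample is large and representative of the domain"],
        ),
        Inference(
            claim="The student is ready to move on to the next unit",
            warrant="Competence in this domain is sufficient preparation",
        ),
    ],
)

for inference in iua.unsupported():
    print("Needs backing:", inference.claim)
```

In practice, judging whether backing is adequate is a matter of evidence and professional judgment rather than a yes/no flag; the sketch only makes the chain of inferences and its weakest links explicit.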

Validity Arguments

The validity argument provides an overall appraisal of the IUA, and thereby of the proposed interpretation and uses of the assessment results. It depends on the scope and content of the IUA, which specifies the inferences and assumptions that need to be evaluated. A simple interpretation in terms of skill in performing a particular kind of task (e.g., solving two-digit addition problems presented horizontally, such as “23 + 46 = . . .”) would focus on the adequacy of sampling of this type of task as a basis for deciding whether students can solve this kind of problem. Assessments of more broadly defined domains of skill would typically require more evidence and more kinds of evidence.
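One rough way to think about the adequacy of sampling for a single task type is to ask how precisely an observed proportion correct estimates the student's underlying proportion. The sketch below is an illustration that is not part of the chapter: it assumes a simple binomial model (independent tasks of similar difficulty), and the function name and the example values are hypothetical.

```python
import math


def proportion_standard_error(p_correct: float, n_tasks: int) -> float:
    """Standard error of an observed proportion correct under a binomial model.

    Assumes the tasks are independent and of similar difficulty, which is a
    simplification for classroom settings.
    """
    if n_tasks <= 0:
        raise ValueError("n_tasks must be positive")
    if not 0.0 <= p_correct <= 1.0:
        raise ValueError("p_correct must be between 0 and 1")
    return math.sqrt(p_correct * (1.0 - p_correct) / n_tasks)


# Hypothetical student who answers about 80% of such problems correctly:
# the standard error shrinks as more tasks of this type are sampled.
for n_tasks in (5, 10, 20, 40):
    se = proportion_standard_error(0.8, n_tasks)
    print(f"{n_tasks:>2} tasks: standard error of about {se:.3f}")
```

Under these assumptions, quadrupling the number of tasks halves the standard error, which is one way of expressing why a very short quiz may support only a tentative conclusion.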
The validity argument starts with a critical review of the IUA, with particular attention given to identifying the most questionable inferences and assumptions. Many assumptions may be accepted without much discussion. Some assumptions may be evaluated in terms of the appropriateness of the procedures used (e.g., the relevance of observed performances to the skill of interest, the size of the sample of observations). Some assumptions (e.g., that the students were motivated to perform well) may be based on experience and/or observations made during the assessment.
In order to make a strong case for an interpretation or use of assessment results, the validity argument has to provide backing for the IUA as a whole, and particularly for its most questionable inferences and assumptions. Serious doubts about any inference or assumption can raise questions about the IUA as a whole. Therefore, the IUA needs to be understood in enough detail so that the inferences and assumptions on which it depends can be identified and evaluated. A validity argument is never definitive because we cannot exhaustively evaluate all of the IUA, and therefore the most doubtful parts of the argument should get the most attention. As Cronbach (1980) suggested, “The job of validation is not to support an interpretation, but to find out what might be wrong with it. A proposition deserves some degree of trust only when it has survived serious attempts to falsify it” (p. 103). The question is whether the interpretation and use of the assessment results makes sense, given all of the evidence.
Note that it is not necessary to be concerned about assumptions that are not included in the IUA. For example, if the proposed interpretation and use assumes that the attribute being assessed would not vary much over extended periods of time, we would be concerned about the extent to which the performances are stable over time. But if the characteristics being assessed are expected to vary (e.g., due to learning), stability would not be required, and it might even constitute evidence against the validity (the instructional sensitivity) of the assessment.
The basic idea guiding the argument-based approach is that we should be clear about the reasoning that is to take us from observed student performances to conclusions about the student, and that we should critically evaluate this reasoning and its embedded assumptions.

Perspectives on Assessment

Assessments can be evaluated from multiple perspectives, and it is generally helpful to consider the evaluative criteria associated with different perspectives (Cronbach, 1988; Dorans, 2012; Holland, 1994). Different perspectives focus on different aspects of interpretation and use, and therefore on different criteria for evaluating validity. The perspectives are not mutually exclusive, and any that are relevant in a particular case deserve attention.
Addressing concerns about the assessments’ interpretation and use from multiple perspectives may seem like a major burden, but it is not particularly burdensome if the evaluation is approached reasonably; in fact, it may facilitate the process of validation. It has long been recognized that validation requires that the assessment results be evaluated by identifying potential challenges (e.g., sources of bias, construct-irrelevant variance, construct underrepresentation) and evaluating their impact (Cronbach, 1988), and the different perspectives can be a fruitful source of legitimate challenges to proposed interpretations and uses.
We will consider two perspectives on classroom assessment, the functional perspective and the measurement perspective. As noted earlier, the functional perspective focuses on how well the assessments support the attainment of various goals in some contexts, while a measurement perspective focuses on the assessment as a measurement instrument (i.e., in terms of precision and accuracy of the results). Assessment uses need to achieve the purpose for which they are intended, and they need to be defensible as measurements. Both perspectives can be accommodated in an argument-based approach to validation that supports the claims inherent in the intended interpretations and uses of assessment results, and that addresses challenges to these interpretations or uses.

The Functional and Measurement Perspectives

The functional perspective (Cronbach, 1988) views assessments primarily as tools that can be helpful in realizing desired outcomes, and therefore it focuses on how well the intended outcomes are achieved and on the extent to which undesirable outcomes are avoided. From a functional perspective, an assessment is evaluated mainly in terms of its consequences, intended and unintended.
Cronbach (1988) begins his discussion of the functional perspective by contrasting it with more descriptive concerns about the accuracy of interpretations:
The literature on validation has concentrated on the truthfulness of test interpretations, but the functionalist is more concerned with worth than truth. In the very earliest discussions of test validity, some writers said that a test is valid if it measures “what it purports to measure.” That raised in a primitive form, a question about truth. Other early writers, saying that a test is valid if it serves the purpose for which it is used, raised a question about worth. Truthfulness is an element in worth, but the two are not tightly linked.
(p. 5)
The functional perspective is concerned with the functional worth, or utility, of the assessment in achieving the goals that it is intended to help achieve. An assessment is implemented to achieve some purpose, and it is evaluated in terms of its functional worth in achieving this purpose.
The measurement perspective views assessments primarily as measurement instruments, and as a result it focuses on certain technical criteria, particularly the generalizability (or reliability) of scores and their accuracy as estimates of the attribute of interest. It emphasizes standardization and objectivity (Porter, 2003) and generally relies on statistical m...

Table of contents

  1. Cover Page
  2. Half Title
  3. Series Page
  4. Title Page
  5. Copyright Page
  6. Table of Contents
  7. Contributors
  8. Introduction
  9. Part I: Classroom Assessment Information
  10. Part II: The Use of Classroom Assessment Information to Enhance Learning
  11. Part III: Emerging Issues in Classroom Assessment
  12. Index

Classroom Assessment and Educational Measurement, by Susan M. Brookhart and James H. McMillan, is available in PDF and/or ePUB format and is listed under Education & Education General.