Chapter One Using Data in Deliberate and Thoughtful Ways
Thomas R. Guskey University of Kentucky
The Standards for Professional Learning are designed to guide educators in making thoughtful decisions about professional learning experiences that will increase “educator effectiveness and results for all students” (Learning Forward, 2011). Accomplishing this primary goal requires that those decisions be made based on relevant data. This requirement, in turn, makes the Data Standard an essential foundation for all of the other standards.
The Data Standard states: “Professional learning that increases educator effectiveness and results for all students uses a variety of sources and types of student, educator, and system data to plan, assess, and evaluate professional learning” (Learning Forward, 2011). Because of its indispensable and fundamental nature, no other standard is more important or more vital to the purpose of the Standards for Professional Learning.
In this chapter we will explore the meaning of “data” in the context of professional learning. We will consider the various types of data, the different purposes data can serve, and the many levels at which relevant data can be gathered and analyzed. Finally, we turn to the use of data in evaluating professional learning endeavors and how to ensure those evaluations are meaningful and effective.
What Are Data?
Data are defined as “factual information (as measurements or statistics) used as a basis for reasoning, discussion, or calculation” (see http://www.merriam-webster.com/dictionary/data). In other words, data are what we know. Some argue that data are not always factual and can be erroneous. Indeed, some of what we know or believe we know may be inaccurate. But in most cases those inaccuracies stem from the way the data were gathered (e.g., via biased or inadequate sampling) or the manner in which they were interpreted (e.g., through naïve, distorted, or prejudicial perspectives). Apart from these distortions, data represent facts.
By themselves, data are neither good nor bad; neither positive nor negative. Moreover, data have no meaning or intrinsic value when considered in isolation. They can be relevant or irrelevant, pertinent or immaterial. Data become meaningful and valuable only when processed, usually for the purpose of answering specific questions.
The usefulness of any particular set of data depends, therefore, on the context in which it is gathered, processed, and applied. For this reason, the key to successful data-based decision making rests not in the data, per se. Although data are essential to making apt decisions, the quality and appropriateness of particular data depend on their accuracy and relevance in answering specific questions in a particular context.
Because of the context-specific nature of data, discussion of data’s appropriateness always must be preceded by the formulation of specific, essential questions. These questions provide the basis for all forms of inquiry, be it exploration, description, research, or evaluation. They guide selection of the most appropriate and meaningful data needed to answer the questions. They also determine the level of data needed, the type of analysis required, and the best means of reporting analysis results.
The most important first step in data-based decision making, therefore, is not a discussion of the data and how they will be analyzed or “mined.” Rather, it is a discussion of the most important and essential questions that need to be addressed. Only when these questions are clearly articulated and agreed upon can meaningful discussions take place about what evidence, information, or data will be the most appropriate for answering those questions. After deciding what data are most appropriate, other issues regarding data collection and analysis become easier to address. Decisions about how to interpret the analysis, formulate conclusions, and derive implications for policy and practice become easier as well.
Multiple Sources of Data
There are an infinite number of sources of data. In most modern education reforms, however, and particularly those guided by the requirements of the No Child Left Behind (NCLB) legislation (U.S. Congress, 2001), the primary data of interest are evidence on student learning derived from the results of large-scale assessments. Policy makers and legislators at the national, state, and provincial levels are attracted to large-scale assessment results as measures of reform success because the assessments can be relatively inexpensive, relatively quick to implement, and externally mandated, and because the results are highly visible (Linn, 2000). These same policy makers and legislators also are convinced that good data on student performance drawn from large-scale assessments will help focus educators’ attention and guarantee success, especially if consequences are attached to assessment results (see Elmore, 2004).
The large-scale assessment programs in most states and provinces are designed to measure students’ “proficiency” on carefully articulated standards for student learning. The specific criteria that states and provinces use to define proficiency vary greatly in stringency and rigor, however, especially when compared to other assessment results, such as those from the National Assessment of Educational Progress (NAEP) in the United States (Linn, 2005; Peterson & Hess, 2005). These differences become more significant when the results from large-scale assessments are used to evaluate schools, students, or educators’ professional learning for the purposes of accountability. Such initiatives affect numerous stakeholder groups, including school administrators, teachers, students, parents, school board members, future employers, and the community (Linn, 2003). But because the intent of most states’ assessment and accountability programs is to monitor and improve the educational system, the stakes are highest for school administrators and teachers (Lane & Stone, 2002).
While the psychometric quality and validity of large-scale assessments for accountability purposes are widely debated (see Hill & DePascale, 2003; M. Kane, 2002), one point on which both advocates and critics agree is that they represent but one, potentially limited, indicator of student learning that might be considered in making decisions about schools, students, or professional learning. Significant evidence shows, for instance, that schools’ results on large-scale assessments can be highly volatile (T. Kane & Staiger, 2002), and that relatively minor changes in design features for reporting results can lead to strikingly different categorizations of schools (Porter, Linn, & Trimble, 2005).
Concerns about these reliability and validity issues have led to calls from professional educational organizations for protection against high-stakes decisions based on results from single tests or assessments (American Educational Research Association, 2000). The Standards for Educational and Psychological Testing (American Educational Research Association, American Psychological Association, & National Council on Measurement in Education, 1999) specifically state, “In educational settings, a decision or characterization that will have major impact on a student should not be made on the basis of a single test score. Other relevant information should be taken into account if it will enhance the overall validity of the decision” (pp. 147–148). Similarly, the Standards for Professional Learning (Learning Forward, 2011) emphasize, “Data from multiple sources enrich decisions about professional learning that leads to increased results for every student. . . . The use of multiple sources of data offers a balanced and more comprehensive analysis of student, educator, and system performance than any single type or source of data can” (pp. 4–5).
Yet despite these appeals, the exclusive use of large-scale assessment results for making decisions about schools, students, and even professional learning remains widespread (Barton, 2002; Hess, 2005; Kifer, 2001). In some contexts, however, the use of multiple measures appears to be gaining ground. Title I of the Elementary and Secondary Education Act (ESEA), for example, stipulates that multiple measures should be used to evaluate schools with respect to their academic standards. The rationale...