Psychology

Standardization and Norms

Standardization in psychology refers to the process of establishing uniform procedures for administering and scoring tests. Norms, on the other hand, are the established standards of performance based on the results of a standardized test. These norms provide a frame of reference for comparing an individual's performance to that of a larger group.

Written by Perlego with AI-assistance

11 Key excerpts on "Standardization and Norms"

  • Book cover image for: A Handbook of Test Construction
    eBook - ePub

    A Handbook of Test Construction

    Introduction to Psychometric Design

    • Paul Kline(Author)
    • 2015(Publication Date)
    • Routledge
      (Publisher)
    8 Standardizing the test
    In chapter 1 it was made clear that one of the advantages possessed by psychological tests in comparison with other forms of measurement is that tests are standardized. Hence it is possible to compare a subject’s score with that of the general population or other relevant groups, thus enabling the tester to make meaningful interpretations of the score.
    From this it follows that the standardization of tests is most important where scores of subjects are compared explicitly or implicitly – as in vocational guidance or educational selection. Norms may also be useful for mass-screening purposes. For the use of psychological tests in the scientific study of human attributes – the psychometrics of individual differences – norms are not as useful. For this the direct, raw test-scores are satisfactory. Thus norms meet the demand, in general, of the practical test user in applied psychology. Since norms are usually necessary for tests of ability, our discussion of how a test should be standardized will relate in the main to such tests.
    Sampling
    This is the crucial aspect of standardization: all depends upon the sample. In sampling there are two important variables: size and representativeness. The sample must accurately reflect the target population at which the test is aimed (of course, there may be several populations and consequently several samples), and it must be sufficiently large to reduce the standard errors of the normative data to negligible proportions.
    Size
    For the simple reduction of statistical error a sample size of 500 is certainly adequate. However, the representativeness of a sample is not independent of size. A general population norm, for example, of school-children would require in the region of 10,000 subjects. A sample from a limited population such as lion-tamers or fire-eaters would not have to be so large (indeed, the population would hardly be that large). Thus no statement about sample size can be made without relating it to the population from which it is derived. This discussion clarifies the point that more important than size is the representativeness of the sample. A small but representative normative sample is far superior to a large but biased sample. Some examples taken from actual tests will make this point obvious and will also indicate the best methods for test constructors of obtaining standardization samples.
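The point about size and statistical error can be illustrated with the standard error of a sample mean, which is the score standard deviation divided by the square root of the sample size. The sketch below is only an illustration: the standard deviation of 15 is an assumed value, not a figure from the excerpt, and the function name is hypothetical. It shows the error shrinking as the sample grows, but it says nothing about representativeness, which the excerpt treats as the more important property.

```python
import math


def standard_error_of_mean(sd: float, n: int) -> float:
    """Return the standard error of a sample mean: sd / sqrt(n)."""
    if n <= 0:
        raise ValueError("n must be a positive sample size")
    return sd / math.sqrt(n)


# Assumed score standard deviation (hypothetical, not from the excerpt).
ASSUMED_SD = 15.0

# Sample sizes: a small sample, plus the two sizes mentioned in the excerpt.
for n in (50, 500, 10_000):
    se = standard_error_of_mean(ASSUMED_SD, n)
    print(f"n={n:>6}: standard error of the mean = {se:.2f}")
```

Under these assumptions the error at n = 500 is already well under one score point. That fits the excerpt's view that 500 is adequate for reducing statistical error alone, while a general population norm may need many more subjects for reasons of representativeness.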
  • Book cover image for: Psychometric Methods
    eBook - PDF

    Psychometric Methods

    Theory into Practice

    In psychological (and educational) measurement and testing, we face substantially more challenges than those encountered in standard physical measurements. In fact, psychological measurement and testing is a complex endeavor requiring detailed information and guidance on the proper use of test scores. Psychological testing is complex because of (1) the multidimensional nature of constructs, attributes, and behaviors being measured and (2) the multitude of tests available to measure a particular construct, attribute, or behavior. Finally, we acknowledge that psychological measurement is more imprecise relative to measurement in the physical sciences. Given the challenges mentioned, we turn next to the topic of norms, the process of norming, and norm-referenced testing. In this chapter, norms are defined along with a rationale for their use. The process of norming is described with examples using the GfGc data. A description of norm-referenced testing is provided along with its proper use. Test equating is introduced, and examples are provided specific to how equating of test scores works using the GfGc data.
    11.2 NORMS, NORMING, AND NORM-REFERENCED TESTING
    Most standardized tests of achievement and ability in psychology and education use norms such as percentiles, age, or grade equivalents and standard scores. A standard score is a raw score converted from one scale to another where the latter employs an arbitrary mean and standard deviation. Standard scores are more easily interpreted than raw scores, and the position of an examinee’s performance relative to other examinees is clearly indexed. The term norm is used in the scholarly literature to refer to a behavior that is usual, average, normal, standard, expected, or typical (Cohen & Swerdlik, 2010, p. 111).
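The definition of a standard score above (a raw score re-expressed on a scale with an arbitrary mean and standard deviation) can be sketched as a two-step conversion: first express the raw score as a distance from the norm-group mean in standard deviation units, then rescale. The norm sample, the target scale of mean 100 and standard deviation 15, and the function name are all assumptions made for illustration, as is the use of the population standard deviation of the sample.

```python
from statistics import mean, pstdev


def to_standard_score(raw, norm_mean, norm_sd, new_mean=100.0, new_sd=15.0):
    """Convert a raw score to a scale with a chosen mean and SD."""
    if norm_sd <= 0:
        raise ValueError("norm_sd must be positive")
    z = (raw - norm_mean) / norm_sd  # distance from the mean in SD units
    return new_mean + new_sd * z


# Hypothetical raw scores from a small norm group (illustrative only).
norm_sample = [12, 15, 18, 20, 20, 22, 25, 28]
m = mean(norm_sample)
s = pstdev(norm_sample)  # assumption: SD of the norm group itself

for raw in (15, 20, 25):
    print(raw, round(to_standard_score(raw, m, s), 1))
```

A raw score equal to the norm-group mean maps to the new mean, and every other score keeps its relative position, which is why the converted scores are easier to interpret than raw ones.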
  • Book cover image for: Principles of Assessment and Outcome Measurement for Allied Health Professionals
    • Alison J. Laver-Fawcett, Diane L. Cox(Authors)
    • 2021(Publication Date)
    • Wiley-Blackwell
      (Publisher)
    SECTION 2 CONCEPTS FOR ASSESSMENT AND MEASUREMENT
    Principles of Assessment and Outcome Measurement for Allied Health Professionals: Practice, Research and Development, Second Edition. Alison J. Laver-Fawcett and Diane L. Cox. © 2021 John Wiley & Sons Ltd. Published 2021 by John Wiley & Sons Ltd.
    CHAPTER 7 Standardisation
    OVERVIEW
    This chapter discusses what is meant by standardisation and a standardised test. The chapter also explores the definitions of key terms, including the normal distribution, percentile ranks, standard deviation, and mean. See Chapter 1 to review terminology. The process for developing an assessment for use in practice is described in Chapter 14.
    QUESTIONS TO CONSIDER
    1. What is standardisation?
    2. What are the benefits of standardisation versus non-standardisation?
    3. What is a normal distribution curve? Why is this important in standardisation?
    4. What is standard deviation?
    STANDARDISATION
    Standardisation (see Chapter 1, p. 22) is the process of taking an assessment and developing a fixed protocol for its administration and scoring, and then conducting psychometric studies (see Chapter 15) to evaluate whether the resulting assessment has acceptable levels of validity and reliability. There are two ways in which assessments can be standardised: either in terms of procedures, materials, and scoring or in terms of normative standardisation (de Clive-Lowe 1996). The first method of standardisation involves the provision of detailed descriptions and directions for the test materials; method of administration; instructions for administration; scoring; and interpretation of scores (Jones 1991; see Chapter 14). Standardisation ‘extends to the exact materials employed, time limits, oral instructions, preliminary demonstrations, ways of handling queries from test takers, and every other detail of the testing situation’ (Anastasi 1988, p. 25).
  • Book cover image for: Developing Norm-Referenced Standardized Tests
    Chapter 4 Standardizing an Assessment
    James Gyurke
    James Gyurke, PhD, is Project Director for Infancy and Early Childhood at The Psychological Corporation, 555 Academic Court, San Antonio, TX 78204-0952.
    Aurelio Prifitera
    Aurelio Prifitera, PhD, is Senior Project Director in Neuropsychology at The Psychological Corporation.
    What is well done is done soon enough.
    –Seigneur Du Bartas Divine Weekes and Workes (1578)

    INTRODUCTION

    The primary task for many therapists is to obtain answers to questions they have about a child’s functioning. These questions are typically of the form: Does this child need services? Does this child continue to need the level of services s/he has been getting? Has this child benefited from the services s/he has received? The most practical and efficient way to obtain an answer to any or all of these questions is to employ a standardized assessment instrument.
    Standardized tests, as the name suggests, are methods that rely upon uniform administration and scoring procedures for obtaining a sample of behavior. Implicit in this definition is that across each and every administration of a standardized assessment, the examiner attempts to keep testing conditions, item administration procedures, and scoring procedures consistent with guidelines set forth in the testing manual. The rationale for this is quite simple: by adhering to standardized procedures, the examiner is able to compare the results of one testing to those of another, and thus provide meaning to those results.
    Following is a discussion of the major steps involved in standardizing a test. This information is intended to alert the reader to the general issues involved in standardizing a norm-referenced assessment. Specific standardization procedures differ depending upon the type of test, the size of the standardization sample, and the amount of data being collected.
  • Book cover image for: Norms in Human Development
    (Durkheim, 1901, pp. 87, 91)
    As in psychometrics, a norm is a description of what is common in a group of people in contrast to the exceptional or morbid which is individually specific. This interpretation amounts to the social version of methodological behaviourism. In consequence, it shares the self-limiting consequence of psychometrics in being officially precluded from distinguishing ‘what is common in some group’ from ‘what ought to be common in that group’. Yet under Durkheim’s account, the latter is not the stuff of science. Evidently, the President of Harvard University was siding with this interpretation in stating that recent advances in behavioural genetics, rather than socialization theory, had now established normality distributions for human attributes such as
    height, weight, propensity for criminality, overall IQ, mathematical ability, scientific ability [since there is] a difference in the standard deviation, and variability of a male and a female population. (Summers, 2005)
    He also made a significant admission, that this evidence ‘ought to influence the way one thought about other areas where there was a perception of the importance of socialization’. This admission is normative in a non-descriptive sense. Left unexplained was the basis of this normative judgment in behavioural genetics, currently regarded as a non-normative science.
    Norms in human development: introduction
    Social psychology: norm as social control
    In this interpretation, norms are directives from other people in authority, investigable in psychology by reference to their effects in social action. The elegant summary contained in the title of a famous book, Obedience to authority, carried the implication that norms are just that. On this view, social action corresponds to behavioural compliance, that is ‘the action of a subject when he or she goes along with his peers’. This compliance is social, behavioural and observable.
  • Book cover image for: Geriatric Neuropsychology
    eBook - PDF

    Geriatric Neuropsychology

    Assessment and Intervention

    • Deborah K. Attix, Kathleen A. Welsh-Bohmer(Authors)
    • 2013(Publication Date)
    Although this is not a problem if one is interested only in a comparison of the experimental (or quasi-experimental) group and the control group, it may lead to serious misinterpretations of results, should one use such a control group as a reference for clinical decision making. Test users should therefore be cautious when comparing an elder patient to these norms, and remember that norms are context specific.
    Defining the Normative Standard
    The term norm implies something that is common, standard, and normal. This statement begs the question, “Normal or standard for what?” There are potentially as many normal populations as there are assessment questions. For example, the Graduate Record Examination (GRE; Educational Testing Service, 2004) is designed to aid in the selection of future graduate students, and as such its norms were developed on a narrowly defined group of graduate school applicants. On the other hand, the WAIS-III (Wechsler, 1997a) is designed for placing individuals’ intellectual performance on the continuum that theoretically encompasses the entire U.S. population. As such, this instrument was normed on a much broader and inclusive sample.
    As can be seen from these illustrations, the intended assessment purposes and client/patient characteristics provide the conceptual framework for the composition of comparison populations. This framework must then be operationalized by clearly stated and well-defined inclusion and exclusion criteria. Inclusion criteria are typically generous and include points such as being of the appropriate age and able to comprehend instructions.
  • Book cover image for: Handbook of Educational Theories
    PSYCHOMETRIC THEORY: INDIVIDUAL NORM-REFERENCED STANDARDIZED ASSESSMENT
    J. O. Willis, R. Dumont, and A. S. Kaufman
    Psychometric theory for individual, norm-referenced, standardized assessment rests on several essential foundations.
    FOUNDATIONS OF PSYCHOMETRIC THEORY
    • Persons possess, and differ in, certain abilities, such as vocabulary, visual-motor speed, and reading comprehension.
    • These individual abilities are manifested in—and can be inferred from—overt behaviors, such as defining words, answering factual questions, solving puzzles, rapidly copying rows of symbols with a pencil, and correctly filling in missing words removed from a reading passage.
    • Tests can be constructed to measure performance on a relatively small sample of the overt behaviors, such as a list of words to be defined, a list of questions or set of puzzles, a page with rows of symbols to be copied, or a set of reading passages with every fifth word blanked out.
    • The test can be “standardized” with a standard set of instructions that includes the precise items, particular wording of instructions and other aspects of test administration, and rules for scoring responses to the test items (such as pass/fail, different point values for various possible responses, or scores based on the time required to complete the task correctly).
    • The strength of the underlying ability can be estimated from performance on the test.
    • Test performance can be summarized by a score on the test, such as the number of items completed correctly or points earned according to the standardized scoring rules.
    • A standardized test can be normed by administering it to a representative sample of people and using those persons’ scores as a yardstick for assessing the performance of other persons who take the test.
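The last foundation, using a representative sample's scores as a yardstick, can be sketched as a percentile-rank lookup against a sorted norm sample. This is a minimal illustration under stated assumptions: the scores are invented, the function name is hypothetical, and percentile rank is taken here as the percentage of the norm group scoring at or below the examinee, which is only one of several conventions used in test manuals.

```python
from bisect import bisect_right


def percentile_rank(score, sorted_norm_scores):
    """Percentage of the norm group scoring at or below `score`."""
    if not sorted_norm_scores:
        raise ValueError("the norm sample must not be empty")
    # bisect_right counts how many sorted scores are <= score.
    at_or_below = bisect_right(sorted_norm_scores, score)
    return 100.0 * at_or_below / len(sorted_norm_scores)


# Hypothetical raw scores from a norm sample (illustrative only).
norm_scores = sorted([31, 35, 38, 40, 42, 42, 45, 47, 50, 55])

for examinee_score in (30, 42, 55):
    print(examinee_score, percentile_rank(examinee_score, norm_scores))
```

The same raw score can yield a different percentile rank against a different norm sample, which is why the composition of that sample matters as much as the arithmetic.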
  • Book cover image for: Principles of Assessment and Outcome Measurement for Occupational Therapists and Physiotherapists
    • Alison Laver Fawcett(Author)
    • 2013(Publication Date)
    • Wiley
      (Publisher)
    standardised? The word standard has been defined as a ‘weight or measure to which others conform or by which the accuracy or quality of others is judged’ or ‘a degree of excellence etc. required for a particular purpose’ and the verb to standardise means to ‘make conform to a standard; determine properties of by comparison with a standard’ (Sykes, 1983). In the context of therapy, the AOTA defines the word standardised as ‘made standard or uniform; to be used without variation; suggests an invariable way in which a test is to be used, as well as denoting the extent to which the results of the test may be considered to be both valid and reliable’ (American Occupational Therapy Association, 1984, cited by Hopkins and Smith, 1993b, p. 914). Standardisation, therefore, ‘implies uniformity of procedure in administering and scoring [a] test’ (Anastasi, 1988, p. 25).
    Cole et al. (1995) describe a standardised test as a measurement tool that is published and has been designed for a specific purpose for use with a particular population. They state that a standardised test should have detailed instructions explaining how and when it should be administered and scored and how to interpret scores. It should also present the results of investigations to evaluate the measure’s psychometric properties. These instructions are usually contained in a test protocol, which describes the specific procedures that must be followed when assessing a client (Christiansen and Baum, 1991). Details of any investigations of reliability and validity should also be given. In order to maintain standardisation, the assessment must be administered according to the testing protocol. The conditions under which standardised tests are administered have to be exactly the same if the results recorded from different clients by different therapists, or from the same person by the same therapist on different occasions, are to be comparable.
    Being standardised does not necessarily mean that the test is an objective measure of externally observable data. Therapists also develop standardised tools that enable clinicians to record internal, unobservable constructs, for example a person’s self-report of feelings, such as pain, sadness or anxiety. However, any standardised test, whether of observable behaviours or psychological constructs, should be structured so that the method of data collection will yield the same responses for a person at a specific moment in time and the same responses for the person being tested regardless of which therapist is administering the test.
  • Book cover image for: Using Assessment Results for Career Development
    The usefulness of assessment results in career counseling is determined by norms. In using norms, the practitioner should consider: When norms should be used; What kind of norms should be used; and How much weight should be given to norms. Norms represent the level of performance obtained by the individuals (normative sample) used in developing score standards. Norms can thus be thought of as typical or normal scores. Norms for some tests and inventories are based on the general population. Other norms are based on specific groups such as all 12th-grade students, 12th-grade students who plan to attend college, left-handed individuals, former drug abusers, former alcoholics, or individuals with physical disabilities.
    Norm Tables
    The organization of norm tables varies somewhat from test to test. For example, the manual for the Self-Directed Search lists separate norms for males and females by middle school, high school, college and adult levels for two-letter and three-letter Holland types (Holland & Messer, 2013). The ASVAB provides norms in terms of gender (male, female, combined gender) and other demographic characteristics. The normative sample description is critical for understanding if a test should be used. In some manuals, only a brief description is given, leaving practitioners to assume that their clients resemble the normative population. Others, such as the Kuder Career Search, provide specific definitions of normative groups. Such detailed descriptions of persons sampled in standardizing an inventory provide good data for comparing the norm samples with client groups. In many instances, more information would be useful, such as score differences between age and ethnic groups and between individuals in different geographical locations. The more descriptive the norms are, the greater their utility and flexibility.
  • Book cover image for: Fundamentals of Psychological Assessment and Testing
    • John M. Spores(Author)
    • 2023(Publication Date)
    • Routledge
      (Publisher)
    Box 6.2 instructions).
    At this juncture, the psychologist has created the basic framework for her measure. Next, it needs empirical or factual support demonstrating that it is both reasonably precise and accurate. This broaches the important issues of test reliability and validity, respectively, which shall be elaborated upon subsequently.

    The Meaning of Test Scores

    The meaning of test scores depends upon whether the measure is either (a) criterion-referenced or (b) norm-referenced. Regarding the former, a criterion is a standard or objective; for example, mastery of at least 80% of the material. Criterion-referenced test scores identify a pre-determined cut-off above which the standard is interpreted as achieved (i.e., passed) and below which the standard is interpreted as not achieved (i.e., failed). Criterion-referenced psychological tests are employed most frequently when determining whether a requisite level of mastery has been realized. An example is a malingering (i.e., faking mental illness for an identified reinforcer) examination in which those applicants scoring 80% or above are deemed to be feigning (i.e., faking pathology) mental illness, whereas those scoring below 80% are judged to be genuine in their report of symptoms.
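A criterion-referenced interpretation of this kind reduces to comparing a score with a fixed cut-off, with no reference to how other examinees performed. The sketch below assumes the 80% cut-off used in the excerpt's example. The function name and the point totals are hypothetical, and the comparison is done in integers to avoid floating-point rounding at the boundary.

```python
def meets_criterion(points_earned: int, points_possible: int,
                    cutoff_percent: int = 80) -> bool:
    """Return True when the score is at or above the cut-off percentage."""
    if points_possible <= 0:
        raise ValueError("points_possible must be positive")
    if not 0 <= points_earned <= points_possible:
        raise ValueError("points_earned must lie between 0 and points_possible")
    # Integer form of: points_earned / points_possible >= cutoff_percent / 100
    return points_earned * 100 >= cutoff_percent * points_possible


# Hypothetical results on a 50-point measure (illustrative only).
for earned in (39, 40, 47):
    status = "criterion met" if meets_criterion(earned, 50) else "criterion not met"
    print(earned, status)
```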
    Alternatively, a psychological measure can be a norm-referenced test. In this instance, the meaning of an examinee’s score is derived by comparing it to scores previously determined by a sample of individuals who are comparable to the examinee in essential characteristics most likely to influence the test data; that is, the comparison must be unbiased and equitable. Such vital characteristics frequently include age, gender, race, ethnicity, educational level, socioeconomic status, and geographic region. Such a group determines what is an average, high, or low score and is referred to by various terms. These include norm reference sample, normative sample, norm group, standardization sample, or some combination of these terms. These are used interchangeably in the literature and subsequently throughout the remaining parts of this chapter. In addition, exclusionary criteria are regularly utilized. Exclusionary criteria
  • Book cover image for: Principles and Applications of Assessment in Counseling
    Frequency polygons and histograms graphically display the distribution of scores so that practitioners can easily identify trends. Measures of central tendency (i.e., mode, median, and mean) provide benchmarks of the middle or central scores. The mean or average score is often used to interpret assessment results. The measures of variability (i.e., range, variance, and standard deviation) indicate how scores vary and where an individual’s score may fall in relation to the scores of others. Standard deviation is the most widely used measure of variability in assessment. Standard deviation has some important qualities if the distribution approximates the normal curve. With a normal distribution, approximately 68% of the norming group will fall between 1 standard deviation above the mean and 1 standard deviation below the mean. With a normal distribution, the mean, mode, and median will all fall at the same point. There are numerous methods for transforming raw scores in norm-referenced instruments. Percentiles provide an indication of what percentage of the norming group had a score at or below a client’s. Standard scores, which convert raw scores so that there is always a set mean and standard deviation, express an individual’s distance from the mean in terms of the standard deviation of the distribution. The most basic standard score is the z score, which has a mean of 0 and a standard deviation of 1. Counselors need to be careful when they use instruments that incorporate age equivalent or grade equivalent norms; they must know precisely how the results were calculated and how to interpret them appropriately. Furthermore, counselors need to fully examine the norming group of any instrument and understand the strengths and limitations of that group. Copyright 2017 Cengage Learning. All Rights Reserved.
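The normal-curve properties summarized above can be checked numerically with Python's standard library. The sketch assumes that scores follow a normal distribution exactly, which real norming data only approximate, and the helper names are hypothetical. It computes the share of a normal distribution within a given number of standard deviations of the mean, and converts a z score to the percentile it would hold under that assumption.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1 (the z scale).
standard_normal = NormalDist(mu=0.0, sigma=1.0)


def share_within(k_sd: float) -> float:
    """Proportion of a normal distribution within k_sd SDs of the mean."""
    return standard_normal.cdf(k_sd) - standard_normal.cdf(-k_sd)


def z_to_percentile(z: float) -> float:
    """Percentage of a normal norm group expected at or below z."""
    return 100.0 * standard_normal.cdf(z)


print(f"within 1 SD of the mean: {share_within(1):.4f}")
for z in (-1.0, 0.0, 1.0):
    print(f"z = {z:+.1f} -> percentile {z_to_percentile(z):.1f}")
```

When the norm group's scores are clearly skewed, percentiles taken directly from the observed scores are the safer reading, since the conversion here holds only to the extent that the normal assumption does.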
Index pages curate the most relevant extracts from our library of academic textbooks. They’ve been created using an in-house natural language model (NLM), each adding context and meaning to key research topics.