The benefits of performance appraisal in the business world have caused an upsurge of books and programs for use in management, but few of the methods described bother to verify that the underlying psychology on which they are based holds true. Angelo DeNisi has spent 10 years conducting research into cognitive processes, particularly those of the rater, in performance appraisal.
A Cognitive Appraisal is a careful and thorough investigation of appraisal decisions. Based on experiments conducted with over 300 participants, Angelo DeNisi presents results from both laboratory and real-life settings in this vital area. The evidence described will be invaluable to all those involved in assessing the validity of particular performance 'packages' for use by themselves or their clients, and to other researchers in appraisal techniques. It is also an excellent guide for all psychologists who wish to verify their results in the field, as it tells the story of a long-term research program that successfully made the move from lab to field.

- 240 pages
- English
A Cognitive Approach to Performance Appraisal
Chapter 1
Why a cognitive approach?
There are few decisions made in modern organizations that do not somehow depend upon performance appraisals. Although it may be possible in a limited number of jobs to obtain objective performance information, more typically this is not the case. Instead, organizations are forced to rely upon some type of subjective evaluation of a person's performance. Systems that rely upon goals (e.g., MBO) still retain some aspect of subjectivity and judgment, even if it is just about what constitutes a meaningful goal. Given the subjective nature of these appraisals, it is not surprising that there have been volumes written about the errors, bias, inaccuracy, and inherent unfairness of most performance appraisals. What is perhaps more surprising, however, is that appraisals continue to be used, and used widely in most organizations.
For example, many, if not most, compensation systems include some type of merit pay component where a portion of a person's pay is determined by their performance on the job. In fact, more recent views of managerial compensation (e.g., Gerhart & Milkovich, 1993; Gomez-Mejia & Balkin, 1992; Gomez-Mejia & Welbourne, 1988), adopting an Agency Theory perspective (cf., Eisenhardt, 1989), have recommended that more of the executive's compensation should be "put at risk" and made dependent upon performance. Although in some of these cases, compensation is tied to accounting measures (e.g., Return on Equity) or financial performance (e.g., stock price), these trends suggest that performance will be even more important for determining compensation in the future. Furthermore, since accounting and financial indices are less relevant for assessing the performance of middle- to lower-level managers, and since few other objective measures are generally available to gauge their performance, merit-pay decisions based upon a performance appraisal will only become more important over time. It is worth noting, however, that there is a different perspective which suggests that, in the future, compensation decisions will be based more on skill or knowledge acquisition than upon performance (see discussions by Dewey, 1994; Gupta, Ledford, Jenkins & Doty, 1992; Tosi & Tosi, 1986).
Needs analysis, as the basis for the design and implementation of training programs, is also usually based upon these subjective evaluations of performance. That is, organizations typically include a performance appraisal as a critical part of establishing where training is needed and what kinds of training are needed (cf., Wexley & Latham, 1981). Furthermore, after the training has been implemented, program evaluation is typically concerned with changes in performance, among other potential outcomes and, in most cases, this performance and any changes are measured by a performance appraisal.
Performance appraisals are used as the basis for taking disciplinary action, in many cases, where the perception of performance that falls short of a standard or expectation typically triggers the action (cf., Arvey & Jones, 1985) and, in extreme cases, this action can include the decision to terminate an individual. Of course, performance appraisals are a major part of performance management programs, including the various coaching and developmental activities that take place as part of the performance management process. Finally, in cases where organizations need to validate selection techniques such as tests, or to answer questions about the adverse impact of such techniques, the criterion measure involved is typically some measure of performance which, in most cases, is a performance appraisal.
Furthermore, since performance appraisals are usually the only measure of performance available, they are also used as the criterion measures for a wide range of organizational topics. Thus, for example, when we discuss things such as organizational commitment or job satisfaction, we usually try to establish relationships between these variables and performance, often measured using performance appraisal. Training efforts, as well as other types of interventions, are also usually aimed at improving performance and/or productivity (although see Goldstein, 1993, for a discussion of the importance of determining the proper criterion measure for evaluating training efforts), and so performance appraisals play a role in evaluating these interventions as well.
The fact that appraisals are so important, and yet so prone to problems, goes far to explain why performance appraisal has been the focus of so much research activity for so long a period of time. It is intolerable to many managers, used to making rational decisions and having control over situations, to have to depend so much upon a measurement technique that inspires little trust. Therefore, for almost seventy years, scholars have been studying performance appraisals to understand what makes them so poor as indicators of "true" performance. As we shall see, even the definition of what is meant by "true" performance is open for debate, and the problems of how we would know if we did measure true performance accurately are also considerable. Nonetheless, through it all has been the hope that, if we could understand what causes all the problems with performance appraisals, we could figure out what to do to make them work better.
This book is about a program of research which has adopted a cognitive, process-oriented focus on performance appraisal. But this approach is only one of the latest approaches to studying the problem. In order to understand how we came to adopt such an approach, it is necessary to understand the state of appraisal research up until the 1980s. The now classic paper by Landy and Farr (1980) marked the beginning of the "cognitive era" in performance appraisal research, and the program of research discussed here is one example of the kind of research that adopts a cognitive approach. But how did we come to this approach? Why did process-oriented research on appraisals become so popular? What kinds of research did people do before (and still do now), that was not process oriented, or concerned with cognition? In order to provide this background, I will begin with a brief history of the research on performance appraisal. Other such reviews can be found in Landy and Farr (1980) and Feldman (1994), as well as in the appraisal texts by Cardy and Dobbins (1994) and Murphy and Cleveland (1991), while slightly less exhaustive reviews are also presented with the cognitive appraisal model proposed by DeNisi, Cafferty, and Meglino (1984) and the model proposed by Feldman (1981).
This historical view will then provide the introduction to some of the better-known cognitive appraisal models that have been proposed. The second chapter will present our cognitive model (the DeNisi, Cafferty, & Meglino, 1984 model), as well as a number of the research propositions that were generated by that model. These propositions were the beginning of the research program described here, and Chapter 3 (especially) describes a series of studies that were designed to test some of these propositions. But over time, the research program was dictated by the results of some of the studies we conducted. Thus, in Chapter 4 I begin discussing studies dealing with reprocessing objectives and interventions designed to help raters organize performance information. These two themes were not part of the original set of research propositions, but the results of the studies designed to relate acquisition strategies to ratings suggested that these were important directions for our research to follow. Chapter 5 returns to some of the original research propositions, but also describes the studies that began with simple ideas about affect and bias (which were part of the original propositions), but which eventually moved the program further in those directions than had been anticipated. Chapter 6 traces the progression of the research program from the lab, where all the prior studies had been conducted, to the field, and demonstrates that this line of research did have external validity. Finally, the last two chapters provide discussions of what I think we have learned from this research, and some ideas about where we should go from here. I begin, then, with a brief history of research on performance appraisal to set the stage for everything else to come.
RESEARCH ON PERFORMANCE APPRAISAL: AN HISTORICAL PERSPECTIVE
Interest in the evaluation of performance probably goes back well over a thousand years (see Murphy & Cleveland, 1991), but published research on appraisals goes back at least to 1920 with Thorndike's (1920) paper on rating errors. This paper not only represents one of the earliest attempts to deal with appraisal problems, it also represents the beginning of a very specific orientation to these problems that has lasted almost until today. Why should we be concerned about an "error" that causes ratings on different scales to be correlated with each other? At the time these studies were conducted, it was generally assumed that, if we could reduce rating errors, we would also increase rating accuracy. In fact, throughout much of the history of appraisal research, rating errors were seen as a proxy for rating (in)accuracy. Later in the book, this issue will be revisited, as there is no longer any real agreement that such correlations constitute an error at all. In fact, as we shall see later, there is disagreement over whether the relationship between these "errors" and accuracy is actually positive or negative, as well as disagreement over how accuracy should be operationalized, and even over whether accuracy should be the focus of our attention at all. But all of this was to come much later; in the 1920s there was general interest in reducing rating errors as a means of increasing rating accuracy.
It is also clear, in retrospect, that Thorndike's paper coincided with the widespread introduction of graphic rating scales, and that some of the "halo" being discussed was seen as resulting from the use of these scales (also see Rudd, 1921, for criticisms of graphic rating scales). Thus we were well down several paths that would influence (and perhaps hinder) appraisal research for many years to come: there was a focus upon rating errors; there was an assumption that reducing errors would result in an increase in rating accuracy; and there was the assumption that rating errors were due, at least in part, to the nature of the rating scale being used.
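To make the idea of correlated ratings concrete, the sketch below computes the average correlation between rating dimensions across a group of ratees, which is one simple way such "halo" has been indexed. It is an illustration only, not Thorndike's own procedure; the dimension names, the scores, and the function names are all hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length score lists.

    Assumes both lists vary; a constant list would divide by zero.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def mean_interdimension_correlation(ratings):
    """Average the correlations over every pair of rating dimensions.

    `ratings` maps a dimension name to one score per ratee,
    with ratees listed in the same order for every dimension.
    """
    pairs = list(combinations(ratings, 2))
    return sum(pearson(ratings[a], ratings[b]) for a, b in pairs) / len(pairs)


# Hypothetical ratings of five ratees by one rater on three dimensions.
ratings = {
    "quality": [5, 4, 2, 3, 1],
    "quantity": [5, 4, 2, 3, 1],
    "teamwork": [4, 5, 1, 3, 2],
}

halo_index = mean_interdimension_correlation(ratings)
print(f"Mean interdimension correlation: {halo_index:.2f}")
```

A high value shows only that the dimensions move together. As the surrounding discussion makes clear, it cannot by itself say how much of that covariance is rater error and how much reflects real covariation in performance.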
Of course, even early on in the process, there were voices that were fighting against this tide. Bingham (1939) argued that not all of what we called "halo error" should be considered "error," as some of the observed covariance could be attributed to "true" covariance. Similar arguments were voiced again much later, by Cooper (1981b), when they received more attention, but Bingham's (1939) protestations seemed to have little impact. Even more noteworthy, in 1952, Robert Wherry Sr. (Wherry, 1952) proposed a model of the appraisal process that was not only way beyond the simplistic views of appraisals and errors that were prevalent at the time, but also anticipated many of the "innovations" that came with the cognitive approach to performance appraisal of which our research program is part. We will discuss this model in more detail later, but, in the 1950s, this model was available only as an unpublished technical report, and it had little influence on the practice or research in performance appraisal. In fact, the paper was not even generally available until thirty years later, when a slightly abridged version was published (Wherry & Bartlett, 1982), and so this forward-thinking model actually had little impact on the early cognitive models, even though those models drew much of their support from the same sources as were tapped by Wherry (1952).
For the most part, then, the focus was on reducing rating errors, and the preferred solution appeared to lie with the nature of the rating scale. Furthermore, this focus and preference seemed to stay with us for quite a while. As a result, research concentrated on the best ways to construct rating scales in order to reduce rating errors, emphasizing such things as the number of scale alternatives and the degree to which traits are defined (e.g., Barrett, Taylor, Parker, & Martens, 1958; Bayeroff, Haggerty, & Rundquist, 1954; Blumberg, De Soto, & Keuthe, 1966; Peters & McCormick, 1966; Taylor & Hastman, 1956). In one interesting study, Johnson and Vidulich (1956) suggested that we could minimize halo by having raters evaluate every ratee on one dimension, and then move to the next dimension and evaluate every ratee on that one as well. (The interested reader should compare this to the notion of "task-blocked" information, which will be discussed in the next chapter.) These authors suggested that this could reduce halo error, although a subsequent re-analysis of the data (Johnson, 1963) concluded that the differences in halo attributable to the two rating methods were non-significant.
There was some concern that it was mistaken to rely upon these subjective evaluations, regardless of the particular scale used. Instead, it was argued, evaluations should be based upon more objective measures of productivity and output such as scrap rates and time required to complete a task (e.g., Rothe, 1946a, 1946b, 1949, 1951, 1978). But such measures are not available for many jobs, and scholars who have correlated objective measures with more subjective evaluations have found the correlations to be modest to low (e.g., Gaylord, Russell, Johnson, & Severin, 1951; Severin, 1952). Of course, these findings may simply confirm the belief that subjective models do not really capture the essence of performance on the job, but there is an alternative explanation as well. Wexley and Klimoski (1984) have argued that there is no "true" performance. Instead, all we have are different indicators of true performance, and measures such as output and performance ratings simply tap different aspects of performance and reflect different models of performance, termed outcome and person models. If we want to focus on only outcomes and not be concerned with understanding how we got the outcomes we did, or how to improve performance, then an outcome model would make sense. But, if we want to also understand how we got the level of performance we did so that we can work to improve performance, then we must focus on the person model, and performance appraisals, subjective though they may be, become important.
But these were digressions. The main thrust of the research was still on developing better rating instruments so that we could reduce rating errors. It is true that several authors reported disturbing findings, such as the failure to find any one rating format superior to another (e.g., Taylor & Hastman, 1956), or the fact that rating format seemed to have less effect on halo than did rater training (Brown, 1968), but these only seemed to spur researchers even further to develop even better rating instruments.
In fact, the range of alternative proposals was fairly amazing (see Landy & Farr, 1980, for a complete review of these developments). For example, John Flanagan proposed an appraisal method (still very much in use today) which focused attention on the identification of incidents of especially good or especially poor performance, called critical incidents, rather than on the entire set of behaviors that a ratee might exhibit on the job (Flanagan, 1949, 1954; Flanagan & Burns, 1955). At about the same time, Sisson (1948) proposed that we adapt a methodology commonly used in personality measurement to performance appraisal. He demonstrated techniques for determining two (or more) descriptive items which were equally desirable or undesirable, but only one of which had been determined (empirically) to discriminate between good and poor performers on the job. This Forced Choice technique would then present raters with the pair of items and ask raters to choose the one which best described the ratee. Only if the rater selected the "discriminating" item did the ratee receive any credit for the rating. This was supposed to prevent raters from providing overly lenient ratings, but had the effect of communicating to the rater that he or she could not be trusted. Forced Distribution methods (which Schmidt & Johnson, 1973, argue are particularly effective with large numbers of ratees) force a rater to limit the number of ratees assigned to each rating category. Although any distribution could be "forced" in this way, it is most commonly used to ensure that ratings are normally distributed, eliminating leniency, central tendency, and any other distributional "error" (cf., Berkshire & Highland, 1953).
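The mechanics of a Forced Distribution can be sketched in a few lines. The example below is a minimal illustration, assuming that each ratee already has a single overall score and that the quotas are given as whole percentages running from the best category to the worst. The category labels, the 10/20/40/20/10 split (a rough bell shape), and the function name are hypothetical and are not taken from any of the studies cited above.

```python
def forced_distribution(scores, quotas):
    """Assign ratees to categories under fixed quotas.

    `scores` maps each ratee to an overall score (higher is better).
    `quotas` is a list of (category, percent) pairs ordered from the
    best category to the worst; the percents are assumed to sum to 100.
    Ties are broken by the order in which ratees appear in `scores`.
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    n = len(ranked)
    assignment = {}
    start = 0
    cumulative_percent = 0
    for category, percent in quotas:
        cumulative_percent += percent
        # Index of the last ratee allowed into this category.
        end = round(cumulative_percent * n / 100)
        for ratee in ranked[start:end]:
            assignment[ratee] = category
        start = end
    return assignment


# Hypothetical overall scores for ten ratees.
scores = {"A": 91, "B": 85, "C": 78, "D": 74, "E": 70,
          "F": 66, "G": 61, "H": 55, "I": 48, "J": 40}

# Hypothetical quotas approximating a normal distribution.
quotas = [("top", 10), ("above average", 20), ("average", 40),
          ("below average", 20), ("bottom", 10)]

assignment = forced_distribution(scores, quotas)
for ratee, category in assignment.items():
    print(ratee, category)
```

Because the category sizes are fixed in advance, leniency and central tendency cannot show up in the resulting distribution, whatever the raw scores look like. That is the sense in which the method removes distributional "error" by construction rather than by improving the underlying judgments.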
A major innovation came in the 1960s with Behaviorally Anchored Rating Scales (BARS; Smith & Kendall, 1963). Here, the raters were completely involved in the development of the scales and the anchors to be used, and those anchors were expressed in terms of specific behaviors rather than adjectives such as "poor" or "outstanding." These scales received a great deal of attention in the literature, as researchers proposed alternative ways to develop and implement these scales (e.g., Arvey & Hoyle, 1974; Bernardin, LaShells, Smith, & Alvares, 1976), and also presented a number of comparisons of these scales with other scale formats (e.g., Bernardin, Alvares, & Cranny, 1976; Campbell, Dunnette, Arvey, & Hellervik, 1973; Dickinson & Zellinger, 1980). There was also a variant proposed, called Behavioral Observation Scales (BOS; e.g., Kane & Bernardin, 1982; Latham, Fay, & Saari, 1979; Latham & Wexley, 1977) which required raters to observe and note behaviors rather than evaluate them. It will be interesting to think back to this proposal later, when in Chapter 6 I discuss the different definitions of rating accuracy, which include something called behavioral accuracy.
Other proposals over the years included Mixed Standard Rating Scales (Blanz & Ghiselli, 1972), which required raters to indicate whether a ratee performed below, above, or at the level of performance described by a series of statements, and which scattered the items relevant for a given dimension throughout the questionnaire, and a rather different approach called Performance Distribution Assessment (PDA; Kane, 1982a, 1982b), which is concerned that raters consider the variance in performance over time as well as the level of performance most typically displayed. All of these efforts led to a long line of studies dedicated to determining which format was superior. None of the studies actually compared all the formats, but the numerous comparative studies failed to reveal one clearly superior type of scale (see Landy & Farr, 1980, for a review). It is important to note here that the basis for these comparisons was usually the same: the level of psychometric errors present in the data, and perhaps some consideration of interrater reliability. Clearly, we were still willing to assume that the absence of rating errors indicated more accurate ratings. Thus, by the time Landy and Farr wrote their review paper, there had been a great deal of research on rating scale content and format, and the sum of this research was to note that no one scale was superior to others in terms of reducing errors or increasing interrater reliability.
Before discussing how appraisal research changed following the publication of Landy and Farr's paper, it is important to take note of another trend in the literature that was developing at the same time as the literature on rating scales. For much of the same time, research was also being conducted on the role of rater training. For the most part, rater training programs were designed to familiarize raters with the rating instruments and the jobs in question (e.g., Brown, 1968), but by the 1970s, this emphasis had shifted towards training programs designed to reduce the incidence of rating errors such as halo and contrast effects (e.g., Latham, Wexley, & Pursell, 1975; Wexley, Sanders, & Yukl, 1973). Following much the same logic as was driving scale format research, the design of training research was based on the premise that reducing psychometric errors was actually a proxy for improving accuracy, so that a program which successfully reduced the incidence of errors was assumed to also increase accuracy. But, as the emphasis in the field moved to one where there was more concern for rater decision making, and as scholars became more suspicious of the relationship between errors and rating accuracy, the focus of training research changed as well. In the "cognitive" era, training designed to make raters better raters by providing clear standards and instruction on their use (Frame of Reference Training; e.g., Bernardin & Buckley, 1981; Pulakos, 1984, 1986) became much more prevalent, and we saw the appearance of programs designed to sensitize raters to errors in processing as well (e.g., Steiner, Dobbins, & Trahan, 1991), while the criteria for assessing the effectiveness of these programs moved away from rating errors and towards rating accuracy.
THE COGNITIVE SHIFT IN APPRAISAL RESEARCH
It is clearly the case that Landy and Farr's review paper shifted the focus of researchers away from rating scale format and er...
Table of contents
- Cover Page
- Title Page
- Copyright Page
- Illustration
- Preface
- Chapter 1 Why a cognitive approach?
- Chapter 2 A cognitive model of the appraisal process
- Chapter 3 Research on information acquisition processes
- Chapter 4 Reprocessing objectives and interventions designed to impose organization in memory
- Chapter 5 Other factors and other cognitive processes
- Chapter 6 Cognitive research moves to the field
- Chapter 7 Implications for theory and practice
- Chapter 8 Where do we go from here?
- Bibliography