1
COGNITIVE LOAD THEORY
John Sweller
In this introductory chapter I discuss how human cognitive architecture allows us to learn, think and solve problems and how that architecture can be used to design instruction. Cognitive load theory (Sweller, 2015, 2016; Sweller, Ayres, & Kalyuga, 2011) uses our understanding of human cognitive architecture to generate hypotheses about potentially novel instructional procedures. The effectiveness of those procedures is tested using randomised, controlled trials with current instructional procedures providing control conditions. Characteristically, to avoid altering multiple variables simultaneously, multiple controlled trials are required. The result, over more than 30 years, is a theory that has continually evolved as new data and new concepts have become available, delivering a large corpus of cognitive load effects that provide efficacious instructional procedures. The chapters of this volume continue this tradition.
In its current formulation, the cognitive architecture used by cognitive load theory is heavily influenced by evolutionary psychology, used both to indicate the category of knowledge that is amenable to instruction and to provide the structures and functions of human cognition that deal with instructable knowledge. Together, knowledge categories along with the structures and functions of human cognitive architecture based on evolutionary psychology provide an important part of the theoretical base for cognitive load theory. In this introduction, each will be considered in turn.
Categories of knowledge
Knowledge can be categorised in many different ways but very few categories have instructional implications. One categorisation scheme that has important instructional implications is based on evolutionary educational psychology (Geary, 2008, 2012; Geary & Berch, 2016; Paas & Sweller, 2012).
Geary divides knowledge into biologically (or evolutionarily) primary and secondary categories. We have specifically evolved to acquire biologically primary knowledge. Examples are learning to listen to and speak a native language, learning general problem-solving strategies such as generalising from one problem solution to another, or learning to recognise faces. There are several important characteristics of biologically primary knowledge. It is modular with only limited overlap between one category of primary knowledge and another. The modularity of primary skills is due to the differing epochs during which each skill evolved. For example, our ability to learn to listen to and speak a native language and our ability to learn to generalise problem solutions to other similar problems are likely to have evolved independently at different times.
Another important characteristic of biologically primary knowledge is that much of it is generic-cognitive. A generic-cognitive skill is a basic, mental skill that applies to a variety of domains. Learning to use a general problem-solving strategy provides an example of a generic-cognitive skill. As a more specific example, a means-ends strategy (Newell & Simon, 1972), in which we attempt to reduce the differences between where we are in a problem and the goal of a problem, is used by most animals attempting to reach a food source. We, along with many animals, have evolved to acquire that skill and many other generic-cognitive skills.
Because we have evolved to automatically acquire biologically primary skills such as generic-cognitive skills, they can be learned by most people but cannot be taught (Tricot & Sweller, 2014). Due to the vital importance of most generic-cognitive skills, it is easy to assume that they should be the subject of instruction. For example, there is frequent advocacy for teaching general problem-solving skills that apply to a variety of unrelated problems. An example is the means-ends strategy described in the previous paragraph. Despite the importance of generic-cognitive skills and the advocacy in favour of teaching them, there is very limited evidence that teaching them improves general performance over a range of tasks (Sala & Gobet, 2017). Of course, if we have evolved to acquire generic-cognitive skills without tuition, attempts to teach such skills are likely to be futile (Sweller, 2015; Tricot & Sweller, 2014).
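For readers who want the means-ends strategy referred to above made concrete, it can be sketched in a few lines of Python. This is an illustration only: it is a simplified, greedy difference-reduction reading of the strategy and not Newell and Simon's full formulation, and the names `means_ends`, `manhattan` and `MOVES`, as well as the grid world itself, are hypothetical choices made for this sketch.

```python
def means_ends(start, goal, moves, difference, max_steps=100):
    """Repeatedly apply the move that most reduces the difference to the goal.

    Returns the list of states visited, or None if no available move
    reduces the difference (the simplified strategy is then stuck).
    """
    state = start
    path = [state]
    for _ in range(max_steps):
        if difference(state, goal) == 0:
            return path
        candidates = [move(state) for move in moves]
        best = min(candidates, key=lambda s: difference(s, goal))
        if difference(best, goal) >= difference(state, goal):
            return None
        state = best
        path.append(state)
    return None


def manhattan(a, b):
    """Difference between two grid positions (assumed measure for this toy)."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


# Four hypothetical moves: one step right, left, up or down on a grid.
MOVES = [
    lambda p: (p[0] + 1, p[1]),
    lambda p: (p[0] - 1, p[1]),
    lambda p: (p[0], p[1] + 1),
    lambda p: (p[0], p[1] - 1),
]

# An animal at (0, 0) moving towards food at (2, 1).
print(means_ends((0, 0), (2, 1), MOVES, manhattan))
```

Nothing in the sketch is specific to grids or food: the same loop applies to any problem for which a current state, a goal and a measure of the difference between them can be specified, which is the sense in which the strategy is generic.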
Biologically secondary knowledge differs from primary knowledge in several instructionally relevant areas, beginning with the fact that unlike primary knowledge, it is instructable. Indeed, the clearest examples of biologically secondary knowledge can be found in educationally relevant areas. Almost everything taught in educational contexts consists of biologically secondary knowledge. Schools and other educational institutions were invented because of the need to teach biologically secondary areas.
Unlike primary knowledge, secondary knowledge is not modular and so we have not specifically evolved to acquire various categories of secondary knowledge. Instead, similar procedures and the same cognitive architecture are used to acquire all types of secondary knowledge. There are two other major differences between biologically primary and secondary knowledge. While, as indicated above, much biologically primary knowledge is generic-cognitive, secondary knowledge is largely domain-specific. We may have evolved to solve a large variety of problems using a generic-cognitive problem-solving strategy such as means-ends analysis but we are most unlikely to have evolved to solve an algebra problem such as (a + b)/c = d, solve for a, by multiplying out the denominator on the left-hand side as the first move. That strategy is specific to a particular class of algebra problem and of no use when solving any other category of problems. Unlike generic-cognitive skills, it needs to be taught and deliberately learned because we have not specifically evolved to acquire this strategy.
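Written out in full, the algebra example above takes two moves. The worked steps below add nothing beyond the standard manipulation; the only assumption is that c is non-zero, which the original equation already requires.

```latex
\begin{align*}
\frac{a + b}{c} &= d \\
a + b &= dc && \text{(multiply both sides by } c\text{)} \\
a &= dc - b && \text{(subtract } b \text{ from both sides)}
\end{align*}
```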
In addition, biologically secondary, domain-specific skills need to be explicitly taught and deliberately learned (Kirschner, Sweller, & Clark, 2006; Sweller, Kirschner, & Clark, 2007). Because we have not specifically evolved to acquire them, they will not be automatically learned like biologically primary, generic-cognitive skills. As a consequence, explicit instruction is vital in most areas. The frequent failure of our field to distinguish between biologically primary, generic-cognitive knowledge and biologically secondary, domain-specific knowledge has had an unfortunate consequence. Many instructional designers and many instructional theories assume that, because we are able to learn easily and automatically without explicit instruction outside of educational contexts, the same limited guidance procedures are educationally superior to explicit instruction.
There are differences between how we learn within an education context as opposed to learning in environments external to formal education, but those differences are due to the different categories of knowledge being dealt with. An example of the problems that can arise by ignoring the distinction between biologically primary, generic-cognitive and biologically secondary, domain-specific skills, can be found by considering the examples above of learning to use a general problem-solving strategy such as means-ends analysis externally to an education context as opposed to learning to solve an algebraic equation. The generic means-ends strategy is never explicitly taught because it does not need to be taught. Because of its importance, it is acquired automatically as a biologically primary skill and so it would be futile to attempt to teach it. In contrast, learning that we need to multiply out the denominator when solving a particular algebra problem is a domain-specific, biologically secondary skill that we will not learn unless it is taught. Assuming that the two skills belong to the same category and are acquired in the same way and so should be taught in the same way is likely to result in instructionally flawed procedures.
Human cognitive architecture
Cognitive load theory is mainly concerned with biologically secondary information because most educationally relevant areas deal with such information. Biologically secondary information is processed, stored and used according to a specific cognitive architecture. That architecture is used by cognitive load theory and can be outlined by five basic structures and functions. It should be noted that the same processes govern the functioning of evolution by natural selection (Sweller & Sweller, 2006). Both human cognition and evolution by natural selection are examples of natural information processing systems and as a consequence, the two systems are closely analogous.
Long-term memory structure and function. Knowledge is the driver of intellectual performance, and knowledge is stored in schematic form in long-term memory. Our ability to store knowledge is a biologically primary skill. Curiously, until relatively recently, intellectual performance tended to be associated more with terms such as “thinking”, “problem-solving” or “creativity” than with “knowledge”. Even more curiously, we have known of the centrality of knowledge to problem-solving skill for a very long time. De Groot (1965), in a book first published in 1946, demonstrated that the only obvious difference between more able and less able chess players was their knowledge of large numbers of chess board configurations. A good chess player has learned to recognise chess board configurations and the best moves associated with them. Chess is a game of problem solving but problem-solving skill can be entirely explained by biologically secondary, domain-specific knowledge held in long-term memory. No other explanation has been required over the last 70 years. The same base of biologically secondary, domain-specific skill can be assumed to apply to every acquired, knowledge-based area including areas taught in educational institutions. Skilled performers in all such areas have acquired enormous stores of schematic knowledge held in long-term memory. Accordingly, instructional procedures need to place their emphasis on knowledge acquisition.
Borrowing and reorganising function. How is knowledge best acquired? Humans have evolved to acquire knowledge, including biologically secondary knowledge, from others. It is a biologically primary skill. We are one of the few species to have evolved to both present information to and acquire information from other members of the species. Instruction should facilitate that process by an appropriate organisation of written, spoken and diagrammatic material. Cognitive load theory was devised to assist in the realisation of this goal of effective and efficient knowledge transfer between people.
Randomness as genesis function. While the most effective method of obtaining information is from other people either directly or via media, sometimes that information is unavailable either because it cannot be accessed or because it has not yet been invented. Under those circumstances, problem solving can provide an alternative route. All problem-solving strategies consist of an amalgam of knowledge and random generation and test. We will use knowledge as far as possible to solve problems but if knowledge cannot be used to generate a problem-solving move, we have no choice but to randomly generate a move and test it for effectiveness. We do not need to be taught this procedure because it is biologically primary. Random generation and test can work but it is slow and inefficient compared to obtaining relevant information from another person. In instructional contexts, it should not be used as a substitute for providing learners with information.
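The amalgam of knowledge and random generation and test described above can be illustrated with a toy sketch. It is not a model drawn from the cognitive load literature: the function `solve`, the four-move puzzle in `SECRET` and the effectiveness test `works` are all hypothetical, chosen only to show that each missing piece of knowledge must be paid for with trial-and-error moves.

```python
import random

# Hypothetical puzzle: one correct move per step, unknown to the solver.
SECRET = ["b", "d", "a", "c"]
OPTIONS = ["a", "b", "c", "d"]


def works(step, move):
    """Test for effectiveness: did this move succeed at this step?"""
    return SECRET[step] == move


def solve(n_steps, options, works, knowledge, rng):
    """Return (solution, number_of_moves_tested).

    `knowledge` maps a step to a move already known to work. Where
    knowledge is available it is used directly; otherwise moves are
    randomly generated and tested until one succeeds.
    """
    solution, tests = [], 0
    for step in range(n_steps):
        if step in knowledge:
            solution.append(knowledge[step])  # knowledge: no search needed
            continue
        while True:
            move = rng.choice(options)  # random generation
            tests += 1
            if works(step, move):  # test for effectiveness
                solution.append(move)
                break
    return solution, tests


rng = random.Random(0)
# A solver with no knowledge versus one given every move by someone else.
print(solve(4, OPTIONS, works, {}, rng))
print(solve(4, OPTIONS, works, dict(enumerate(SECRET)), rng))
```

Both calls reach the same solution, but the solver supplied with the relevant information tests no moves at all, whereas the solver without it must test at least one move per step and usually many more.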
Working memory structure and function when processing novel information. After it is perceived, novel information obtained from the external environment is first processed by working memory. Information can be obtained either from others via the borrowing and reorganising function or during problem solving via the randomness as genesis function. Once processed, if potentially useful subsequently, it can be stored in long-term memory. When dealing with novel information, this structure is characterised by its very limited capacity (Cowan, 2001; Miller, 1956) and duration (Peterson & Peterson, 1959). No more than 3–4 items of novel information can be processed in working memory at a time, and those items cannot be held without rehearsal for more than about 20 seconds. This structure is central to cognitive load theory. By definition, learners deal with novel information. Instructional procedures that ignore this structure are likely to be random in their effectiveness.
Working memory structure and function when processing familiar information. Working memory not only processes novel information obtained via the senses, it also processes familiar information that has been stored in long-term memory using the previously described structures and functions. Unlike working memory when dealing with novel information, there are no known limits when working memory deals with organised, familiar information transferred from long-term memory. When activated by appropriate environmental signals, working memory can transfer large amounts of automated information from long-term memory, keep that information active in working memory for indefinite periods, and so allow the generation of action appropriate to that environment. This function of working memory when processing familiar information stored in long-term memory provides the ultimate justification of the cognitive system and explains the importance of education. Our ability to effortlessly and appropriately access and process huge amounts of automated, biologically secondary information explains the transformational consequences of education.
Instructional consequences of human cognitive architecture
The above cognitive architecture provides the base for the instructional design procedures used by cognitive load theory. Assuming that architecture, the major function of instruction is to alter the contents of long-term memory. Once altered, that information can be transferred to working memory transforming the ability of learners to function in a particular environment. They can derive meaning from print where others only see apparently random squiggles; they can immediately and effortlessly solve mathematical problems that others find impossibly complex. In general, what is otherwise meaningless and unintelligible can become obvious, routine and automatic. Of course, before reaching this state, novel, biologically secondary information must be transferred to long-term memory from the environment via a severely restricted working memory. Facilitating this process is the primary function of cognitive load theory. In order to do so, further categorisation of information using the concept of element interactivity is required.
Element interactivity
Element interactivity is a central concept of cognitive load theory (Sweller, 2010). Some information consists of individual, largely unrelated elements of information. Where elements of information are unrelated, they can be processed with minimal or no reference to each other. For example, if you need to learn some of the nouns of a foreign language or the symbols of the chemical periodic table, each element can be learned without reference to any other element. Element interactivity is low. In contrast, at the other extreme, some information can be very high in element interactivity because each element interacts with a multitude of other elements. For example, if a student must learn how to balance a chemical equation or solve an algebra problem, no element can vary without affecting many of the other elements. Any change to a pronumeral in an algebra equation has consequences for the entire equation.
These differences in element interactivity have working memory consequences. Learning the nouns of a foreign language is likely to be a very difficult, time-consuming task but it does not impose a heavy working memory load because element interactivity is low. In contrast, learning to solve an algebra problem such as (a + b)/c = d, solve for a, involves far fewer eleme...