Psychometrics in the 21st century
Psychometrics depends on the availability of data on a large scale, and so it is no surprise that the advent of the internet has massively boosted its influence. If we had to date the internet, we would probably start at CERN, the European Organization for Nuclear Research, in Geneva, with Tim Berners-Lee's invention of the World Wide Web in 1990; he linked the newly developed hypertext markup language (HTML) to a graphical user interface (GUI), thereby creating the first web pages. Since then, the web has expanded to make Marshall McLuhan's "global village" a reality (McLuhan, 1964). The population of this global village grew from a handful of academics in the early 1990s to a diverse and vibrant community of one billion users in 2005, and to over four billion users (representing more than 50% of the world's population) in 2020. Thus, within less than 20 years, the new medium of cyberspace came into existence, creating a completely new science with new disciplines, new experts, and, of course, new problems. Some aspects of this new science are exceptional. While the science of biology is only 300 years old, and that of psychology considerably younger, both their subjects of study, humans and life itself, have existed for millions of years. Not so the internet. Hence the cyberworld is unique, and it is hard to predict what to expect of its future. It is also a serious disruptor; it has completely changed the nature of its adjacent disciplines, especially computer and information sciences, but also psychology and its progeny, psychometrics.
By the year 2000, the migration of psychometrics into the online world was well underway, producing both new opportunities and new challenges, particularly for global examination organizations such as the Educational Testing Service (ETS) at Princeton and Cambridge Assessment in the UK. On the positive side, gone were the massive logistical problems involved in securely delivering and recovering huge numbers of examination papers by road, rail, and air from remote parts of the world. But the downside was that examinations needed to take place at fixed times during the school or working day, and it became possible for candidates in, say, Singapore to contact their friends in, say, Mexico with advance knowledge of forthcoming questions. Opportunities for cheating were rife. To counter these challenges, the major examination boards and test publishers turned to the advantages offered by large item banks and computer adaptive testing, the psychometricians' own version of machine learning. However, it was the development of the app (an abbreviation of "application" used to describe a piece of software that can be run through a web browser or on a mobile phone) that was to prove the most disruptive to traditional ways of thinking about psychometric assessment.
One such app was David Stillwell's myPersonality, published on Facebook in 2007 (Stillwell, 2007; Kosinski, Stillwell, & Graepel, 2013; Youyou, Kosinski, & Stillwell, 2015). It offered its users a chance to take a personality test, receive feedback on their scores, and share those scores, if they were so inclined, with their Facebook friends. It was similar to countless other quizzes widely shared on Facebook around that time, yet it employed an established and well-validated personality test taken from the International Personality Item Pool (IPIP), an open-source repository established in the 1990s for academic use as a reaction to test publishers' domination of the testing world. The huge popularity of myPersonality was unforeseen. Within a few years, the app had collected over six million personality profiles, generated by enthusiasts who were interested to see the sort of results and feedback about themselves that had previously only been available to psychology professionals. It was one of psychometrics' first encounters with the big-data revolution.
But the availability of psychometric data on such a grand scale was to have unexpected consequences. Many saw opportunities for emulating the procedure in online advertising, destined to become the major source of revenue for the digital industry. Once the World Wide Web existed, it could be searched or trawled by search engines, the most ubiquitous of which is Google. In the mid-1990s, search engines simply provided information. By 2010 they did so with a scope and accuracy that exceeded all previous expectations; information on anything or anyone was ripe for the picking. But those who wished to be found soon became active players on the scene; it was the advertising industry's new paradise. The battle to reach the top in search league tables (or, at the very least, the first results page) began in earnest. Once online advertising entered the fray, it became a new war zone. The battle for the keywords had begun. Marketing was no longer about putting up a board on the high street; it was about building a digital presence in cyberspace that would bring customers to you in droves. By the early 2000s, no company or organization could afford not to have a presence in cyberspace. For a high proportion of customers, companies without some digital presence simply ceased to exist.
While web pages were the first universally available data source in cyberspace, social networks soon followed, and these opened a whole new world of individualized personal information about their users that was available for exploitation. Not only was standard demographic information such as age, marital status, gender, occupation, and education available, but there were also troves of new data such as the words being used in status updates and tweets, images, music preferences, and Facebook Likes. And these data sources soon became delicious morsels in a new informational feeding frenzy. They were mined extensively by tech companies and the marketing industry to hone their ability to target advertisements to the most relevant audiences or, to put it another way, to those who might be most vulnerable to persuasion. The prediction techniques used were the same as those that had been used by psychometricians for decades: principal component analysis, cluster analysis, machine learning, and regression analysis. These were able to predict a person's character and future behavior with far more accuracy than simple demographics. Cross-correlating demographics with traditional psychometric data, such as personality traits, showed that internet users were giving away much more information about their most intimate secrets than they realized. Thus, online psychographic targeting was born. This new methodology, creating clickbait and directing news feeds using psychological as well as demographic data, was soon considered to be far too powerful to exist in an unregulated world. But this will prove one day to have been just the midpoint in a journey that began many centuries ago.