Section 1: Understanding GPT-3 and the OpenAI API
The objective of this section is to provide a high-level introduction to GPT-3 and the OpenAI API and to show how easy it is to get started. The goal is to engage you with fun examples that are quick and simple to implement.
This section comprises the following chapters:
- Chapter 1, Introducing GPT-3 and the OpenAI API
- Chapter 2, GPT-3 Applications and Use Cases
Chapter 1: Introducing GPT-3 and the OpenAI API
The buzz about Generative Pre-trained Transformer Version 3 (GPT-3) started with a blog post from a leading Artificial Intelligence (AI) research lab, OpenAI, on June 11, 2020. The post began as follows:
We're releasing an API for accessing new AI models developed by OpenAI. Unlike most AI systems which are designed for one use-case, the API today provides a general-purpose "text in, text out" interface, allowing users to try it on virtually any English language task.
Online demos from early beta testers soon followed, and some seemed too good to be true. GPT-3 was writing articles, penning poetry, answering questions, chatting with lifelike responses, translating text from one language to another, summarizing complex documents, and even writing code. The demos were incredibly impressive: things we hadn't seen a general-purpose AI system do before. But equally impressive was that many of the demos were created by people with limited or no formal background in AI and Machine Learning (ML). GPT-3 had raised the bar, not just in terms of the technology, but also in terms of AI accessibility.
GPT-3 is a general-purpose language processing AI model that practically anybody can understand and start using in a matter of minutes. You don't need a Doctor of Philosophy (PhD) in computer science; you don't even need to know how to write code. In fact, everything you'll need to get started is right here in this book. We'll begin in this chapter with the following topics:
- Introduction to GPT-3
- Democratizing NLP
- Understanding prompts, completions, and tokens
- Introducing Davinci, Babbage, Curie, and Ada
- Understanding GPT-3 risks
Technical requirements
This chapter requires you to have access to the OpenAI Application Programming Interface (API). You can register for API access by visiting https://openai.com/.
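Once you have API access, your first call can be made with nothing more than Python's standard library. The sketch below is a minimal illustration, assuming the engines-based completions endpoint and the davinci engine covered later in this chapter; the exact endpoint names and parameters may change, so verify them against the current API documentation.

```python
# Minimal sketch of a request to the OpenAI Completions API.
# The endpoint path and "davinci" engine name are assumptions based on this
# chapter; check the official API reference for the current details.
import json
import os
import urllib.request

def build_completion_request(prompt, engine="davinci", max_tokens=32):
    """Build the URL, headers, and JSON body for a completion call."""
    url = f"https://api.openai.com/v1/engines/{engine}/completions"
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {os.environ.get('OPENAI_API_KEY', '')}",
    }
    body = json.dumps({"prompt": prompt, "max_tokens": max_tokens})
    return url, headers, body

url, headers, body = build_completion_request("Once upon a time")
print(url)  # the endpoint the request would be sent to

# Actually sending the request requires a valid API key in OPENAI_API_KEY:
# req = urllib.request.Request(url, data=body.encode(), headers=headers)
# print(json.load(urllib.request.urlopen(req))["choices"][0]["text"])
```

Setting the `OPENAI_API_KEY` environment variable (rather than hardcoding the key) keeps your credentials out of your source code.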
Introduction to GPT-3
In short, GPT-3 is a language model: a statistical model that calculates the probability distribution over a sequence of words. In other words, GPT-3 is a system for guessing which text comes next when text is given as an input.
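The statistical idea behind a language model can be illustrated with a toy example. The sketch below estimates next-word probabilities by counting word pairs (a bigram model) in a tiny made-up corpus; it demonstrates only the concept of a probability distribution over next words, not how GPT-3 works internally (GPT-3 is a neural network with billions of parameters).

```python
# Toy bigram language model: estimate P(next word | current word) by counting
# word pairs in a corpus. Illustrative only; GPT-3 is a large neural network.
from collections import Counter, defaultdict

corpus = "the cat sat on the mat . the dog sat on the rug .".split()

# Count how often each word follows each other word.
counts = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    counts[current][nxt] += 1

def next_word_distribution(word):
    """Return {next_word: probability} estimated from the corpus."""
    total = sum(counts[word].values())
    return {w: c / total for w, c in counts[word].items()}

print(next_word_distribution("the"))
# → {'cat': 0.25, 'mat': 0.25, 'dog': 0.25, 'rug': 0.25}
print(next_word_distribution("sat"))
# → {'on': 1.0}
```

Given "the", the model guesses each of the four observed followers with equal probability; given "sat", it is certain the next word is "on". GPT-3 does the same kind of guessing, but over a vastly larger vocabulary and with far richer context than a single preceding word.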
Now, before we delve further into what GPT-3 is, let's cover a brief introduction (or refresher) on Natural Language Processing (NLP).
Simplifying NLP
NLP is a branch of AI that focuses on the use of natural human language for various computing applications. NLP is a broad category that encompasses many different types of language processing tasks, including sentiment analysis, speech recognition, machine translation, text generation, and text summarization, to name but a few.
In NLP, language models are used to calculate the probability distribution over a sequence of words. Language models are essential because of the extremely complex and nuanced nature of human languages. For example, pay in full and painful or tee time and teatime sound alike but have very different meanings. A phrase such as she's on fire could be literal or figurative, and words such as big and large can be used interchangeably in some cases but not in others. For example, using the word big to refer to an older sibling wouldn't have the same meaning as using the word large. Thus, language models are used to deal with this complexity, but that's easier said than done.
While understanding things such as word meanings and their appropriate usage seems trivial to humans, NLP tasks can be challenging for machines. This is especially true for more complex language processing tasks such as recognizing irony or sarcasm, tasks that even challenge humans at times.
Today, the best technical approach to a given NLP task depends on the task. So, most of the best-performing, state-of-the-art (SOTA) NLP systems are specialized systems that have been fine-tuned for a single purpose or a narrow range of tasks. Ideally, however, a single system could successfully handle any NLP task. That's the goal of GPT-3: to provide a general-purpose AI system for NLP. So, even though the best-performing NLP systems today tend to be specialized, purpose-built systems, GPT-3 achieves SOTA performance on a number of common NLP tasks, showing the potential for a future general-purpose NLP system that could provide SOTA performance for any NLP task.
What exactly is GPT-3?
Although GPT-3 is a general-purpose NLP system, it really just does one thing: it predicts what comes next based on the text that is provided as input. But it turns out that, with the right architecture and enough data, this one thing can handle a stunning array of language processing tasks.
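This "text in, text out" idea means different tasks are just differently worded inputs to the same interface. The sketch below uses a stand-in function in place of a real API call, and the prompt wordings are illustrative assumptions, not examples from this book.

```python
# Different NLP tasks expressed as plain text inputs to one interface.
# complete() is a stand-in for a real GPT-3 call; the prompts are illustrative.
def complete(text):
    """Stand-in for a completion call: just echoes what it would continue."""
    return f"<model continues: {text!r}>"

prompts = {
    "translation": "Translate English to French:\ncheese =>",
    "question answering": "Q: Who wrote Hamlet?\nA:",
    "summarization": "Summarize the following article in one sentence:\n...",
}

for task, prompt in prompts.items():
    print(task, "->", complete(prompt))
```

Notice that nothing about the interface changes between tasks: the task itself is described in the input text, and the model's job is always the same, to predict what text comes next.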
GPT-3 is the third version of the GPT language model from OpenAI. So, although it started to become popular in the summer of 2020, the first version of GPT was announced two years earlier, and the following version, GPT-2, was announced in February 2019. But even though GPT-3 is the third version, the general system design and architecture haven't changed much from GPT-2. There is one big difference, however, and that's the size of the dataset that was used for training.
GPT-3 was trained with a massive dataset of text from the internet, books, and other sources, containing roughly 57 billion words. The resulting model has 175 billion parameters, roughly 10 times more than the next-largest language model before it. To put the training data into perspective, the average human might read, write, speak, and hear upward of a billion words in an entire lifetime. So, GPT-3 has been trained on an estimated 57 times the number of words most humans will ever process.
The GPT-3 language model is massive, so it isn't something you'll be downloading and dabbling with on your laptop. But even if you could (which you can't, because it's not available to download), it would cost millions of dollars in computing resources each time you wanted to build the model. This would put GPT-3 out of reach for most small companies and virtually all individuals if you had to rely on your own computing resources to use it. Thankfully, OpenAI makes GPT-3 available through the API, so you don't need to build or run the model yourself.