In full:
Chat Generative Pre-trained Transformer

ChatGPT, software that allows a user to ask it questions using conversational, or natural, language. It was released on November 30, 2022, by the American company OpenAI and almost immediately disturbed academics, journalists, and others because of concern that it was impossible to distinguish human- from ChatGPT-generated writing.

Language models produce text by estimating the probability that a word will occur given the words that precede it in a sequence. Trained on about 45 terabytes of text from the Internet, the GPT-3 language model used by ChatGPT learned that some sequences of words are more likely to occur than others. For example, “the cat sat on the mat” is more likely to occur in English than “sat the the mat cat on” and thus would be more likely to appear in a ChatGPT response.
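The idea can be illustrated with a toy bigram model, which scores a sentence as a product of word-pair probabilities. This is a sketch only: the tiny corpus below is invented for illustration, and real models use neural networks trained on vast corpora of subword tokens.

```python
from collections import Counter

# Tiny invented corpus; a real model is trained on terabytes of text.
corpus = "the cat sat on the mat . the dog sat on the rug .".split()

bigrams = Counter(zip(corpus, corpus[1:]))   # counts of adjacent word pairs
unigrams = Counter(corpus)                   # counts of single words

def sequence_prob(sentence):
    """Score a sentence as the product of P(word | previous word)."""
    words = sentence.split()
    p = 1.0
    for prev, cur in zip(words, words[1:]):
        p *= bigrams[(prev, cur)] / unigrams[prev]
    return p

likely = sequence_prob("the cat sat on the mat")
scrambled = sequence_prob("sat the the mat cat on")
```

Here `likely` comes out greater than `scrambled`, which scores zero because pairs such as “sat the” never occur in the corpus, mirroring how a language model favors word orders it has seen.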

ChatGPT refers to itself as “a language model developed by OpenAI, a leading artificial intelligence research lab.” The model is based on the “GPT (Generative Pre-training Transformer) architecture, which is a type of neural network designed for natural language processing tasks.” ChatGPT says its primary purpose “is to generate human-like text, which can be used for a variety of applications, such as chatbots, automated content creation, and language translation.”

It continues by saying “The model can understand and respond to user input in a way that mimics human conversation, allowing for more natural and engaging interactions. Additionally, ChatGPT can generate text in a variety of styles and formats, such as news articles, emails, and poetry, making it versatile and useful for a wide range of applications.” For example, when asked to produce a haiku about Encyclopædia Britannica, ChatGPT generated:

Encyclopedia old
Endless knowledge to behold
Wisdom in its pages.

(However, the first line of this haiku has seven syllables instead of the requisite five.)

ChatGPT impressed many with its command of written English and as a demonstration of how far artificial intelligence (AI) had advanced. Within five days of its introduction, more than one million users had signed up for a free account to interact with ChatGPT. The software showed that it could pass exams in advanced courses. For example, Wharton Business School professor Christian Terwiesch found that ChatGPT passed the final exam in his course in operations management; however, on some questions it made “surprising mistakes in relatively simple calculations at the level of 6th grade math.” Educators became concerned that students would cheat by having ChatGPT write their essays, with some even proposing that essays no longer be assigned as homework. The American media company BuzzFeed announced that it would use OpenAI tools, such as ChatGPT, to produce content, such as quizzes, that would be personalized for readers.

In 1950 British mathematician Alan Turing proposed a test for assessing whether a computer can be described as thinking. A human questioner interrogates both a human subject and a computer. By means of a series of such tests, a computer’s success at “thinking” can be measured by its probability of being misidentified as the human subject. BuzzFeed data scientist Max Woolf said that ChatGPT had passed the Turing test in December 2022, but some experts claim that ChatGPT did not pass a true Turing test because in ordinary usage ChatGPT often states that it is a language model.

Although ChatGPT had many strong traits, it also had some surprising weaknesses. The model can add two-digit numbers (e.g., 23 + 56) with complete accuracy, but for multiplying two-digit numbers (e.g., 23 × 56), it produces the right answer only about 30 percent of the time.

Like other large language models, ChatGPT can sometimes “hallucinate,” a term used to describe the tendency for such models to respond with inaccurate or misleading information. For example, ChatGPT was asked to tell the Greek myth of Hercules and the ants. There is no such Greek myth; nevertheless, ChatGPT told a story of Hercules learning to share his resources with a colony of talking ants when marooned on a desert island. When asked if there really was such a Greek myth, ChatGPT apologized and replied that there was no such myth but that it had created a fable based on its understanding of Greek mythology. When further asked why it made up such a myth instead of simply saying that there was no such myth, it apologized again and said that “as a language model, my main function is to respond to prompts by generating text based on patterns and associations in the data I’ve been trained on.” ChatGPT tends not to say that it does not know an answer to a question but instead produces probable text based on the prompts given to it.

ChatGPT is, at least, forthright about its limitations. When asked if it is a reliable source of information, it replies that “it is not recommended to rely on ChatGPT as a sole source of factual information. Instead, it should be used as a tool to generate text or complete language-based tasks, and any information provided by the model should be verified with credible sources.” Even answers to questions about computer programming languages, an unlikely source of hallucinations, have proved inaccurate so often that the popular programming question-and-answer site Stack Overflow temporarily banned answers from ChatGPT.

Erik Gregersen

large language model (LLM), a deep-learning algorithm that uses massive numbers of parameters and vast amounts of training data to understand and predict text. This type of generative artificial intelligence (AI) model can perform a variety of natural language processing tasks beyond simple text generation, including revising and translating content.

Underlying mechanisms

The word large refers to the parameters, or variables and weights, used by the model to influence the prediction outcome. Although there is no fixed threshold for how many parameters make a model “large,” well-known LLMs range from 110 million parameters (Google’s BERT-base model) to 340 billion parameters (Google’s PaLM 2 model). Large also refers to the sheer amount of data used to train an LLM, which can be multiple petabytes in size and contain trillions of tokens, the basic units of text or code, usually a few characters long, that are processed by the model.

LLMs aim to produce the most probable sequence of words for a given prompt. Smaller language models, such as the predictive text feature in text-messaging applications, may fill in the blank in the sentence “The sick man called for an ambulance to take him to the _____” with the word hospital. LLMs function in the same way but on a much larger, more nuanced scale. Instead of predicting a single word, an LLM can predict more-complex content, such as the most likely multi-paragraph response or translation.
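That fill-in-the-blank behavior can be sketched with a tiny next-word frequency table. This is a simplification: the three training sentences and the word-level counting below are invented for illustration, whereas real LLMs learn from vast corpora of subword tokens.

```python
from collections import Counter, defaultdict

# Invented mini training set for illustration only.
training = [
    "take him to the hospital",
    "take him to the clinic",
    "take her to the hospital",
]

# Build a table of next-word counts for each preceding word.
next_counts = defaultdict(Counter)
for sentence in training:
    words = sentence.split()
    for prev, cur in zip(words, words[1:]):
        next_counts[prev][cur] += 1

def fill_blank(prev_word):
    """Predict the most frequent word following prev_word."""
    return next_counts[prev_word].most_common(1)[0][0]

prediction = fill_blank("the")
```

Because “hospital” follows “the” twice in the training data and “clinic” only once, `fill_blank("the")` returns `"hospital"`, the same kind of majority-probability choice an LLM makes at vastly greater scale.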

An LLM is initially trained on textual content. The training process may involve unsupervised learning (the initial process of forming connections between unlabeled and unstructured data) as well as supervised learning (fine-tuning the model to allow for more targeted analysis). LLMs are deep-learning models built on neural network architectures known as transformers, which rapidly transform one type of input into a different type of output. Transformers take advantage of a concept called self-attention, which allows LLMs to analyze the relationships between words in an input and assign them weights to determine relative importance. When a prompt is entered, those weights are used to predict the most likely textual output.
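Scaled dot-product self-attention can be sketched in plain Python. This is a minimal, hypothetical version: it omits the learned query, key, and value projection matrices and the multiple attention heads of a real transformer, and the two-dimensional word embeddings are made-up values.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(embeddings):
    """Single-head self-attention with identity query/key/value
    projections: each position scores every position by scaled
    dot-product similarity, then outputs a weighted mix of all
    positions' vectors."""
    d = len(embeddings[0])
    output = []
    for query in embeddings:
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
                  for key in embeddings]
        weights = softmax(scores)          # relative importance of each word
        mixed = [sum(w * value[i] for w, value in zip(weights, embeddings))
                 for i in range(d)]
        output.append(mixed)
    return output

# Toy 2-d "embeddings" for a three-word input (made-up values).
contextual = self_attention([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
```

Each output vector is a weighted blend of every input vector, so each word’s representation now carries context from the whole sequence, which is the property that makes transformers effective for language.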

History

The first language models, such as the Massachusetts Institute of Technology’s Eliza program from 1966, used a predetermined set of rules and heuristics to rephrase users’ words into a question based on certain keywords. Such rule-based models were followed by statistical models, which used probabilities to predict the most likely words. Neural networks built upon earlier models by “learning” as they processed information, using a node model with artificial neurons. Nodes were activated based on other nodes’ output.
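The rule-based approach can be sketched in a few lines of Python; the keyword rules below are invented for illustration and are not Eliza’s actual script.

```python
import re

# Illustrative keyword rules in the spirit of Eliza (not its actual script).
RULES = [
    (re.compile(r"\bI am (.+)", re.IGNORECASE), "Why do you say you are {0}?"),
    (re.compile(r"\bI feel (.+)", re.IGNORECASE), "How long have you felt {0}?"),
    (re.compile(r"\bmy (\w+)", re.IGNORECASE), "Tell me more about your {0}."),
]

def respond(utterance):
    """Rephrase the user's words into a question using the first matching rule."""
    for pattern, template in RULES:
        match = pattern.search(utterance)
        if match:
            return template.format(*match.groups())
    return "Please go on."  # fallback when no keyword matches
```

For example, `respond("I am tired")` returns “Why do you say you are tired?” No probability or learning is involved: the program only pattern-matches keywords, which is exactly the limitation later statistical and neural models addressed.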

The first large language models emerged as a consequence of the introduction of transformer models in 2017. The new speeds provided by transformers allowed for even more parameters and data to be incorporated into models, paving the way for the introduction of the first LLMs, which included Google’s BERT (Bidirectional Encoder Representations from Transformers) and OpenAI’s GPT (Generative Pre-trained Transformer), the following year.

LLMs perform tasks more efficiently than smaller models and have even acquired entirely new capabilities. These “emergent abilities” include performing numerical computations, translating languages, and unscrambling words. LLMs have become popular for their wide variety of uses, such as summarizing passages, rewriting content, and functioning as chatbots. Some LLMs can even generate captions for input images.

Computer programmers can use LLMs to generate code in response to specific prompts, and if the generated code raises further questions, a programmer can ask the LLM to explain its reasoning. In much the same way, LLMs are useful for generating nontechnical content as well. LLMs may help to improve productivity on both individual and organizational levels, and their ability to generate large amounts of information is part of their appeal.

Complications and concerns

LLMs have a number of drawbacks. The models are incredibly resource intensive, sometimes requiring up to hundreds of gigabytes of RAM. Moreover, their inner mechanisms are highly complex, leading to troubleshooting issues when results go awry. Occasionally, LLMs will present false or misleading information as fact, a common phenomenon known as a hallucination. A method to combat this issue is known as prompt engineering, whereby engineers design prompts that aim to extract the optimal output from the model.

Numerous ethical and social risks still exist even with a fully functioning LLM. A growing number of artists and creators have claimed that their work is being used to train LLMs without their consent. This has led to multiple lawsuits, as well as questions about the implications of using AI to create art and other creative works. Models may perpetuate stereotypes and biases that are present in the information they are trained on. This discrimination may exist in the form of biased language or exclusion of content about people whose identities fall outside social norms. Other issues outlined by experts include information hazards, wherein LLMs may disclose private information present in training data; malicious use, wherein bad actors use the models to bolster disinformation campaigns or commit fraud; and economic harms, in which LLMs may displace workers and widen inequality gaps between those with access to the technology and those without such access.

Michael McDonough