Large Language Models Explained
Large language models are deep learning models trained on vast amounts of text.
When presented with a text prompt, the model generates an appropriate, human-like response.
The most popular applications of LLMs are AI chatbots.

Examples of large language models include GPT-4o, which powers the popular ChatGPT, and PaLM 2, which originally powered Google's Bard chatbot (since rebranded as Gemini).
How Do LLMs Work?
LLMs are built on the transformer architecture. Transformer models are made up of layers that can be stacked to build deeper, more capable networks.
LLMs, in particular, rely on two key features of transformer models: positional encoding and self-attention.
Positional encoding tags each token with its place in the sequence, so the model can process the text in parallel rather than word by word while still picking up order-dependent patterns.
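To make this concrete, here is a minimal NumPy sketch of the sinusoidal positional encoding described in the original transformer paper; the sequence length and model dimension are arbitrary illustration values:

import numpy as np

def sinusoidal_positional_encoding(seq_len, d_model):
    """Return a (seq_len, d_model) matrix of position encodings."""
    positions = np.arange(seq_len)[:, np.newaxis]   # (seq_len, 1)
    dims = np.arange(d_model)[np.newaxis, :]        # (1, d_model)
    # Each pair of dimensions oscillates at a different wavelength.
    angle_rates = 1.0 / np.power(10000.0, (2 * (dims // 2)) / d_model)
    angles = positions * angle_rates
    encoding = np.zeros((seq_len, d_model))
    encoding[:, 0::2] = np.sin(angles[:, 0::2])     # even dims: sine
    encoding[:, 1::2] = np.cos(angles[:, 1::2])     # odd dims: cosine
    return encoding

# Each token's embedding gets its position's encoding added to it,
# so all tokens can be processed in parallel without losing word order.
pe = sinusoidal_positional_encoding(seq_len=8, d_model=16)
print(pe.shape)  # (8, 16)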
Self-attention assigns each input a weight that determines its importance relative to the rest of the data.
That way, the model can pick out the most important parts of large amounts of text.
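Here is a minimal NumPy sketch of scaled dot-product self-attention for a single head; the random embeddings and projection matrices below stand in for weights a real model would learn during training:

import numpy as np

def self_attention(x, w_q, w_k, w_v):
    """Scaled dot-product self-attention for one head.

    x: (seq_len, d_model) token embeddings.
    w_q, w_k, w_v: (d_model, d_head) projection matrices.
    """
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    d_head = q.shape[-1]
    # Each token scores every other token; a higher score means
    # that token is more relevant to the one being processed.
    scores = q @ k.T / np.sqrt(d_head)              # (seq_len, seq_len)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over each row
    # Output: each token becomes a relevance-weighted blend of all values.
    return weights @ v

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 16))                        # 5 tokens, d_model=16
w = [rng.normal(size=(16, 8)) * 0.1 for _ in range(3)]
out = self_attention(x, *w)
print(out.shape)  # (5, 8)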
Grammatical rules are not preprogrammed into a large language model; the algorithm infers grammar as it reads text.
Likewise, LLMs don't need to be trained for any one specific skill, which gives them a lot of flexibility in handling the nuances of human language.
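As a toy illustration of inferring patterns from raw text alone, the sketch below builds a bigram model that simply counts which word follows which, with no grammar rules programmed in; real LLMs learn next-word prediction in a far richer way, but the principle is similar:

from collections import Counter, defaultdict

# Count which word follows which in a tiny corpus. No rules are
# hand-coded: the structure emerges entirely from the text itself.
corpus = "the cat sat on the mat the cat ate the fish".split()
follows = defaultdict(Counter)
for current_word, next_word in zip(corpus, corpus[1:]):
    follows[current_word][next_word] += 1

# The counts after "the" reflect patterns found in the text.
print(follows["the"].most_common())  # [('cat', 2), ('mat', 1), ('fish', 1)]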
On the other hand, LLMs require vast amounts of training data before they become useful.
Training an LLM also demands substantial time and computational resources, which translates into steep energy costs.
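For a sense of scale, here is a rough back-of-envelope estimate using the commonly cited approximation of about 6 floating-point operations per parameter per training token; the model size, token count, and GPU throughput below are illustrative assumptions, not figures for any particular model:

# Rough training-cost estimate; every number here is an assumption.
params = 70e9                 # assume a 70-billion-parameter model
tokens = 2e12                 # assume 2 trillion training tokens
flops = 6 * params * tokens   # ~6 FLOPs per parameter per token
gpu_flops_per_sec = 300e12    # assume ~300 TFLOP/s sustained per GPU
gpu_seconds = flops / gpu_flops_per_sec
gpu_years = gpu_seconds / (3600 * 24 * 365)
print(f"{flops:.1e} FLOPs, roughly {gpu_years:,.0f} GPU-years of compute")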
And even though the training process is largely self-supervised, human expertise is still needed to develop and maintain LLMs.