What is a Large Language Model?

The one-sentence definition

A Large Language Model is a neural network — almost always a transformer — trained on a vast corpus of text to predict the next token given the tokens that came before it.

That single objective, applied at massive scale, is the foundation everything else is built on: chat, code generation, reasoning, agents, tool use.
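
Written out, the next-token objective is usually stated as minimising the negative log-likelihood of each token given the tokens before it. The notation is assumed here rather than taken from the text: \theta stands for the model's weights, x_t for the token at position t, and T for the sequence length.

```latex
\mathcal{L}(\theta) = -\sum_{t=1}^{T} \log p_\theta\left(x_t \mid x_1, \dots, x_{t-1}\right)
```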

Why this framing matters

It is easy to anthropomorphise a model that produces fluent English. A more useful mental model is:

Input: a sequence of tokens (sub-word pieces).
Output: a probability distribution over the next token.
Generation: sample one token, append it, repeat.

Everything an LLM appears to do — answer a question, write code, plan a trip — is the same operation: sample the next token under the conditioning of everything written so far.
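
The loop described above can be sketched in a few lines of Python. This is a toy illustration only: `TOY_MODEL`, `next_token_distribution`, and `generate` are hypothetical names, and the hand-written probability table stands in for the neural network a real LLM would use.

```python
import random

# Hypothetical toy "model": a hand-written table mapping the previous token
# to a probability distribution over the next token. A real LLM computes
# this distribution with a neural network conditioned on the whole prefix.
TOY_MODEL = {
    "<s>": {"the": 0.6, "a": 0.4},
    "the": {"cat": 0.5, "dog": 0.5},
    "a": {"cat": 0.5, "dog": 0.5},
    "cat": {"sat": 0.7, "</s>": 0.3},
    "dog": {"sat": 0.7, "</s>": 0.3},
    "sat": {"</s>": 1.0},
}


def next_token_distribution(tokens):
    """Return P(next token | tokens). This toy only looks at the last token."""
    return TOY_MODEL[tokens[-1]]


def generate(prompt, max_new_tokens=10, seed=None):
    """Sample one token, append it, repeat until the end marker or the limit."""
    rng = random.Random(seed)
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        dist = next_token_distribution(tokens)
        candidates = list(dist.keys())
        weights = list(dist.values())
        # Draw one token in proportion to its probability.
        token = rng.choices(candidates, weights=weights, k=1)[0]
        tokens.append(token)
        if token == "</s>":
            break
    return tokens


print(generate(["<s>"], seed=0))
```

Running `generate` with different seeds can produce different sequences from the same prompt, because each step draws from a distribution rather than looking up a fixed answer.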

Why "large"

Three things grew together to make LLMs work:

Data: trillions of tokens of web text, code, and books.
Compute: large GPU/TPU clusters running for weeks to months.
Parameters: tens to hundreds of billions of learned weights.

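The Chinchilla paper (Hoffmann et al., 2022) showed that, for a fixed compute budget, model size and training tokens should scale together in roughly equal proportion. A very large model trained on too few tokens wastes compute.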
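
As a rough worked example of that trade-off, the sketch below uses two rules of thumb that are assumptions here, not statements from the text: training cost of about 6 FLOPs per parameter per token, and a compute-optimal ratio of about 20 training tokens per parameter, as commonly quoted from the Chinchilla results. The function names are hypothetical, and the numbers are approximations for building intuition, not a sizing tool.

```python
# Rule-of-thumb constants (approximations, not exact laws).
TOKENS_PER_PARAM = 20       # roughly compute-optimal tokens per parameter
FLOPS_PER_PARAM_TOKEN = 6   # training FLOPs ~ 6 * parameters * tokens


def training_flops(n_params, n_tokens):
    """Approximate total training compute in FLOPs."""
    return FLOPS_PER_PARAM_TOKEN * n_params * n_tokens


def compute_optimal(flops_budget):
    """Solve 6 * N * (20 * N) = C for N, then set D = 20 * N."""
    n_params = (flops_budget / (FLOPS_PER_PARAM_TOKEN * TOKENS_PER_PARAM)) ** 0.5
    n_tokens = TOKENS_PER_PARAM * n_params
    return n_params, n_tokens


# Chinchilla itself: about 70 billion parameters and 1.4 trillion tokens.
budget = training_flops(70e9, 1.4e12)
n, d = compute_optimal(budget)
print(f"budget ~ {budget:.2e} FLOPs -> {n:.2e} params, {d:.2e} tokens")
```

Under these assumptions, quadrupling the compute budget doubles both the suggested parameter count and the suggested token count, which is the "scale together" behaviour in miniature.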

What this is not

An LLM is not a database, a search engine, or a deterministic program. It is a learned probability distribution. Treat its outputs as probabilistic — verify before you ship.
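
A small simulation shows what "probabilistic" means in practice. The distribution below is made up for illustration, and `sample_many` is a hypothetical helper. Even when one answer carries most of the probability, repeated sampling still returns the others some of the time.

```python
import random
from collections import Counter

# Hypothetical next-token distribution for some prompt; the numbers are invented.
dist = {"Paris": 0.90, "Lyon": 0.07, "Rome": 0.03}


def sample_many(dist, n, seed):
    """Draw n tokens from dist and count how often each one appears."""
    rng = random.Random(seed)
    tokens = list(dist)
    weights = list(dist.values())
    return Counter(rng.choices(tokens, weights=weights, k=n))


counts = sample_many(dist, n=1000, seed=0)
print(counts)
```

The likeliest token dominates the counts, but the less likely ones still appear, which is why a single output should be checked rather than trusted as a lookup result.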