Large Language Model (LLM)
TL;DR
A type of AI trained on vast amounts of text data, capable of generating, summarising, translating, and reasoning about language.
A Large Language Model (LLM) is a type of artificial intelligence model trained on billions or trillions of text tokens, enabling it to understand and generate human language with remarkable fluency. LLMs use transformer architecture — a neural network design that processes relationships between words in context rather than sequentially.
Popular LLMs include GPT-4 (OpenAI), Claude (Anthropic), Gemini (Google), LLaMA (Meta), and Mistral. These models are the foundation of generative AI applications: chatbots, AI writing assistants, code generators, and document summarisers.
For businesses, LLMs enable a new generation of applications: customer service agents that handle nuanced queries, document analysis tools that extract key information from contracts, and content systems that generate first drafts for human editorial review. Integrating LLMs into products typically involves the OpenAI or Anthropic API, prompt engineering, and context management — areas where AI development agencies specialise.
Examples in Practice
GPT-4 (ChatGPT), Claude 3 Opus, Gemini 1.5 Pro, LLaMA 3, Mistral 7B.