/FIELD NOTE

The OWASP LLM Top 10 (2025), Explained With Real Mitigations

9 January 2026 // 12 min read // Basalt Cyber Defense Division

The OWASP Top 10 for Large Language Model Applications has become the default starting point for teams shipping generative AI into production. The 2025 revision reflects what the industry actually saw break over the previous year: agentic systems with too much authority, retrieval pipelines poisoned through their own data sources, and chat interfaces that leaked the very prompts meant to constrain them. At Basalt Cyber we use this list as the backbone of our AI assurance engagements, but a list of risks is only useful if each one maps to a control you can implement and test. This article walks through all ten and pairs each with a mitigation you can put in front of an engineer this week.

Why the LLM Top 10 needs its own list

Traditional application security assumes a clear separation between code and data. Large language models collapse that boundary. Any text the model reads, whether it is a user message, a retrieved document, a tool output, or a system instruction, can influence behaviour. That single property is responsible for most of the entries on this list. If you internalise nothing else, internalise that an LLM treats all text in its context window as potentially instructive, and design from there.

LLM01: Prompt Injection

Prompt injection is the act of supplying input that overrides the developer's intended instructions. Direct injection comes from the user; indirect injection arrives through content the model retrieves or is fed automatically. There is no single filter that solves it, because the attack lives in the same channel as legitimate input.

Control: Enforce trust boundaries and least privilege rather than relying on input sanitisation alone. Tag every piece of context with its provenance, keep untrusted content out of the instruction position, and require explicit human confirmation before any high-impact action the model proposes. Treat the model's output as a suggestion, never as an authenticated command.

LLM02: Sensitive Information Disclosure

Models can surface training data, secrets pasted into prompts, or another user's data held in shared context. This is amplified when applications stuff API keys, internal URLs, or customer records directly into the prompt.

Control: Minimise what enters the context window. Redact PII before it reaches the model, never place live credentials in prompts, and apply output filtering for patterns like keys and account numbers. Where a system handles regulated data, scope sessions so one tenant's information can never appear in another tenant's context.

LLM03: Supply Chain

The LLM supply chain includes base models, fine-tuning datasets, adapters, embedding models, and the libraries that glue them together. A poisoned adapter from a public hub or an abandoned dependency is as dangerous here as a compromised package anywhere else.

Control: Maintain a model and dataset bill of materials. Pin versions, verify checksums and signatures, and source models only from providers you can attest to. Scan third-party model artifacts before deployment and treat any community-uploaded weights as untrusted until reviewed.

LLM04: Data and Model Poisoning

Poisoning manipulates training, fine-tuning, or retrieval data to embed backdoors or bias. In retrieval augmented generation, an attacker who can write to your knowledge base can change answers for everyone.

Control: Govern who can write to training and retrieval corpora. Validate and sign datasets, isolate ingestion pipelines, and run anomaly detection over embeddings to catch outliers that may be poisoned entries. For RAG, apply the same write controls you would apply to a production database.

LLM05: Improper Output Handling

This is the classic mistake of trusting model output downstream. If the model emits HTML rendered without escaping, SQL passed to a database, or shell commands executed by a tool, you have handed an attacker a path through the model into your systems.

Control: Treat every model output as untrusted input to the next system. Encode for the destination context, use parameterised queries, and never pass raw model text to an interpreter, an eval, or a shell. Constrain outputs to structured schemas where possible.

LLM06: Excessive Agency

Agency problems appear when a model is given tools, permissions, or autonomy beyond what the task needs. A summarisation assistant does not need to delete records, yet plenty of agents are wired with broad scopes for convenience.

Control: Apply least privilege to tools, not just data. Each tool should have the narrowest possible scope, destructive actions should require confirmation, and the agent's blast radius should be bounded so a single hijacked turn cannot cascade. We cover this in depth in our work on AI red teaming.

LLM07: System Prompt Leakage

The 2025 list formalised what red teamers had long exploited: system prompts get extracted, and they often contain secrets, business logic, or filter rules that should never have lived there in the first place.

Control: Assume the system prompt is public. Never place credentials, connection strings, or security-critical logic in it. Enforce authorisation in code, not in instructions, so that even a fully leaked prompt reveals nothing exploitable.

LLM08: Vector and Embedding Weaknesses

RAG and fine-tuning rely on embeddings, and those introduce their own attack surface: embedding inversion that recovers source text, cross-tenant leakage in shared vector stores, and injection through retrieved chunks.

Control: Partition vector stores per tenant, apply access controls at query time, and validate retrieved content before it enters the prompt. Monitor for retrieval that returns documents a user should not be able to see.

LLM09: Misinformation

Models produce confident, fluent, wrong answers. In high-stakes domains, an unverified hallucination is a liability, not just a quality issue.

Control: Ground responses in retrieved, citable sources and surface those citations. Add verification steps for factual claims in regulated workflows, and design the interface so users understand the model can be wrong. Human review belongs on any output that drives a decision.

LLM10: Unbounded Consumption

Without limits, attackers can drive runaway cost through expensive queries, recursive agent loops, or denial-of-wallet attacks that exhaust your inference budget.

Control: Enforce rate limits, token caps, timeouts, and per-user quotas. Bound agent loop depth, monitor spend in real time, and alert on anomalous usage patterns before the invoice arrives.

Turning the list into a programme

The Top 10 is most valuable as a coverage map. Build a test case for each entry, run it before every release, and tie the controls to your secure development lifecycle so they cannot regress silently. If you are deploying LLM features and want an independent assessment, our team builds test suites directly against this list. Learn more about how we approach LLM security.

Key takeaways

  • The model treats all context as potentially instructive, so trust boundaries and provenance tagging are foundational.
  • Least privilege applies to tools and agency, not only to data access.
  • Assume system prompts leak and keep all secrets and authorisation logic in code.
  • Every model output is untrusted input to the next system and must be encoded, validated, or confirmed.
  • Bound consumption to defend against denial-of-wallet attacks.