Cascading Hallucinations

When one agent's hallucinated output is consumed as truth by the next agent in a workflow, quietly spreading bad data across systems and decisions before any human sees it.

What it is

A single hallucinated fact in a chat reply is usually visible: the user reads it and either catches it or does not. Cascading hallucinations are different. They happen when agents pass outputs to other agents, write outputs into databases that other agents query, or post outputs into channels where other agents listen. The first agent invents a fact (a wrong customer name, a fabricated SKU, a non-existent policy). The second agent reads it from the database or message thread, treats it as authoritative, and acts on it. By the time a human notices a problem, the false fact can already sit in three systems, with two weeks of downstream work built on top of it.

Why it matters

Multi-agent systems are the architecture trend of 2026: orchestrators delegating to specialists, agent-to-agent protocols, shared memory pools. Each handoff is also a point where one agent's hallucination can become another agent's premise. The hardest cases are those where the hallucination is plausible (wrong but in-distribution), because no validation step rejects it. Mitigations include grounding outputs in citations the next agent can verify, separating "draft" from "trusted" records in shared stores, tagging outputs with provenance, and regularly auditing where data originated.
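
As a rough illustration of the draft/trusted separation and provenance tagging mentioned above, here is a minimal Python sketch. Everything in it is hypothetical and invented for this example (the `SharedStore` class, the `Trust` levels, the `sku_exists` check, the in-memory `catalog`); it is not the API of any agent framework. It assumes each agent write carries the writer's name and an optional citation, and that a record only becomes readable as trusted after an independent check passes.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Dict, Optional


class Trust(Enum):
    DRAFT = "draft"
    TRUSTED = "trusted"


@dataclass
class Record:
    key: str
    value: str
    source_agent: str            # provenance tag: which agent wrote the value
    citation: Optional[str]      # pointer the next agent can verify, if any
    trust: Trust = Trust.DRAFT


class SharedStore:
    """Hypothetical shared memory in which every agent write starts as a draft."""

    def __init__(self) -> None:
        self._records: Dict[str, Record] = {}

    def write(self, key: str, value: str, source_agent: str,
              citation: Optional[str] = None) -> None:
        # Agents can never write directly into the trusted tier.
        self._records[key] = Record(key, value, source_agent, citation)

    def promote(self, key: str, verifier: Callable[[Record], bool]) -> bool:
        # Promotion needs a citation and a passing independent check.
        record = self._records[key]
        if record.citation is None or not verifier(record):
            return False
        record.trust = Trust.TRUSTED
        return True

    def read_trusted(self, key: str) -> Optional[Record]:
        # Downstream agents read through this method, so drafts stay invisible.
        record = self._records.get(key)
        if record is None or record.trust is not Trust.TRUSTED:
            return None
        return record


# Assumed system of record for the example: the set of SKUs that really exist.
catalog = {"SKU-1001"}


def sku_exists(record: Record) -> bool:
    return record.value in catalog


store = SharedStore()
store.write("order-7/sku", "SKU-9999", "intake_agent", citation="catalog")  # fabricated SKU
store.write("order-8/sku", "SKU-1001", "intake_agent", citation="catalog")  # real SKU
promoted_fake = store.promote("order-7/sku", sku_exists)
promoted_real = store.promote("order-8/sku", sku_exists)
```

In this sketch the fabricated SKU stays in the draft tier, so a downstream agent that only calls `read_trusted` never treats it as a premise. The check is only as good as the verifier: a plausible value that also happens to exist in the catalog would still pass.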

Key components

  • Agent-to-agent propagation: one specialist's output becomes another's input
  • Memory-mediated propagation: hallucinations written to shared memory keep propagating until someone corrects or purges them
  • Plausibility risk: wrong outputs that look right are the most dangerous, because nothing flags them
  • Loss of provenance: by the third hop, no one knows where the fact originated
  • Mitigation: provenance tags, grounding requirements, draft/trusted separation, regular audits of data origin
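
The loss-of-provenance problem above can be made concrete with a small lineage audit. The following Python sketch is an assumption-laden illustration, not a reference implementation: the `Fact` structure, the `grounded` flag, and the agent names are all hypothetical. It assumes every derived fact records the ids of the facts it was built on, so an auditor can walk back from any output to its origins and list the ones that were never checked against a system of record.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Fact:
    fact_id: str
    produced_by: str                    # agent or job that produced the fact
    derived_from: Tuple[str, ...] = ()  # ids of the facts this one was built on
    grounded: bool = False              # True only if checked against a system of record


def trace_origins(fact_id: str, facts: Dict[str, Fact]) -> List[Fact]:
    """Return the root facts (those with no parents) that fact_id depends on."""
    roots: List[Fact] = []
    stack = [fact_id]
    seen = set()
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        fact = facts[current]
        if fact.derived_from:
            stack.extend(fact.derived_from)
        else:
            roots.append(fact)
    return roots


def ungrounded_roots(fact_id: str, facts: Dict[str, Fact]) -> List[str]:
    """List origins that were never verified; these are candidates for a cascade."""
    return [f.fact_id for f in trace_origins(fact_id, facts) if not f.grounded]


# Hypothetical three-hop chain: intake -> pricing -> outreach.
facts = {
    "customer_name": Fact("customer_name", "intake_agent"),
    "price_list": Fact("price_list", "catalog_sync", grounded=True),
    "quote": Fact("quote", "pricing_agent",
                  derived_from=("customer_name", "price_list")),
    "email": Fact("email", "outreach_agent", derived_from=("quote",)),
}

suspect_origins = ungrounded_roots("email", facts)
```

Here the audit of the outgoing email points back to the unverified customer name written by the intake agent, two hops upstream. The sketch only works if lineage is recorded at write time; it cannot reconstruct provenance that was never captured.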