What it is
Guardrails are the safety constraints configured on Agentforce agents. They include topic restrictions (what the agent can and cannot discuss), action permissions (what operations it can perform), escalation rules (when to hand off to a human), and output filters (what it cannot say). Guardrails are enforced by the Atlas Reasoning Engine before any action is taken.
Why it matters
Without guardrails, AI agents are unpredictable. With them, agents operate within defined boundaries — which is what enterprise buyers, compliance teams, and regulators require. Guardrails are not limitations; they are what make AI deployable in production.
Key components
- Topic restrictions
- Action permissions
- Escalation rules
- Output filters
- PII masking
How it connects
You configure guardrails in Agent Builder as part of topic and action definitions. The Trust Layer adds additional guardrails at the platform level.
Good to know
Guardrails should be tested with adversarial prompts — try to make the agent break its own rules. If you can break it in testing, customers will break it in production.
Related terms
Agentforce
Salesforce's AI agent platform that enables businesses to build, customize, and deploy autonomous AI agents across sales, service, marketing, and commerce.
Human-in-the-Loop (HITL)
A design pattern where AI systems require human approval or intervention at critical decision points before taking action.
Agent Governance
The policies, controls, and monitoring systems that ensure AI agents operate safely, compliantly, and within business-approved boundaries.
Trust Layer
Salesforce's AI trust and safety framework that ensures agents operate within compliance boundaries, prevent misuse, and maintain data security.
