Reasoning Model

A class of large language model trained to spend internal "thinking" tokens, usually hidden from the user, before producing a user-facing answer. This often improves performance substantially on math, code, science, and complex multi-step problems compared with non-reasoning models of similar size.

What it is

A reasoning model is an LLM that is post-trained (usually with reinforcement learning) to emit a structured internal reasoning trace before its final answer. The user typically does not see the full trace, although some models expose it or a summary of it; what they see is a higher-quality response that took longer to produce. Examples include OpenAI's o1 and o3 family, Anthropic's extended-thinking modes (in Claude Opus 4.x and Sonnet 4.x), DeepSeek R1, Qwen QwQ, Google Gemini Deep Think, and xAI's reasoning variants. Reasoning models are usually slower per query and more expensive per answer than non-reasoning models of similar parameter count, because every thinking token has to be generated before the answer starts. In exchange, they deliver measurably better performance on benchmarks involving multi-step deduction (AIME math, GPQA science, SWE-bench code, ARC-AGI puzzles). They are the most prominent commercial application of the broader "test-time compute" principle: spending more computation at inference time, not only at training time, to get a better answer.
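To make the split between trace and answer concrete, here is a minimal sketch that separates the two in raw model text. It assumes the model wraps its reasoning in <think>...</think> tags, as the open-weights DeepSeek R1 releases do; hosted APIs often return the trace in a separate field or withhold it, so treat the tag format and the split_trace helper as illustrative rather than as any vendor's API.

```python
import re

# Assumption: reasoning is wrapped in <think>...</think> tags, as in the
# open-weights DeepSeek R1 releases. Other models use other formats.
THINK_BLOCK = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_trace(raw_output: str) -> tuple[str, str]:
    """Return (reasoning_trace, final_answer) from raw model text."""
    traces = [m.strip() for m in THINK_BLOCK.findall(raw_output)]
    # Whatever remains after removing the think blocks is the user-facing answer.
    answer = THINK_BLOCK.sub("", raw_output).strip()
    return "\n".join(traces), answer


raw = "<think>17 * 3 = 51, then 51 + 4 = 55.</think>\nThe result is 55."
trace, answer = split_trace(raw)
print(answer)  # only the user-facing part is shown
```

In a product, the trace would typically be logged for debugging while only the answer is rendered to the user.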

Why it matters

Reasoning models overturned the assumption that "answer in one shot" was the only inference pattern. For complex work, such as debugging code across files, writing a multi-section research report, or planning a multi-step agent workflow, they produce qualitatively different output than fast chat models. For routine work, such as extracting fields from a document, classifying support tickets, or simple Q&A, they are slower and pricier without delivering a benefit. The operational decision in 2026 is no longer "which one model do we use" but "which workloads warrant a reasoning model and which don't." That decision belongs to the routing layer, and logging cost per task and quality per task gives that layer data to learn from rather than guess.
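The routing idea can be sketched in a few lines. This is a hypothetical illustration, not a description of any particular product: the workload labels, the tier names, the TaskLog class, and the 0.05 quality threshold are all assumptions chosen for the example. The router starts from a static default per workload and switches to logged evidence once both tiers have been measured.

```python
from collections import defaultdict
from dataclasses import dataclass, field

# Hypothetical tier names and workload labels, for illustration only.
REASONING = "reasoning"
FAST = "fast"

DEFAULT_TIER = {
    "debug_multi_file": REASONING,
    "research_report": REASONING,
    "agent_planning": REASONING,
    "field_extraction": FAST,
    "ticket_classification": FAST,
    "simple_qa": FAST,
}


@dataclass
class TaskLog:
    """Accumulates (cost, quality) observations per (workload, tier)."""
    records: dict = field(default_factory=lambda: defaultdict(list))

    def record(self, workload, tier, cost_usd, quality):
        self.records[(workload, tier)].append((cost_usd, quality))

    def mean(self, workload, tier):
        rows = self.records.get((workload, tier))
        if not rows:
            return None
        n = len(rows)
        return (sum(c for c, _ in rows) / n, sum(q for _, q in rows) / n)


def route(workload, log, min_quality_gain=0.05):
    """Use the static default until both tiers have data, then compare quality."""
    fast = log.mean(workload, FAST)
    slow = log.mean(workload, REASONING)
    if fast is None or slow is None:
        # Unknown workloads fall back to the cheaper tier.
        return DEFAULT_TIER.get(workload, FAST)
    return REASONING if slow[1] - fast[1] >= min_quality_gain else FAST


log = TaskLog()
log.record("agent_planning", FAST, 0.002, 0.55)
log.record("agent_planning", REASONING, 0.040, 0.80)
log.record("simple_qa", FAST, 0.001, 0.92)
log.record("simple_qa", REASONING, 0.030, 0.93)
print(route("agent_planning", log), route("simple_qa", log))
```

A real router would also weigh the logged cost against the quality gain; this sketch compares quality only, to keep the learn-from-logs loop visible.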

Key components

  • Internal reasoning traces — structured thinking tokens emitted before the user-facing answer
  • RL post-training — the technique that produces reasoning behavior from a base model
  • Benchmark uplift — disproportionate gains on math, code, science, multi-step deduction
  • Cost and latency tradeoff — slower and pricier per answer, but higher quality on the right workloads
  • Routing implications — workload classification becomes a first-class operational concern
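The cost side of the tradeoff above can be made concrete with simple arithmetic. The sketch below assumes thinking tokens are billed at the output-token rate, which is a common pricing arrangement but should be checked against your provider's price sheet. The prices and token counts are hypothetical, and the same price is used for both calls to isolate the effect of the thinking tokens.

```python
def cost_per_answer(input_tokens, thinking_tokens, answer_tokens,
                    usd_per_m_input, usd_per_m_output):
    """Estimate the USD cost of one response.

    Assumption: thinking tokens are billed at the output-token rate.
    """
    output_tokens = thinking_tokens + answer_tokens
    return (input_tokens * usd_per_m_input
            + output_tokens * usd_per_m_output) / 1_000_000


# Hypothetical prices (USD per million tokens) and token counts.
fast = cost_per_answer(2_000, 0, 500, usd_per_m_input=1.0, usd_per_m_output=4.0)
slow = cost_per_answer(2_000, 8_000, 500, usd_per_m_input=1.0, usd_per_m_output=4.0)
print(f"fast: ${fast:.3f}, reasoning: ${slow:.3f}, ratio: {slow / fast:.1f}x")
```

With these made-up numbers the visible answer is the same length in both cases, yet the reasoning call costs several times more because of the 8,000 thinking tokens.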

How it connects

In Agentforce, reasoning models are the right choice when an AI agent needs to plan across multiple steps — such as researching an account, drafting a proposal, and scheduling follow-ups in a single run — rather than answering a single, bounded question. Salesforce's Model Selector and Bring Your Own Model capabilities let you point specific agent actions at a reasoning model while keeping faster, cheaper models in place for high-volume routine tasks.

Good to know

Reasoning models can take noticeably longer to respond (sometimes 30 seconds or more for complex problems), which matters if you're surfacing results inside a live Salesforce page or a customer-facing flow where wait time affects the experience. Design your agent workflows with that latency in mind, or run reasoning models in background jobs rather than in real-time interactions.
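One way to handle that latency is to give the inline call a time budget and hand the request to a background job when the budget runs out. The following is a minimal sketch using Python's asyncio; call_model and enqueue_background are hypothetical stand-ins supplied by the caller, not part of any Salesforce or model-provider API.

```python
import asyncio


async def answer_within_budget(call_model, prompt, budget_s, enqueue_background):
    """Try to answer inline; on timeout, hand the prompt to a background job."""
    try:
        answer = await asyncio.wait_for(call_model(prompt), timeout=budget_s)
        return {"status": "done", "answer": answer}
    except asyncio.TimeoutError:
        # wait_for cancels the in-flight call; the background job reruns it.
        return {"status": "queued", "job_id": enqueue_background(prompt)}


async def slow_model(prompt):
    """Stand-in for a reasoning model call that takes a while."""
    await asyncio.sleep(0.2)
    return "plan"


queue = []


def enqueue(prompt):
    """Stand-in for a job queue; returns a simple job id."""
    queue.append(prompt)
    return len(queue)


result = asyncio.run(answer_within_budget(slow_model, "Plan the rollout", 0.05, enqueue))
print(result["status"])
```

The caller can show a "working on it" state for queued requests and deliver the answer later, which keeps a live page responsive even when the model needs much longer than the budget.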
