Background Agents

How Viventium activates deeper thinking without blocking the main conversation.

What Background Agents Are

Background agents are the specialized thinking layer behind the main assistant.

They exist so Viventium can stay responsive in front while still doing slower, broader, or more deliberate work underneath when the task deserves it.

The Two-Phase Model

Background agents follow a two-phase lifecycle:

Phase 1: Activation detection

A fast, lightweight model (currently served through Groq for speed and cost) evaluates the conversation context against each registered background agent's activation criteria. Each agent has:

  • A confidence threshold (e.g., 0.6) — how confident the system needs to be that this agent would add value
  • A cooldown window (e.g., 45s) — minimum time between activations to prevent noise
  • A history window (e.g., 6 messages) — how much recent context to evaluate

If the threshold is met and the cooldown has passed, the agent activates.

Phase 2: Asynchronous execution and insight merge

The activated agent runs independently with its own model, tools, and instructions. When it finishes, its structured output merges back into the conversation as a follow-up.

The main reply never waits for this. You get speed first, depth second.
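The "speed first, depth second" flow can be sketched with `asyncio`: the fast reply returns immediately while the agent's insight is appended to the thread when it lands. All function and message names here are illustrative stand-ins, not the real system.

```python
import asyncio

_pending: set[asyncio.Task] = set()   # keep task references alive until they finish


async def run_background_agent(topic: str) -> str:
    """Stand-in for the activated agent's slower run (own model, tools, instructions)."""
    await asyncio.sleep(0.05)                      # simulate deliberate work
    return f"[follow-up] deeper look at: {topic}"


async def handle_turn(user_message: str, thread: list[str]) -> str:
    """Return the fast reply immediately; merge the insight back in as a follow-up."""
    async def merge_when_done() -> None:
        thread.append(await run_background_agent(user_message))

    task = asyncio.create_task(merge_when_done())
    _pending.add(task)
    task.add_done_callback(_pending.discard)
    return f"[fast reply] {user_message}"          # never waits on the background work
```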

The Main Rule

Background agents should improve the answer without degrading the experience.

That means:

  • the first reply should not block on them
  • activation should stay low-noise
  • the follow-up should only appear when there is something worth adding
  • the agent should keep the same capabilities it would have had if run directly

What They Can Help With

  • broader perspective on a decision
  • deeper checking and verification
  • research and synthesis
  • planning and structure
  • tool-backed retrieval from connected systems
  • live web or file work that should happen after the first answer

Context Parity Matters

Background agents should not act like they were dropped into a fresh chat with no memory. They receive:

  • Canonical time context — so they know when "today" is and can reason about deadlines and scheduling
  • Attached file context — documents, spreadsheets, or artifacts from the current conversation
  • Shared user memory — relevant durable facts and working context (when allowed by privacy settings)

Without context parity, background agents would give shallow, disconnected answers. With it, they can reason at the same depth as the main agent but from an independent perspective.
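The context bundle described above might be assembled like this. The structure and field names are assumptions for illustration; the privacy gate on shared memory is the key point.

```python
import datetime
from dataclasses import dataclass, field


@dataclass
class AgentContext:
    """Context bundle handed to a background agent (illustrative schema)."""
    now: datetime.datetime                            # canonical time context
    attached_files: list[str] = field(default_factory=list)
    shared_memory: dict[str, str] = field(default_factory=dict)


def build_context(files: list[str],
                  memory: dict[str, str],
                  memory_allowed: bool) -> AgentContext:
    """Give the agent the same context the main agent sees;
    shared memory is included only when privacy settings allow it."""
    return AgentContext(
        now=datetime.datetime.now(datetime.timezone.utc),
        attached_files=list(files),
        shared_memory=dict(memory) if memory_allowed else {},
    )
```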

Hold Behavior When Tools Matter

Sometimes a tool-focused background agent is the real source of truth — for example, when checking your inbox, searching the web, or running a calculation.

In those cases, the system uses a Tool Cortex Breathing Hold: the main agent gives a short acknowledgement instead of guessing from memory, then delivers the real answer once the tool-backed background work finishes. This prevents the common AI pattern of confidently making up what it should have looked up.
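A hold differs from a normal follow-up in one way: when a tool is the source of truth, there is no substantive first answer, only an acknowledgement. A minimal sketch, with stand-in names for the tool and messages:

```python
import asyncio

_pending: set[asyncio.Task] = set()   # keep task references alive until they finish


async def check_inbox() -> str:
    """Stand-in for any tool-backed lookup (inbox, web search, calculation)."""
    await asyncio.sleep(0.05)
    return "3 unread messages"


async def answer(question: str, needs_tool: bool, thread: list[str]) -> str:
    """With a breathing hold, the first reply acknowledges instead of guessing;
    the real answer arrives once the tool-backed work finishes."""
    if not needs_tool:
        return f"[answer] {question}"

    async def deliver() -> None:
        thread.append(f"[answer] {await check_inbox()}")

    task = asyncio.create_task(deliver())
    _pending.add(task)
    task.add_done_callback(_pending.discard)
    return "[hold] Checking that now."   # short acknowledgement, no fabricated answer
```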

What Users Should Notice

Good background behavior should feel like:

  • "I got help without losing momentum"
  • "The deeper follow-up was worth it"
  • "The system stayed quiet when it had nothing useful to add"

It should not feel like:

  • random extra chatter
  • hidden delays
  • repeated summaries of the same point
  • fake certainty from one shallow answer

Named Cortices

Background agents are not generic. Each is a registered cortex with its own identity, tools, and activation profile:

  • Red Team cortex — challenges unsupported claims, weak assumptions, and comfort-zone rationalization. Uses web search and sequential thinking.
  • Research cortex — gathers evidence, compares sources, synthesizes findings
  • Planning cortex — structures multi-step plans and identifies gaps
  • Workspace cortex — retrieves context from connected systems (inbox, calendar, files)

New cortices can be added through the agent configuration without changing the core system.
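Registering a new cortex as pure configuration might look like this. The registry keys and profile fields are assumptions mirroring the activation criteria above, not Viventium's real schema.

```python
# Illustrative cortex registry; names and values are assumptions for this sketch.
CORTEX_REGISTRY: dict[str, dict] = {
    "red_team": {
        "tools": ["web_search", "sequential_thinking"],
        "confidence_threshold": 0.6,
        "cooldown_seconds": 45,
        "history_window": 6,
    },
}


def register_cortex(name: str, profile: dict) -> None:
    """A new cortex is a registry entry, not a change to the core system."""
    if name in CORTEX_REGISTRY:
        raise ValueError(f"cortex {name!r} already registered")
    CORTEX_REGISTRY[name] = profile
```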

Background Agents vs Workers

Background agents mostly think, check, retrieve, compare, and prepare. They run during a conversation and their insights merge back into the same thread.

Workers mostly execute inside durable, sandboxed environments tied to projects. They can run for hours, produce artifacts, and be paused or taken over.

That separation matters: a lot of value comes from better reasoning before execution even starts.
