Voice, Chat & Messaging

One continuity layer across the chat web app, real-time voice, Wing Mode, Listen-Only Mode, Telegram text, and Telegram voice.

One System, Different Surfaces

Viventium should not feel like a different assistant every time you switch surfaces.

The same cognitive system is meant to show up across:

  • the chat web app for longer visible reasoning
  • real-time voice calls for spoken conversation
  • Wing Mode inside a live voice call when you want a quieter companion
  • Listen-Only Mode when you want transcription without replies
  • Telegram for mobile text, voice notes, voice replies, reminders, and worker callbacks
  • scheduled delivery when a result should come to you later

Chat Web App

The chat web app is Viventium's main desktop surface.

Use it when you need:

  • longer reasoning
  • visible structure
  • connected accounts
  • uploaded files and tool calls
  • project review
  • drafts, plans, and artifacts you want to inspect closely
  • background-agent follow-through after the first answer

Chat is where the full shape of the system is easiest to see: memory, tools, agents, connected workspaces, scheduling, and worker handoffs all meet in one visible conversation.

Voice Gateway

Voice matters because a lot of important thinking happens faster out loud than through typing.

Viventium's voice surface is built around the Voice Gateway so spoken turns can use the standard Viventium agent pipeline. The goal is not a separate "voice bot." The goal is the same assistant, same continuity, and same follow-through in a real-time spoken surface.

What users should notice:

  • Natural interruption - you can interrupt mid-sentence like a real conversation
  • Shared continuity - voice can use the same memory, background agents, and context story as chat
  • Provider choice - you choose explicitly between local and hosted speech routes
  • Speech-safe output - the voice layer strips or cleans raw URLs, markdown links, citations, code fences, tables, unknown tags, and punctuation fragments before they become spoken audio
  • Provider-aware expression - expressive markers are kept only for providers that support them
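
As a rough illustration of the speech-safe cleanup described above, the Python sketch below removes a few of the listed artifact types with regular expressions. The function name `to_speech_safe` and every pattern in it are assumptions made for illustration, not Viventium's actual rules. The sketch also drops all angle-bracket tags rather than only unknown ones, and it does not handle provider-specific expressive markers.

```python
import re


def to_speech_safe(text: str) -> str:
    """Hypothetical cleanup pass that runs before text is sent to speech synthesis."""
    # Drop fenced code blocks entirely; reading code aloud rarely helps.
    text = re.sub(r"```.*?```", " ", text, flags=re.DOTALL)
    # Keep the label of a markdown link and drop its target.
    text = re.sub(r"\[([^\]]+)\]\([^)]*\)", r"\1", text)
    # Drop numeric citation markers such as [1] or [2, 3].
    text = re.sub(r"\[\d+(?:\s*,\s*\d+)*\]", "", text)
    # Drop raw URLs.
    text = re.sub(r"https?://\S+", "", text)
    # Drop markdown table rows (lines that start with a pipe).
    text = re.sub(r"^[ \t]*\|.*$", "", text, flags=re.MULTILINE)
    # Drop angle-bracket tags (assumption: this sketch keeps no tags at all).
    text = re.sub(r"</?[A-Za-z][^>]*>", "", text)
    # Collapse whitespace, then close gaps left before punctuation.
    text = re.sub(r"\s+", " ", text)
    text = re.sub(r"\s+([.,;:!?])", r"\1", text)
    return text.strip()


print(to_speech_safe("See [the docs](https://example.com/a) for details [1]."))
```

The order of the rules matters in this sketch: markdown links are unwrapped before raw URLs are removed, so the link label survives while the target does not.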

For builders: the current implementation uses a real-time media layer under the Voice Gateway, while the chat web app is built on Viventium's web-chat fork. Those implementation names are useful when debugging or contributing, but the user-facing promise is the Viventium voice and chat experience.

Wing Mode

Wing Mode is a live-call companion mode.

It is for situations where you want Viventium present in the voice call, but not constantly talking. The assistant defaults to silence unless it is clearly addressed, genuinely useful, or there is an urgent reason to speak.

In user terms: Wing Mode is the difference between "answer every sound" and "be a thoughtful partner in the room."

It is not a generic always-on background microphone. It is a call-session state for the voice surface, and it is mutually exclusive with Listen-Only Mode.
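
One way to picture the call-session state described above is a single mode field on the session, which makes Wing Mode and Listen-Only Mode mutually exclusive by construction. The sketch below is a minimal Python model under that assumption. `CallMode`, `CallSession`, and `should_speak` are hypothetical names, and the three boolean signals stand in for whatever the real system uses to decide that it was addressed, useful, or facing something urgent.

```python
from dataclasses import dataclass
from enum import Enum


class CallMode(Enum):
    NORMAL = "normal"
    WING = "wing"
    LISTEN_ONLY = "listen_only"


@dataclass
class CallSession:
    # A single field means the session can never be in two modes at once.
    mode: CallMode = CallMode.NORMAL


def should_speak(session: CallSession, *, addressed: bool, useful: bool, urgent: bool) -> bool:
    """Hypothetical gate that decides whether the assistant replies out loud."""
    if session.mode is CallMode.LISTEN_ONLY:
        return False  # capture only, never reply
    if session.mode is CallMode.WING:
        # Silence is the default; any one of the three signals can break it.
        return addressed or useful or urgent
    return True  # normal voice call: respond to each turn


session = CallSession(mode=CallMode.WING)
print(should_speak(session, addressed=False, useful=False, urgent=False))
print(should_speak(session, addressed=True, useful=False, urgent=False))
```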

Listen-Only Mode

Listen-Only Mode is for transcription and capture without assistant participation.

When it is active, Viventium transcribes and saves the ambient voice record but does not:

  • reply in the moment
  • call tools
  • activate background agents
  • write immediate memory
  • inject the transcript into recall or prompt history as a normal chat turn

That is useful for meetings, brainstorming, or context gathering where you want a record first and assistance later.
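
The restrictions listed above can be read as a per-turn policy: transcription stays on and everything else is switched off. The Python sketch below expresses that reading. `TurnPolicy`, `LISTEN_ONLY`, and `capture` are hypothetical names, and the plain lists stand in for whatever storage the real system uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TurnPolicy:
    transcribe: bool
    reply: bool
    call_tools: bool
    activate_background_agents: bool
    write_immediate_memory: bool
    add_to_prompt_history: bool


# Assumed mapping of the Listen-Only restrictions onto policy flags.
LISTEN_ONLY = TurnPolicy(
    transcribe=True,
    reply=False,
    call_tools=False,
    activate_background_agents=False,
    write_immediate_memory=False,
    add_to_prompt_history=False,
)


def capture(utterance: str, policy: TurnPolicy, record: list, prompt_history: list) -> None:
    """Save the utterance to the voice record; add a chat turn only if the policy allows it."""
    if policy.transcribe:
        record.append(utterance)
    if policy.add_to_prompt_history:
        prompt_history.append(utterance)


record, prompt_history = [], []
capture("Budget review is due Friday.", LISTEN_ONLY, record, prompt_history)
print(record, prompt_history)
```

Keeping the record and the prompt history as separate stores is the design point here: the transcript exists for later use without becoming part of the live conversation.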

Expressive Speech

Different voice providers support different kinds of expression.

  • Cartesia Sonic-3 can preserve supported emotion controls and the documented [laughter] marker, so expressive speech can come through when the selected voice route supports it.
  • xAI voice uses its own speech-style controls, such as pauses or vocal delivery cues, and should not be mixed with Cartesia emotion tags.
  • Local Chatterbox keeps voice local, but does not use Cartesia-style emotion tags.
  • Fallback providers strip unsupported markup so users hear clean speech instead of raw control text.

The important user benefit is simple: Viventium tries to make spoken output sound intentional without letting provider-specific markup leak into the conversation.
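
A minimal sketch of that provider-aware filtering, assuming a lookup table of which bracketed markers each route keeps. The provider keys and the table itself are hypothetical. Only the `[laughter]` marker comes from the text above, and other provider-specific controls are deliberately left out rather than guessed at.

```python
import re

# Hypothetical capability table: bracketed markers each voice route keeps.
SUPPORTED_MARKERS = {
    "cartesia-sonic-3": {"laughter"},
}

MARKER = re.compile(r"\[([a-z_]+)\]")


def prepare_for_provider(text: str, provider: str) -> str:
    """Keep markers the provider supports and strip the rest before synthesis."""
    allowed = SUPPORTED_MARKERS.get(provider, set())

    def keep_or_drop(match: re.Match) -> str:
        return match.group(0) if match.group(1) in allowed else ""

    cleaned = MARKER.sub(keep_or_drop, text)
    # Collapse whitespace and close gaps left before punctuation.
    cleaned = re.sub(r"\s+", " ", cleaned)
    cleaned = re.sub(r"\s+([.,;:!?])", r"\1", cleaned)
    return cleaned.strip()


print(prepare_for_provider("Good one [laughter].", "cartesia-sonic-3"))
print(prepare_for_provider("Good one [laughter].", "fallback"))
```

Any provider missing from the table is treated as a fallback route, so unknown routes get clean text by default.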

Telegram

Telegram is the mobile continuity surface.

It is for:

  • quick text check-ins
  • voice notes from your phone
  • voice replies back to you
  • scheduled briefings and reminders
  • follow-up while away from your desk
  • GlassHive worker callbacks and approval moments
  • background-agent follow-through on mobile

Telegram should not feel like a separate lightweight bot. It should feel like the same Viventium, meeting you on the surface you already have open.

Which Surface Fits Which Job

| Surface | Best for |
| --- | --- |
| Chat web app | long-form reasoning, plans, drafts, project review, files, tool calls |
| Voice | fast thinking out loud, live decisions, walking through ideas |
| Wing Mode | quiet companion presence inside a live voice call |
| Listen-Only Mode | transcription and capture without response or tool use |
| Telegram | mobile continuity, voice notes, voice replies, reminders, briefings |

The Real Promise

The important promise is not "many channels."

The important promise is:

  • one memory story
  • one background-intelligence story
  • one project and follow-through story
  • one system that can meet you where you already are
