
Chat

Desktop's chat is a unified surface across three host modes: human, AI, and agent. It is where a user talks to other people, talks to a generic AI assistant, and talks to a Nimi agent. Same UI shell, three different conversation shapes.

Three Host Modes

| Mode | Who you're talking to | Authority |
| --- | --- | --- |
| Human | Another user | Realm chat thread |
| AI | A generic AI assistant | Runtime via SDK |
| Agent | A specific Nimi agent | Runtime + ConversationAnchor |

The mode determines what each part of the chat shell shows: the target rail (who you are talking to), the canonical conversation shell, the transcript, and the composer.
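As a rough illustration, the host-mode table can be read as a lookup from mode to shell configuration. The sketch below is not Nimi SDK code: `HostMode`, `ShellConfig`, and `shell_for` are hypothetical names, and the `streams` flag is an inference from the Streaming Chat section (AI and agent targets stream from Runtime) rather than a documented field.

```python
from dataclasses import dataclass
from enum import Enum


class HostMode(Enum):
    HUMAN = "human"
    AI = "ai"
    AGENT = "agent"


@dataclass(frozen=True)
class ShellConfig:
    counterpart: str  # who the user is talking to
    authority: str    # which layer owns the conversation
    streams: bool     # assumption: only AI and agent replies stream from Runtime


# Values mirror the host-mode table; the field names are illustrative only.
SHELL_BY_MODE = {
    HostMode.HUMAN: ShellConfig("Another user", "Realm chat thread", False),
    HostMode.AI: ShellConfig("A generic AI assistant", "Runtime via SDK", True),
    HostMode.AGENT: ShellConfig(
        "A specific Nimi agent", "Runtime + ConversationAnchor", True
    ),
}


def shell_for(mode: HostMode) -> ShellConfig:
    """Look up the shell configuration for a host mode."""
    return SHELL_BY_MODE[mode]
```

Keeping the mapping in one table-like structure matches the idea of one UI shell with three conversation shapes: the shell code stays the same and only the looked-up configuration changes.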

Realtime Delivery

Live chat events sync via Socket.IO. New messages, typing indicators, presence, and read state are all delivered as realtime events rather than polled. The realtime path is admitted; chat does not invent its own protocol.
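To make "delivered as events rather than polled" concrete, here is a minimal sketch of a client-side reducer that applies pushed events to local thread state. It deliberately leaves out the Socket.IO transport. The event kinds (`message`, `typing`, `presence`, `read`) and payload fields are hypothetical stand-ins for the four event families named above, not the actual wire format.

```python
from dataclasses import dataclass, field


@dataclass
class ThreadState:
    messages: list = field(default_factory=list)
    typing: set = field(default_factory=set)
    online: set = field(default_factory=set)
    read_up_to: dict = field(default_factory=dict)


def apply_event(state: ThreadState, kind: str, payload: dict) -> None:
    """Apply one pushed event to local state; unknown kinds are rejected."""
    if kind == "message":
        state.messages.append(payload["text"])
        state.typing.discard(payload["user"])  # sending a message ends typing
    elif kind == "typing":
        state.typing.add(payload["user"])
    elif kind == "presence":
        if payload["online"]:
            state.online.add(payload["user"])
        else:
            state.online.discard(payload["user"])
    elif kind == "read":
        state.read_up_to[payload["user"]] = payload["message_index"]
    else:
        raise ValueError(f"unknown event kind: {kind}")
```

In a real client, a function like this would be called from the realtime event handlers, so the UI updates only when the server pushes something.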

Streaming Chat

When the chat target is AI or agent, the assistant message streams from Runtime under the streaming contract.

| Property | Value |
| --- | --- |
| Mode | Mode A (text/voice with explicit done=true terminal frame) |
| Bubble rendering | Incremental as chunks arrive |
| Mid-stream stop | Available during streaming |
| Partial content | Preserved on interrupt |
| Backpressure | End-to-end via SDK |

A user who clicks "stop" mid-stream gets the partial reply preserved; the next interaction starts cleanly.
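The stop-and-preserve behavior can be sketched as a small consumer loop. This is an illustrative model, not the Runtime streaming contract itself: `Frame`, `render_stream`, and the status strings are hypothetical, and the only detail taken from the text is that a stream ends with an explicit `done=true` frame and that partial content survives an interrupt.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple


@dataclass(frozen=True)
class Frame:
    text: str
    done: bool = False  # terminal frame marker, as in Mode A


def render_stream(
    frames: Iterable[Frame], should_stop: Callable[[], bool]
) -> Tuple[str, str]:
    """Accumulate chunks until done=True arrives or the user stops.

    Returns (content, status), where status is "completed",
    "interrupted", or "truncated" (stream ended with no terminal frame).
    """
    parts: List[str] = []
    for frame in frames:
        if should_stop():
            # Keep whatever has been rendered so far.
            return "".join(parts), "interrupted"
        parts.append(frame.text)
        if frame.done:
            return "".join(parts), "completed"
    return "".join(parts), "truncated"
```

Because the consumer pulls frames one at a time, a lazy `frames` iterable is read no faster than it is rendered. This is only a loose local analogy to the end-to-end backpressure the SDK is described as providing.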

Turn Lifecycle Hook Points

Desktop chat exposes admitted hook points so mods can react at each phase of a turn:

| Hook point | Fires |
| --- | --- |
| pre-policy | Before policy decisions are applied |
| pre-model | Before the model call happens |
| post-state | After state has been updated |
| pre-commit | Before the commit lands |

Mods that register against these hooks (under their admitted allowlist) get typed events. Free-form interception is not admitted; the hook points are what the mod surface exposes.
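One way to picture allowlisted hook registration is a registry that accepts only the four named hook points and checks each mod against its allowlist. The sketch below is hypothetical throughout: `HookRegistry`, its method names, and the dictionary event shape are assumptions for illustration, not the Desktop mod API.

```python
from collections import defaultdict
from typing import Callable, Dict, FrozenSet

# The four hook points named in the table above.
HOOK_POINTS = ("pre-policy", "pre-model", "post-state", "pre-commit")


class HookRegistry:
    def __init__(self, allowlists: Dict[str, FrozenSet[str]]):
        # allowlists maps a mod id to the hook points it may use (assumed shape).
        self._allowlists = allowlists
        self._handlers = defaultdict(list)

    def register(self, mod_id: str, hook: str, handler: Callable[[dict], None]) -> None:
        if hook not in HOOK_POINTS:
            # Free-form interception is not admitted.
            raise ValueError(f"not an admitted hook point: {hook}")
        if hook not in self._allowlists.get(mod_id, frozenset()):
            raise PermissionError(f"{mod_id} is not allowlisted for {hook}")
        self._handlers[hook].append(handler)

    def fire(self, hook: str, event: dict) -> int:
        """Deliver an event to every handler on a hook; return the handler count."""
        handlers = self._handlers[hook]
        for handler in handlers:
            handler(event)
        return len(handlers)
```

A mod allowlisted only for `pre-model` could register a handler there, while an attempt to register on `pre-commit` or on an arbitrary name would be rejected at registration time rather than silently ignored.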

Reader Scenario: Talking To An Agent

You open chat, target your agent, and start typing.

  1. Target rail. You select your agent as the chat target. The conversation shell resolves the ConversationAnchor for (your_agent_id, this_conversation_id).
  2. Compose. You type. The composer shows typed input shape.
  3. Send. The turn is submitted. Runtime's RuntimeAgentService accepts the turn under the agent's Chat Track.
  4. Stream begins. The assistant bubble shows incremental content as Mode A chunks arrive.
  5. Mid-stream stop. You decide to stop early. The streaming contract preserves the partial reply.
  6. Realm chat thread. The turn is recorded in the canonical chat thread (Realm R-CHAT-*).

The agent's identity is canonical Realm truth; the conversation continuity is the runtime-owned anchor; the streaming behavior is admitted contract; the thread is canonical chat history.
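Step 1 of the scenario keys conversation continuity on the pair (agent id, conversation id). A minimal sketch of that lookup follows, assuming the anchor is created on first resolve and reused afterwards. That create-on-first-use behavior, the `AnchorStore` class, and the anchor's fields are assumptions made for illustration, since the real anchor is owned by Runtime.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class ConversationAnchor:
    agent_id: str
    conversation_id: str


class AnchorStore:
    """Hypothetical stand-in for the runtime-owned anchor lookup."""

    def __init__(self) -> None:
        self._anchors: Dict[Tuple[str, str], ConversationAnchor] = {}

    def resolve(self, agent_id: str, conversation_id: str) -> ConversationAnchor:
        # The same (agent, conversation) pair always yields the same anchor.
        key = (agent_id, conversation_id)
        if key not in self._anchors:
            self._anchors[key] = ConversationAnchor(agent_id, conversation_id)
        return self._anchors[key]
```

The point of the pair key is that the same agent in two different conversations gets two anchors, while reopening one conversation returns to the same anchor.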

Reader Scenario: Group Chat With An Agent Slot

You are in a Realm group chat with humans and an agent slot.

  1. Group thread. Realm R-CHAT-* admits GROUP substrate.
  2. Agent author validation. When the agent posts, Realm validates the agent slot binding. The anti-spoof check happens before the message commits.
  3. Members see typed agent author. Humans cannot impersonate the agent; the agent cannot post outside its admitted slot.
  4. Streaming for the agent message. The agent's reply streams into the group thread.

The anti-spoof check is enforced at the protocol level. If a malicious actor tries to author messages "from" the agent without a slot binding, the attempt fails closed.
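"Fails closed" means that a missing or mismatched binding rejects the post instead of letting it through. The sketch below models that with a per-agent slot credential checked before the message is appended. The credential, `GroupThread`, and `SpoofRejected` are hypothetical: the source says only that Realm validates the slot binding, not how the binding is represented.

```python
import hmac
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


class SpoofRejected(Exception):
    """Raised when an agent-authored post lacks a valid slot binding."""


@dataclass
class GroupThread:
    # Assumed shape: agent id -> credential bound to that agent's slot.
    slot_bindings: Dict[str, str]
    messages: List[Tuple[str, str]] = field(default_factory=list)

    def post_as_agent(self, agent_id: str, slot_token: Optional[str], text: str) -> None:
        expected = self.slot_bindings.get(agent_id)
        # Fail closed: any missing or mismatched value rejects the post.
        if (
            expected is None
            or slot_token is None
            or not hmac.compare_digest(expected.encode(), slot_token.encode())
        ):
            raise SpoofRejected(f"no valid slot binding for {agent_id}")
        # The check runs before the commit, so rejected posts never land.
        self.messages.append((agent_id, text))
```

Putting validation ahead of the append mirrors the ordering in step 2: a rejected post leaves the thread unchanged.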

What Desktop Chat Does Not Do

| Concern | Owned by |
| --- | --- |
| Embodiment / avatar visuals | Avatar app (Desktop chat is no longer a Live2D / VRM carrier) |
| Memory authority | Cognition + Runtime memory bank scopes |
| Canonical thread truth | Realm chat |
| Turn execution authority | Runtime agent service |
| Streaming semantics | Runtime streaming contract |

If the user wants embodiment, they go to Avatar. Desktop chat may show a non-carrier presentation projection (e.g., an expression indicator), but the chat surface is no longer the Live2D/VRM carrier.

Source Basis

Nimi AI open world platform documentation.