ATLVS Technologies · Blog

Streaming Claude in the console: the AI assistant, grounded in your workspace

2026-04-01 · 7 min read · ATLVS Technologies team

The AI assistant shipped with three design constraints we weren't willing to negotiate: streaming (responses appear token-by-token, not in a 15-second block), grounded (it can only see your org's data, enforced by Postgres row-level security, RLS), and auditable (every conversation persists to Postgres with model, token count, and cost).

We picked Claude because the Sonnet 4.6 → Opus 4.7 range maps cleanly to 'fast and cheap' vs. 'deep reasoning' — and the Anthropic SDK streaming transport is the cleanest we've used.

Streaming via SSE

The route at /api/v1/ai/chat holds open an SSE connection and forwards Anthropic's streaming deltas straight to the client. Time to first token on a short prompt is under 500ms. On a multi-thousand-token response, the first token still appears in that window, and the remaining tokens arrive as they're generated.
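To make the forwarding step concrete, here is a minimal sketch of turning upstream streaming events into SSE frames. It is not our route handler: the StreamEvent shape is an assumed subset of the events Anthropic's Messages API streams (content_block_delta carrying a text_delta, then message_stop), and the names toSseFrame and forwardAsSse and the SSE event names delta and done are hypothetical.

```typescript
// Assumed subset of the upstream streaming event shape (hypothetical type name).
interface StreamEvent {
  type: string;
  delta?: { type: string; text?: string };
}

// Convert one upstream event into an SSE frame, or null if the client
// does not need to see it (pings, block start/stop, and so on).
function toSseFrame(event: StreamEvent): string | null {
  const delta = event.delta;
  if (event.type === "content_block_delta" && delta && delta.type === "text_delta") {
    // JSON-encode the payload so newlines in the text cannot break SSE framing.
    return "event: delta\ndata: " + JSON.stringify({ text: delta.text || "" }) + "\n\n";
  }
  if (event.type === "message_stop") {
    return "event: done\ndata: {}\n\n";
  }
  return null;
}

// Forward frames as soon as each upstream event arrives, without buffering
// the whole response. The caller writes each yielded string to the open
// text/event-stream response.
async function* forwardAsSse(events: AsyncIterable<StreamEvent>): AsyncGenerator<string> {
  for await (const event of events) {
    const frame = toSseFrame(event);
    if (frame !== null) {
      yield frame;
    }
  }
}
```

Because nothing is buffered, the first frame goes out as soon as the first text delta arrives, which is why response length does not affect how quickly the first token shows up.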

Persistence: ai_conversations and ai_messages

Every conversation lives in two RLS-scoped tables. Switching between conversations is instant. Conversation forking is on the roadmap.

create table ai_conversations (
  id uuid primary key default gen_random_uuid(),
  org_id uuid not null references orgs(id),
  user_id uuid not null references auth.users(id),
  title text,
  model text not null default 'claude-sonnet-4-6',
  created_at timestamptz not null default now()
);
-- RLS: select/insert/update/delete where is_org_member(org_id)

Grounded in your org, not leaked across tenants

The assistant can call tools that read production data. Every one of those tools runs under the requesting user's session, so RLS applies. Even if the model tried to fetch another org's data, Postgres would return zero rows. This is not 'fine-tuning our prompt to be careful.' It's the database saying no.
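The behavior described above can be illustrated with a small in-memory stand-in. This is only a simulation of what the RLS policy does inside Postgres, not how the tools are implemented; Session, VendorRow, isOrgMember, and readVendors are hypothetical names, and the membership check loosely mirrors the is_org_member(org_id) policy shown earlier.

```typescript
// Hypothetical session: the orgs the requesting user belongs to.
interface Session {
  userId: string;
  memberOf: string[];
}

// Hypothetical row in an org-scoped table.
interface VendorRow {
  orgId: string;
  name: string;
}

// Stand-in for the is_org_member(org_id) policy check.
function isOrgMember(session: Session, orgId: string): boolean {
  return session.memberOf.indexOf(orgId) !== -1;
}

// Stand-in for a tool query. The model may ask for any org id, but the
// policy filter is applied to every row regardless of what was requested.
function readVendors(session: Session, table: VendorRow[], requestedOrgId: string): VendorRow[] {
  return table.filter(function (row) {
    return row.orgId === requestedOrgId && isOrgMember(session, row.orgId);
  });
}
```

A request for an org the user does not belong to matches no rows, so the tool returns an empty result rather than an error the model could route around.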

Drafting templates

The common workflows don't need a chat — they need a form. We shipped drafting templates for advancing emails, vendor RFPs, incident summary reports, and production schedules. Pick a project, click the template, and Claude drafts from the actual data: show dates, crew, vendors, schedule.

Rate limits + costs

/api/v1/ai/* is behind the ai rate bucket in middleware — no runaway costs, no abuse. Every message writes to audit_log with the model, token count, and estimated cost. Professional includes 200K tokens/month; Enterprise is custom.
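As a rough illustration of the two mechanisms above, here is a sketch of a token-bucket limiter and a per-message cost estimate. Both are assumptions for illustration: the post does not say the ai bucket is a token bucket, RateBucket and estimateCostUsd are hypothetical names, and the per-million-token rates are parameters the caller supplies, not real prices.

```typescript
// Hypothetical in-memory token bucket. A real deployment would likely keep
// this state in a shared store so every server instance sees the same bucket.
class RateBucket {
  private capacity: number;
  private refillPerSec: number;
  private tokens: number;
  private lastMs: number;

  constructor(capacity: number, refillPerSec: number, nowMs: number) {
    this.capacity = capacity;
    this.refillPerSec = refillPerSec;
    this.tokens = capacity; // start full
    this.lastMs = nowMs;
  }

  // Returns true and consumes one token if the request is allowed.
  tryTake(nowMs: number): boolean {
    const elapsedSec = Math.max(0, nowMs - this.lastMs) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSec * this.refillPerSec);
    this.lastMs = nowMs;
    if (this.tokens < 1) {
      return false;
    }
    this.tokens -= 1;
    return true;
  }
}

// Caller-supplied rates in USD per million tokens (placeholders, not real prices).
interface Rates {
  inputPerMTok: number;
  outputPerMTok: number;
}

// Estimated cost of one message, suitable for writing next to the token counts.
function estimateCostUsd(inputTokens: number, outputTokens: number, rates: Rates): number {
  return (inputTokens * rates.inputPerMTok + outputTokens * rates.outputPerMTok) / 1e6;
}
```

Passing the clock in as an argument keeps the limiter deterministic to test; the middleware would call tryTake(Date.now()) and reject the request when it returns false.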

What it's not

It's not a replacement for your producer. It gives the producer you already have more reach. Use it to draft, summarize, surface, reconcile — not to decide. The AI will cheerfully hallucinate a vendor contact if you let it. Always check.

Model switching

Toggle Sonnet 4.6 (fast, cheap, great at draft + classify) vs. Opus 4.7 (deep reasoning, proposal drafting, contract review) per conversation. Opus is Enterprise-only today; it'll open up to Professional in a future release as price comes down.
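The plan gate described above might look something like the following sketch. The names Plan, ModelChoice, and resolveModel are hypothetical, the Sonnet identifier is the default from the ai_conversations table, and the Opus identifier is an assumed placeholder rather than a confirmed model ID. Falling back to Sonnet instead of rejecting the request is also an assumption.

```typescript
type Plan = "professional" | "enterprise";
type ModelChoice = "sonnet" | "opus";

const MODEL_IDS: Record<ModelChoice, string> = {
  sonnet: "claude-sonnet-4-6", // default from the ai_conversations table
  opus: "claude-opus-4-7", // assumed identifier, for illustration only
};

// Resolve the model for a conversation. Opus is Enterprise-only, so a
// Professional conversation that asks for it falls back to Sonnet.
function resolveModel(plan: Plan, requested: ModelChoice): string {
  if (requested === "opus" && plan !== "enterprise") {
    return MODEL_IDS.sonnet;
  }
  return MODEL_IDS[requested];
}
```

Keeping the gate in one function means opening Opus to Professional later would be a one-line change.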

Claude AI production · AI for event production · streaming AI assistant · Anthropic SDK · RLS-scoped AI

Run Your Next Show on ATLVS Technologies