Streaming Chat Agent

Time: ~15 min
Stack: Next.js 15, Vercel AI SDK
Servers: stripe, asaas, correios

Real-time commerce chat with token-by-token streaming, built on the Next.js App Router and the Vercel AI SDK. Tool calls appear in the UI as they happen. This is the canonical customer-facing chat pattern.

Streaming architecture (diagram): tokens flow between frontend and backend in real time. The frontend (app/page.tsx, useChat()) sends messages via HTTP POST; the backend (app/api/chat/route.ts, streamText()) runs the agent loop, executes tools automatically, and streams tokens back to the UI over SSE. In the pictured example, the user asks "What's the Correios rate to 01310-100?", the UI shows "calling codespar_ship…" while the correios server is queried, and the answer then streams in.

Prerequisites

npm install @codespar/sdk @codespar/vercel ai @ai-sdk/openai

Backend route

The API route creates a session per request, loads that session's tools, and returns a streaming response. The Vercel adapter handles tool execution automatically, so no manual loop is needed.

app/api/chat/route.ts
import { streamText } from "ai";
import { openai } from "@ai-sdk/openai";
import { CodeSpar } from "@codespar/sdk";
import { getTools } from "@codespar/vercel";

const codespar = new CodeSpar({ apiKey: process.env.CODESPAR_API_KEY! });

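export async function POST(req: Request) {
  const { messages } = await req.json();

  // "user_123" is a placeholder; pass the authenticated user's ID in production.
  const session = await codespar.create("user_123", {
    servers: ["stripe", "asaas", "correios"],
  });

  // Tools returned by the adapter carry their own execute methods.
  const tools = await getTools(session);

  const result = streamText({
    model: openai("gpt-4o"),
    tools,
    maxSteps: 10, // upper bound on model/tool round trips per request
    system: `Commerce assistant for a Brazilian store. Be concise.`,
    messages,
    // Release the session once the response has finished.
    onFinish: async () => { await session.close(); },
  });

  return result.toDataStreamResponse();
}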

This works because the Vercel adapter's getTools returns tools with built-in execute methods, so each tool call runs as soon as the LLM requests it.
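
To make "built-in execute methods" concrete, the sketch below shows roughly what an adapter of this kind might produce: one object per tool whose execute forwards to the session. The SessionLike interface, the execute(name, args) signature, wrapTools, and the cep argument are assumptions for illustration, not the actual @codespar/vercel implementation.

```typescript
// Hypothetical minimal session surface; the real CodeSpar session may differ.
interface SessionLike {
  execute(toolName: string, args: Record<string, unknown>): Promise<unknown>;
}

// Simplified shape of a tool that carries its own executor.
interface ExecutableTool {
  description: string;
  execute(args: Record<string, unknown>): Promise<unknown>;
}

// Wrap each tool name so that calling execute() forwards to the session.
function wrapTools(
  session: SessionLike,
  descriptions: Record<string, string>,
): Record<string, ExecutableTool> {
  const tools: Record<string, ExecutableTool> = {};
  for (const [name, description] of Object.entries(descriptions)) {
    tools[name] = {
      description,
      execute: (args) => session.execute(name, args),
    };
  }
  return tools;
}

// Stub session that records calls instead of contacting a real server.
const recordedCalls: string[] = [];
const stubSession: SessionLike = {
  async execute(toolName, args) {
    recordedCalls.push(`${toolName}:${JSON.stringify(args)}`);
    return { ok: true };
  },
};

const demoTools = wrapTools(stubSession, {
  codespar_ship: "Quote shipping rates (illustrative description)",
});
```

Because every tool already knows how to run itself, the caller only has to hand the tool map to the model runner; nothing in the route needs to dispatch on tool names.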

Frontend component

app/page.tsx
"use client";

import { useChat } from "@ai-sdk/react";

export default function Chat() {
  const { messages, input, handleInputChange, handleSubmit } =
    useChat({ api: "/api/chat" });

  return (
    <div>
      {messages.map((m) => (
        <div key={m.id}>{m.content}</div>
      ))}
      <form onSubmit={handleSubmit}>
        <input value={input} onChange={handleInputChange} />
      </form>
    </div>
  );
}

How it works

  1. The user types a message and useChat sends it to /api/chat.
  2. The backend creates a CodeSpar session with the servers you want available.
  3. streamText starts streaming the LLM response token by token.
  4. When the LLM decides to call a tool, the Vercel AI SDK executes it automatically via session.execute().
  5. Tool results feed back into the stream and the LLM continues generating.
  6. The session closes in onFinish when the response finishes.
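
Steps 3 to 5 amount to a bounded loop: ask the model, run any tool it requests, feed the result back, and stop at a text answer or the step limit. The sketch below simulates that loop with a scripted stand-in for the model. runAgentLoop, ModelTurn, the scripted model, and the stub tool result are all hypothetical; this only approximates what streamText does when maxSteps is set and does not stream.

```typescript
// One model turn: either plain text or a request to call a tool.
type ModelTurn =
  | { type: "text"; text: string }
  | { type: "tool-call"; toolName: string; args: Record<string, unknown> };

type ToolFn = (args: Record<string, unknown>) => Promise<unknown>;

// Stand-in for the LLM: sees the transcript so far and returns the next turn.
type Model = (transcript: string[]) => Promise<ModelTurn>;

async function runAgentLoop(
  model: Model,
  tools: Record<string, ToolFn>,
  userMessage: string,
  maxSteps: number,
): Promise<string[]> {
  const transcript = [`user: ${userMessage}`];
  for (let step = 0; step < maxSteps; step++) {
    const turn = await model(transcript);
    if (turn.type === "text") {
      // A text turn is the final answer, so the loop ends here.
      transcript.push(`assistant: ${turn.text}`);
      return transcript;
    }
    const tool = tools[turn.toolName];
    const result = tool
      ? await tool(turn.args)
      : { error: `unknown tool ${turn.toolName}` };
    // Feed the tool result back so the model can continue.
    transcript.push(`tool ${turn.toolName}: ${JSON.stringify(result)}`);
  }
  transcript.push("assistant: (stopped: step limit reached)");
  return transcript;
}

// Scripted model: requests a shipping quote first, then answers.
const scriptedModel: Model = async (transcript) =>
  transcript.some((line) => line.startsWith("tool "))
    ? { type: "text", text: "SEDEX to 01310-100 costs R$ 18,50." }
    : { type: "tool-call", toolName: "codespar_ship", args: { cep: "01310-100" } };

const demoRun = runAgentLoop(
  scriptedModel,
  { codespar_ship: async () => ({ service: "SEDEX", price: "18,50" }) },
  "What's the Correios rate to 01310-100?",
  10,
);
```

The step limit matters because a model that keeps requesting tools would otherwise never terminate; the limit turns that failure mode into a bounded cost.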

Adding tool call indicators

Show the user when tools are being called. Visible progress makes the wait feel shorter and builds trust.

app/page.tsx (enhanced)
{messages.map((m) => (
  <div key={m.id}>
    {m.content && <p>{m.content}</p>}
    {m.toolInvocations?.map((tool) => (
      <div key={tool.toolCallId}>
        {tool.state === "result"
          ? `✓ ${tool.toolName} completed`
          : `⏳ Calling ${tool.toolName}...`}
      </div>
    ))}
  </div>
))}
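
If the labels grow beyond a single ternary, the text can be moved into a small pure function that is easy to test outside React. The sketch below assumes a local ToolInvocationLike type mirroring the fields used above, with "partial-call" and "call" as the in-progress states; FRIENDLY_NAMES and its entry are hypothetical.

```typescript
// Local stand-in for the tool invocation fields used in the component.
// "result" comes from the snippet above; the other state names are assumed.
interface ToolInvocationLike {
  toolCallId: string;
  toolName: string;
  state: "partial-call" | "call" | "result";
}

// Hypothetical display names; tools not listed fall back to their raw name.
const FRIENDLY_NAMES: Record<string, string> = {
  codespar_ship: "shipping quote",
};

function toolStatusLabel(tool: ToolInvocationLike): string {
  const name = FRIENDLY_NAMES[tool.toolName] ?? tool.toolName;
  return tool.state === "result"
    ? `✓ ${name} completed`
    : `⏳ Calling ${name}...`;
}
```

In the component, the ternary inside the inner map would then become a call to toolStatusLabel(tool).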

Streaming vs status streams

Two different streaming surfaces, often confused:

  • LLM token streaming (this cookbook): streamText from the Vercel AI SDK, or session.sendStream(prompt) from the SDK directly. Streams chat completion tokens as the model generates them.
  • Status streams: session.paymentStatusStream(toolCallId) and session.verificationStatusStream(toolCallId). SSE endpoints under /v1/tool-calls/:id/<thing>/stream that push settlement and KYC outcome events as the underlying provider webhooks fire. Different layer, different lifetime. See /docs/concepts/sse-streaming.
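
Both surfaces travel over SSE, so it can help to see the wire format. The sketch below is a minimal parser for SSE text: events are separated by a blank line and carry event: and data: fields. The SDK's stream methods presumably do this for you; parseSse, the "status" event name, and the sample payload are assumptions, since the page does not specify the event schema.

```typescript
interface SseEvent {
  event: string; // "message" when the block has no event: field
  data: string;
}

// Parse complete events out of a buffer of SSE text. Returns the events plus
// any trailing partial event, which should be prepended to the next chunk.
function parseSse(buffer: string): { events: SseEvent[]; rest: string } {
  const blocks = buffer.replace(/\r\n/g, "\n").split("\n\n");
  const rest = blocks.pop() ?? ""; // last block is incomplete (or empty)
  const events: SseEvent[] = [];
  for (const block of blocks) {
    let event = "message";
    const dataLines: string[] = [];
    for (const line of block.split("\n")) {
      if (line.startsWith(":")) continue; // comment or keep-alive line
      const colon = line.indexOf(":");
      const field = colon === -1 ? line : line.slice(0, colon);
      let value = colon === -1 ? "" : line.slice(colon + 1);
      if (value.startsWith(" ")) value = value.slice(1);
      if (field === "event") event = value;
      if (field === "data") dataLines.push(value);
    }
    if (dataLines.length > 0) {
      events.push({ event, data: dataLines.join("\n") });
    }
  }
  return { events, rest };
}

// Illustrative input: two complete events followed by a partial one.
const sampleSse =
  'event: status\ndata: {"status":"pending"}\n\n' +
  'event: status\ndata: {"status":"settled"}\n\n' +
  "event: sta";
const parsedSse = parseSse(sampleSse);
```

Returning the unparsed remainder is what lets the parser run on arbitrary network chunks: a token stream ends when the model finishes, while a status stream may stay open until a webhook fires, but both arrive in pieces that rarely align with event boundaries.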

Production considerations

  • Session per request. Each /api/chat call creates a new session. For multi-turn conversations where you want to reuse a session, store the session ID client-side and reopen it.
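  • Error handling. Close the session in onFinish, as the route above does. Check that onFinish also fires on the error path in your AI SDK version; if it does not, close the session in an error handler as well.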
  • Rate limiting. Add rate limiting to the API route to prevent abuse.
  • Authentication. Use Clerk or NextAuth to authenticate users before creating sessions. Pass the real userId as the first argument to codespar.create(userId, config) for billing attribution.
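
For the rate-limiting point, a minimal sketch of a per-user fixed-window limiter follows. FixedWindowLimiter, chatLimiter, and the numbers are hypothetical, and the state lives in process memory, so it resets on restart and is not shared across serverless instances; a shared store would be needed for real deployments.

```typescript
// Fixed-window limiter kept in process memory (sketch only).
class FixedWindowLimiter {
  private windows = new Map<string, { start: number; count: number }>();
  private limit: number;
  private windowMs: number;

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true if the request is allowed, false if the key is over its limit.
  // `now` is injectable so the behavior can be tested without waiting.
  allow(key: string, now: number = Date.now()): boolean {
    const current = this.windows.get(key);
    if (!current || now - current.start >= this.windowMs) {
      // First request for this key, or the previous window has expired.
      this.windows.set(key, { start: now, count: 1 });
      return true;
    }
    if (current.count >= this.limit) return false;
    current.count += 1;
    return true;
  }
}

// Example policy (arbitrary numbers): 20 chat requests per user per minute.
const chatLimiter = new FixedWindowLimiter(20, 60_000);
```

In the route, one option is to call chatLimiter.allow(userId) before codespar.create and return a response with status 429 when it yields false, so rejected requests never open a session.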
