Kevin Gomes

Software Engineer / Songwriter

Specializing in UI/UX, REST APIs, Platform Engineering, Agentic AI and Test Automation.



Streaming AI Chat

Overview

A real-time conversational AI interface embedded in the portfolio site, acting as a music portfolio assistant for kevingomesmusic.com. The chatbot uses Anthropic's Claude API with streaming responses to deliver a responsive, engaging user experience while showcasing AI integration capabilities.

Architectural Goals

  • Real-Time Streaming: Deliver token-by-token responses for a natural conversational feel, eliminating wait times for full response generation.
  • Knowledge-Grounded Responses: Constrain the AI to a curated knowledge base, ensuring accurate and relevant answers about Kevin's music portfolio.
  • Seamless Integration: Match the existing site's dark theme and design language while introducing an interactive AI-powered feature.
  • Cost Control: Limit token output and scope the model's behavior to prevent excessive API usage.

System Architecture

┌──────────────────────────────────────────────────┐
│                  Client (Browser)                │
│                                                  │
│  ┌────────────────────────────────────────────┐  │
│  │          Chat Page (React Client)          │  │
│  │                                            │  │
│  │  useChat() ──→ manages messages & status   │  │
│  │  sendMessage() ──→ POST /api/chat          │  │
│  │  Streaming SSE ←── token-by-token render   │  │
│  │  ReactMarkdown ──→ rich content + embeds   │  │
│  └────────────────────────────────────────────┘  │
│                       │                          │
└───────────────────────│──────────────────────────┘
                        │ HTTP POST (SSE response)
┌───────────────────────│──────────────────────────┐
│               Next.js API Route                  │
│                                                  │
│  ┌────────────────────────────────────────────┐  │
│  │            POST /api/chat                  │  │
│  │                                            │  │
│  │  1. Parse incoming UIMessages              │  │
│  │  2. Convert to model messages              │  │
│  │  3. Attach system prompt + knowledge base  │  │
│  │  4. Stream response via Anthropic SDK      │  │
│  └────────────────────────────────────────────┘  │
│                       │                          │
└───────────────────────│──────────────────────────┘
                        │ API Call (streaming)
┌───────────────────────│──────────────────────────┐
│              Anthropic Claude API                │
│                                                  │
│  Model: claude-sonnet-4-20250514                 │
│  Max Output Tokens: 1024                         │
│  System Prompt: Knowledge base + behavior rules  │
└──────────────────────────────────────────────────┘

Core Components

1. API Route (/api/chat)

The server-side handler receives conversation messages, converts them from the UI message format to model-compatible messages, injects the system prompt containing the full knowledge base, and streams the response back using Server-Sent Events (SSE).

  • Message Conversion: Uses convertToModelMessages() to transform the client's UIMessage format (parts-based) into the standard model message format (content-based) expected by streamText.
  • Streaming: streamText() initiates a streaming call to the Anthropic API, and toUIMessageStreamResponse() converts it into an SSE response compatible with the useChat client hook.

2. Knowledge Base & System Prompt

A dedicated module exports the SYSTEM_PROMPT containing:

  • Behavioral instructions: Friendly tone, music-only scope, markdown link formatting for YouTube URLs.
  • Full discography: Node Music EP (3 songs), Another Day, Blue, Not Forsaken, H2O, Heroes Unseen — each with dates, descriptions, teams, and YouTube links.
  • Scope boundaries: The AI politely declines questions outside Kevin's music portfolio.
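A minimal sketch of what such a module might look like; the song titles come from the description above, but the exact wording and structure of the real prompt are assumptions:

```typescript
// lib/knowledge-base.ts — illustrative sketch of the system-prompt module.
const BEHAVIOR_RULES = `
You are a friendly assistant for Kevin Gomes's music portfolio.
- Only answer questions about Kevin's music; politely decline anything else.
- Always format YouTube URLs as markdown links so the client can embed them.
`.trim();

const DISCOGRAPHY = `
Discography (each entry in the real prompt carries dates, descriptions,
teams, and YouTube links):
- Node Music EP (3 songs)
- Another Day
- Blue
- Not Forsaken
- H2O
- Heroes Unseen
`.trim();

export const SYSTEM_PROMPT = `${BEHAVIOR_RULES}\n\n${DISCOGRAPHY}`;
```

Keeping the prompt in its own module means the knowledge base can be updated without touching the route handler.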

3. Chat Page (Client Component)

A client-rendered React component (marked "use client") built with the Vercel AI SDK's useChat hook:

  • State Management: The useChat hook manages the full message lifecycle — sending, streaming status, and message history.
  • Message Rendering: User messages are right-aligned (teal), assistant messages are left-aligned (slate) with markdown rendering via react-markdown.
  • YouTube Embeds: Custom react-markdown components detect YouTube URLs (both markdown links and bare URLs) and render inline <iframe> video players.
  • Starter Prompts: Pre-defined prompt buttons appear when the chat is empty, guiding users into the conversation.
  • Reset: A reset button clears the conversation history and returns to the starter prompts.
  • Loading State: A pulsing dots indicator appears while waiting for the stream to begin.

4. YouTube Embed System

A multi-layer approach ensures YouTube links always render as embedded players:

  • Link Component: Intercepts <a> tags rendered by react-markdown, extracts YouTube video IDs, and replaces them with <iframe> embeds.
  • Text Scanner: Custom <p> and <li> components scan text children for bare YouTube URLs (not wrapped in markdown link syntax) and replace them with embeds.
  • Video ID Extraction: A regex pattern handles youtu.be, youtube.com/watch?v=, youtube.com/embed/, and youtube.com/v/ URL formats.
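The extraction step can be sketched as a small pure function; the regex here is illustrative, not the site's exact pattern, but it covers the four URL shapes listed above:

```typescript
// Extracts an 11-character YouTube video ID from the supported URL formats:
// youtu.be/<id>, youtube.com/watch?v=<id>, youtube.com/embed/<id>,
// and youtube.com/v/<id>. Returns null for non-YouTube URLs.
export function extractYouTubeId(url: string): string | null {
  const pattern =
    /(?:youtu\.be\/|youtube\.com\/(?:watch\?v=|embed\/|v\/))([A-Za-z0-9_-]{11})/;
  const match = url.match(pattern);
  return match ? match[1] : null;
}

// Builds the <iframe> src for an extracted ID.
export function toEmbedUrl(id: string): string {
  return `https://www.youtube.com/embed/${id}`;
}
```

The custom link, paragraph, and list-item components can all share this one extractor, so markdown links and bare URLs go through identical logic.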

Data Flow

  1. User types a message or clicks a starter prompt
  2. useChat sends a POST request to /api/chat with the full conversation history
  3. The API route converts UI messages to model messages and calls streamText with the system prompt
  4. Anthropic's API streams tokens back via SSE
  5. useChat updates the message list in real time as tokens arrive
  6. react-markdown renders the response with custom components (YouTube embeds, styled links)
  7. The chat auto-scrolls to the latest message

Key Libraries

  • AI SDK (ai): Vercel's AI SDK for streamText, message conversion, and streaming response utilities.
  • @ai-sdk/anthropic: Anthropic provider for the AI SDK, connecting to Claude models.
  • @ai-sdk/react: React hooks (useChat) for managing chat state and streaming on the client.
  • react-markdown: Renders AI responses as rich HTML with custom component overrides.

Security Considerations

  • API Key: The ANTHROPIC_API_KEY is stored in .env.local (server-side only), never exposed to the client.
  • Server-Side Execution: All AI API calls happen in the Next.js API route, keeping the Anthropic SDK and credentials on the server.
  • Token Limiting: maxOutputTokens: 1024 prevents runaway responses and controls costs.
  • Scoped Knowledge: The system prompt constrains responses to the music portfolio, reducing risk of generating inappropriate or off-topic content.

Summary

The Streaming AI Chat architecture demonstrates a production-ready pattern for integrating conversational AI into a Next.js application. By combining server-side streaming, a knowledge-grounded system prompt, and rich client-side rendering with YouTube embeds, it delivers an engaging user experience while maintaining security and cost control.