Guides

Building an AI chat interface in Next.js with open-source tools

This guide details the implementation of a production-ready AI chat interface in Next.js. You will learn how to configure streaming responses using the Vercel AI SDK, secure your API keys via Server Components, and implement rate limiting to manage costs and prevent abuse.

35 minutes · 5 steps
Step 1: Environment and Dependency Configuration

Install the SDKs for AI streaming and rate limiting. Store your OpenAI and Upstash credentials in a .env.local file so they are never exposed in the client-side bundle.

Terminal
npm install ai openai @upstash/ratelimit @upstash/redis
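A minimal .env.local might look like the following. The Upstash variable names are the ones Redis.fromEnv() reads in the rate-limiting step; the values shown are placeholders, not real credentials:

```
OPENAI_API_KEY=sk-<your-openai-key>
UPSTASH_REDIS_REST_URL=https://<your-db>.upstash.io
UPSTASH_REDIS_REST_TOKEN=<your-upstash-token>
```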

⚠ Common Pitfalls

  • Accidentally prefixing AI keys with NEXT_PUBLIC_, which exposes them to the browser.
Step 2: Create a Streaming Route Handler

Define a POST route in the App Router that initializes the OpenAI client and uses the Vercel AI SDK to stream the response. Placing the handler at app/api/chat/route.ts follows the App Router convention; exporting runtime = 'edge' opts it into the Edge Runtime for lower latency.

app/api/chat/route.ts
import OpenAI from 'openai';
import { OpenAIStream, StreamingTextResponse } from 'ai';

// Reads OPENAI_API_KEY from the server environment; never shipped to the client.
const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

// Opt into the Edge Runtime for lower latency and streaming-friendly limits.
export const runtime = 'edge';

export async function POST(req: Request) {
  const { messages } = await req.json();

  // Request a streamed completion from OpenAI.
  const response = await openai.chat.completions.create({
    model: 'gpt-3.5-turbo',
    stream: true,
    messages,
  });

  // Pipe the token stream back to the client as it arrives.
  const stream = OpenAIStream(response);
  return new StreamingTextResponse(stream);
}

⚠ Common Pitfalls

  • Exceeding the default 10s execution limit on Vercel hobby plans; use the 'edge' runtime to bypass this.
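The handler above trusts req.json() blindly. As a sketch of defensive input handling (the Message type and isChatPayload guard below are illustrative helpers, not part of the AI SDK), you could validate the expected { messages: [...] } shape and return a 400 response before forwarding anything to OpenAI:

```typescript
// Hypothetical shape of the request body the route expects.
type Message = { role: 'user' | 'assistant' | 'system'; content: string };

// Illustrative type guard: true only for a well-formed chat payload.
function isChatPayload(body: unknown): body is { messages: Message[] } {
  if (typeof body !== 'object' || body === null) return false;
  const messages = (body as { messages?: unknown }).messages;
  return (
    Array.isArray(messages) &&
    messages.every(
      (m) =>
        typeof m === 'object' &&
        m !== null &&
        ['user', 'assistant', 'system'].includes((m as Message).role) &&
        typeof (m as Message).content === 'string'
    )
  );
}
```

Inside the handler, a failed check would short-circuit with new Response('Bad Request', { status: 400 }).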
Step 3: Implement Rate Limiting Middleware

Protect your AI endpoint from automated abuse by implementing a rate limit using Upstash Redis. This logic should be placed directly in the route handler or a dedicated middleware file to check the identifier (IP address) before processing the AI request.

app/api/chat/route.ts
import { Ratelimit } from '@upstash/ratelimit';
import { Redis } from '@upstash/redis';

// Allow 5 requests per rolling 60-second window per identifier.
const ratelimit = new Ratelimit({
  redis: Redis.fromEnv(),
  limiter: Ratelimit.slidingWindow(5, '60 s'),
});

// Inside the POST handler, before calling OpenAI:
// x-forwarded-for may hold a comma-separated chain; the first entry is the client.
const ip = req.headers.get('x-forwarded-for')?.split(',')[0]?.trim() ?? '127.0.0.1';
const { success } = await ratelimit.limit(ip);
if (!success) return new Response('Too Many Requests', { status: 429 });

⚠ Common Pitfalls

  • Hardcoding the IP identifier instead of using request headers, which causes all users to share the same limit.
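Conceptually, Ratelimit.slidingWindow(5, '60 s') keeps a rolling count of recent requests per identifier. A minimal in-memory sketch of the same idea (illustrative only — the real Upstash limiter stores state in Redis so it works across serverless instances):

```typescript
// Illustrative limiter: allow `limit` hits per rolling `windowMs` per key.
class SlidingWindowLimiter {
  private hits = new Map<string, number[]>();

  constructor(private limit: number, private windowMs: number) {}

  // Returns true if the request identified by `key` is allowed at time `now` (ms).
  allow(key: string, now: number): boolean {
    // Keep only timestamps still inside the window.
    const recent = (this.hits.get(key) ?? []).filter((t) => now - t < this.windowMs);
    if (recent.length >= this.limit) {
      this.hits.set(key, recent);
      return false; // budget exhausted for this window
    }
    recent.push(now);
    this.hits.set(key, recent);
    return true;
  }
}
```

The per-key map is why hardcoding the identifier is so damaging: every visitor would drain a single shared budget.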
Step 4: Develop the Chat Interface Component

Use the useChat hook from the AI SDK to manage the chat state, input handling, and automatic message rendering. This hook abstracts the complexity of manual fetch calls and stream parsing.

components/Chat.tsx
'use client';

import { useChat } from 'ai/react';

export default function Chat() {
  // useChat manages message state, input binding, and the streaming fetch to /api/chat.
  const { messages, input, handleInputChange, handleSubmit } = useChat();

  return (
    <div className="flex flex-col w-full max-w-md py-24 mx-auto stretch">
      {messages.map(m => (
        <div key={m.id} className="whitespace-pre-wrap">
          {m.role === 'user' ? 'User: ' : 'AI: '}
          {m.content}
        </div>
      ))}
      <form onSubmit={handleSubmit}>
        <input
          className="fixed bottom-0 w-full max-w-md p-2 mb-8 border border-gray-300 rounded shadow-xl"
          value={input}
          onChange={handleInputChange}
        />
      </form>
    </div>
  );
}

⚠ Common Pitfalls

  • Forgetting the 'use client' directive, which causes the hook to fail in Server Components.
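Part of what useChat abstracts is appending each streamed token to the trailing assistant message and re-rendering. A hand-rolled sketch of that state update (the Message type and appendToken helper are illustrative, not the SDK's internals):

```typescript
// Illustrative message shape, matching what the component renders.
type Message = { id: string; role: 'user' | 'assistant'; content: string };

// Pure update: append a streamed token to the trailing assistant message,
// or start a new assistant message if the last one belongs to the user.
function appendToken(messages: Message[], token: string): Message[] {
  const last = messages[messages.length - 1];
  if (last && last.role === 'assistant') {
    return [...messages.slice(0, -1), { ...last, content: last.content + token }];
  }
  return [...messages, { id: String(messages.length + 1), role: 'assistant', content: token }];
}
```

Because the update is immutable (new array, new object), React sees a changed reference and re-renders on every token.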
Step 5: Configure Edge Runtime and Metadata

Keep the runtime consistent between the chat page and the API route by exporting runtime = 'edge' on the page as well. Also define SEO metadata so the chat tool is discoverable.

app/chat/page.tsx
export const runtime = 'edge';
export const metadata = {
  title: 'AI Assistant - Next.js',
  description: 'Real-time AI chat powered by Next.js and OpenAI',
};

⚠ Common Pitfalls

  • Inconsistent runtime configurations between the API route and the page, leading to unexpected cold start behaviors.

What you built

You have successfully implemented a secure, rate-limited AI streaming interface. By using the Edge Runtime and the Vercel AI SDK, you ensure low latency and a smooth user experience. Next steps include integrating Auth.js to provide user-specific rate limits and persistent chat history using Prisma or Drizzle.