LlamaIndex Adapter
LlamaIndex is a framework for building LLM-powered applications that helps you ingest, structure, and access private or domain-specific data. LlamaIndex.TS offers the core features of LlamaIndex for Python in popular runtimes like Node.js (official support), Vercel Edge Functions (experimental), and Deno (experimental).
Installation
pnpm add @ai-sdk/llamaindex llamaindex
npm install @ai-sdk/llamaindex llamaindex
yarn add @ai-sdk/llamaindex llamaindex
llamaindex is a required peer dependency.
Features
- Transform LlamaIndex ChatEngine and QueryEngine streams to AI SDK UIMessageStream
- Seamless integration with AI SDK UI components like useCompletion
- Support for RAG (Retrieval Augmented Generation) workflows
- Compatible with LlamaIndex’s document processing and indexing capabilities
Example: Completion
Here is a basic example that uses both the AI SDK and LlamaIndex together with the Next.js App Router.
The @ai-sdk/llamaindex package takes the stream result from calling the chat method on a LlamaIndex ChatEngine or the query method on a LlamaIndex QueryEngine and pipes the text to the client.
import { OpenAI, SimpleChatEngine } from 'llamaindex';
import { toUIMessageStream } from '@ai-sdk/llamaindex';
import { createUIMessageStreamResponse } from 'ai';
export const maxDuration = 60;
export async function POST(req: Request) {
const { prompt } = await req.json();
const llm = new OpenAI({ model: 'gpt-4o' });
const chatEngine = new SimpleChatEngine({ llm });
const stream = await chatEngine.chat({
message: prompt,
stream: true,
});
return createUIMessageStreamResponse({
stream: toUIMessageStream(stream),
});
}
Then, we use the AI SDK’s useCompletion hook in the page component to handle the completion:
'use client';
import { useCompletion } from '@ai-sdk/react';
export default function Chat() {
const { completion, input, handleInputChange, handleSubmit } =
useCompletion();
return (
<div>
{completion}
<form onSubmit={handleSubmit}>
<input value={input} onChange={handleInputChange} />
</form>
</div>
);
}
Example: RAG with QueryEngine
LlamaIndex excels at building RAG applications. Here’s an example using a QueryEngine with document indexing:
import {
OpenAI,
VectorStoreIndex,
SimpleDirectoryReader,
} from 'llamaindex';
import { toUIMessageStream } from '@ai-sdk/llamaindex';
import { createUIMessageStreamResponse } from 'ai';
export const maxDuration = 60;
// Initialize once (consider caching in production)
let queryEngine: any = null;
async function getQueryEngine() {
if (!queryEngine) {
// Load documents from a directory
const reader = new SimpleDirectoryReader();
const documents = await reader.loadData('./data');
// Create index from documents
const index = await VectorStoreIndex.fromDocuments(documents);
// Create query engine
queryEngine = index.asQueryEngine({
llm: new OpenAI({ model: 'gpt-4o' }),
});
}
return queryEngine;
}
export async function POST(req: Request) {
const { prompt } = await req.json();
const engine = await getQueryEngine();
const stream = await engine.query({
query: prompt,
stream: true,
});
return createUIMessageStreamResponse({
stream: toUIMessageStream(stream),
});
}
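Note that the lazy initialization above can run the expensive index setup twice if two requests arrive before the first initialization finishes. Caching the in-flight promise, rather than the resolved engine, avoids that. A minimal sketch of the pattern; buildEngine is a placeholder for the document loading and VectorStoreIndex setup, not LlamaIndex code:

```typescript
// Cache the promise so concurrent callers share one initialization.
let enginePromise: Promise<{ query: (q: string) => string }> | null = null;

async function buildEngine() {
  // Stand-in for loading documents and creating the index.
  return { query: (q: string) => `answer to ${q}` };
}

function getEngine() {
  if (!enginePromise) enginePromise = buildEngine();
  return enginePromise;
}

// Two concurrent calls resolve to the same engine instance.
(async () => {
  const [e1, e2] = await Promise.all([getEngine(), getEngine()]);
  console.log(e1 === e2); // true
})();
```

Because the promise is stored synchronously on first call, every later caller awaits the same initialization instead of starting a new one.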
Example: Chat with Context
Build a conversational interface with document context:
import {
OpenAI,
VectorStoreIndex,
ContextChatEngine,
SimpleDirectoryReader,
} from 'llamaindex';
import { toUIMessageStream } from '@ai-sdk/llamaindex';
import { createUIMessageStreamResponse } from 'ai';
export const maxDuration = 60;
let chatEngine: any = null;
async function getChatEngine() {
if (!chatEngine) {
const reader = new SimpleDirectoryReader();
const documents = await reader.loadData('./data');
const index = await VectorStoreIndex.fromDocuments(documents);
// Create a chat engine with context
chatEngine = new ContextChatEngine({
retriever: index.asRetriever(),
llm: new OpenAI({ model: 'gpt-4o' }),
});
}
return chatEngine;
}
export async function POST(req: Request) {
const { prompt } = await req.json();
const engine = await getChatEngine();
const stream = await engine.chat({
message: prompt,
stream: true,
});
return createUIMessageStreamResponse({
stream: toUIMessageStream(stream),
});
}
Use with the useCompletion hook on the client:
'use client';
import { useCompletion } from '@ai-sdk/react';
export default function ChatWithContext() {
const { completion, input, handleInputChange, handleSubmit, isLoading } =
useCompletion({
api: '/api/chat',
});
return (
<div>
<div className="response">
{completion || 'Ask a question about your documents...'}
</div>
<form onSubmit={handleSubmit}>
<input
value={input}
onChange={handleInputChange}
disabled={isLoading}
placeholder="What would you like to know?"
/>
<button type="submit" disabled={isLoading}>
{isLoading ? 'Thinking...' : 'Ask'}
</button>
</form>
</div>
);
}
API Reference
toUIMessageStream(stream)
Converts a LlamaIndex ChatEngine or QueryEngine stream to an AI SDK UIMessageStream.
import { toUIMessageStream } from '@ai-sdk/llamaindex';
import { createUIMessageStreamResponse } from 'ai';
const stream = await chatEngine.chat({
message: prompt,
stream: true,
});
return createUIMessageStreamResponse({
stream: toUIMessageStream(stream),
});
Parameters:
- stream: AsyncIterable - Stream from a LlamaIndex ChatEngine or QueryEngine
Returns: ReadableStream<UIMessageChunk>
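Conceptually, the adapter consumes LlamaIndex's async-iterable stream and re-exposes it as a ReadableStream. The self-contained sketch below illustrates that wrapping; the { delta } chunk shape and both function names are illustrative stand-ins, not the adapter's actual implementation or types:

```typescript
// Stand-in for a LlamaIndex streaming response: an async iterable of
// text-delta chunks. The { delta } shape is illustrative only.
async function* fakeLlamaIndexStream() {
  yield { delta: 'Hello, ' };
  yield { delta: 'world!' };
}

// Wrap any async iterable in a ReadableStream, pulling one chunk at a time.
function toReadableStream<T>(iterable: AsyncIterable<T>): ReadableStream<T> {
  const iterator = iterable[Symbol.asyncIterator]();
  return new ReadableStream<T>({
    async pull(controller) {
      const { value, done } = await iterator.next();
      if (done) controller.close();
      else controller.enqueue(value);
    },
  });
}

async function main() {
  const reader = toReadableStream(fakeLlamaIndexStream()).getReader();
  let text = '';
  for (;;) {
    const { value, done } = await reader.read();
    if (done) break;
    text += value.delta;
  }
  console.log(text); // Hello, world!
}
main();
```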
Integration with LlamaIndex Features
The adapter works seamlessly with LlamaIndex’s powerful features:
Document Loaders
- Load documents from various sources (files, URLs, databases)
- Support for multiple file formats (PDF, Markdown, JSON, etc.)
- Custom document readers
Vector Stores
- In-memory vector storage
- Integration with external vector databases
- Efficient similarity search
Retrievers
- Vector similarity retrieval
- Keyword-based retrieval
- Hybrid retrieval strategies
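At its core, vector similarity retrieval ranks stored documents by the similarity of their embeddings to the query embedding. The toy sketch below shows top-k cosine ranking over precomputed vectors; it is illustrative only and uses none of LlamaIndex's APIs, which handle embedding, storage, and ranking for you:

```typescript
// Toy top-k retrieval by cosine similarity over precomputed embeddings.
type Doc = { id: string; embedding: number[] };

function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

function topK(query: number[], docs: Doc[], k: number): Doc[] {
  return [...docs]
    .sort((x, y) => cosine(query, y.embedding) - cosine(query, x.embedding))
    .slice(0, k);
}

const docs: Doc[] = [
  { id: 'a', embedding: [1, 0] },
  { id: 'b', embedding: [0, 1] },
  { id: 'c', embedding: [0.9, 0.1] },
];
console.log(topK([1, 0], docs, 2).map(d => d.id).join(',')); // a,c
```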
Query Engines
- Simple query engine for basic RAG
- Sub-question query engine for complex queries
- Custom query engines
Chat Engines
- Simple chat engine
- Context chat engine with retrieval
- Condense question chat engine
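A condense-question engine first rewrites a follow-up message into a standalone question and then sends that to a query engine. The sketch below shows only the control flow with placeholder functions; in LlamaIndex the condense step is performed by the LLM, not by string concatenation:

```typescript
type Message = { role: 'user' | 'assistant'; content: string };

// Trivial stand-in for the LLM rewrite step: fold prior user turns into
// a standalone question.
function condense(history: Message[], followUp: string): string {
  const prior = history.filter(m => m.role === 'user').map(m => m.content);
  return prior.length
    ? `${prior.join(' ')} (follow-up: ${followUp})`
    : followUp;
}

// One chat turn: condense, query, then record both sides in the history.
function chat(
  history: Message[],
  message: string,
  query: (q: string) => string,
): string {
  const standalone = condense(history, message);
  const answer = query(standalone);
  history.push(
    { role: 'user', content: message },
    { role: 'assistant', content: answer },
  );
  return answer;
}

const history: Message[] = [];
const fakeQuery = (q: string) => `answer(${q})`;
chat(history, 'What is LlamaIndex?', fakeQuery);
console.log(chat(history, 'Does it support TypeScript?', fakeQuery));
// answer(What is LlamaIndex? (follow-up: Does it support TypeScript?))
```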
More Examples
create-llama is the easiest way to get started with LlamaIndex. It uses the AI SDK to connect to LlamaIndex in all its generated code.
Learn More