Conceptual guide

This section contains introductions to key parts of LangChain.dart.

Architecture

LangChain.dart as a framework consists of a number of packages.

langchain_core

This package contains base abstractions of different components and ways to compose them together. The interfaces for core components like LLMs, vector stores, retrievers and more are defined here. No third party integrations are defined here.

Depend on this package to build frameworks on top of LangChain.dart or to interoperate with it.

langchain

Contains higher-level and use-case specific chains, agents, and retrieval algorithms that are at the core of the application's cognitive architecture.

Depend on this package to build LLM applications with LangChain.dart.

This package exposes langchain_core so you don't need to depend on it explicitly.

langchain_community

Contains third-party integrations and community-contributed components that are not part of the core LangChain.dart API.

Depend on this package if you want to use any of the integrations or components it provides.

Integration-specific packages

Popular third-party integrations (e.g. langchain_openai, langchain_google, langchain_ollama, etc.) are moved to their own packages so that they can be imported independently without depending on the entire langchain_community package.

Depend on an integration-specific package if you want to use the specific integration.

See the Integrations section to find the package for a specific integration.

LangChain Expression Language (LCEL)

LangChain Expression Language, or LCEL, is a declarative way to easily compose chains together. LCEL was designed from day 1 to support putting prototypes in production, with no code changes, from the simplest “prompt + LLM” chain to the most complex chains (we’ve seen folks successfully run LCEL chains with 100s of steps in production). To highlight a few of the reasons you might want to use LCEL:

  • First-class streaming support: When you build your chains with LCEL you get the best possible time-to-first-token (time elapsed until the first chunk of output comes out). For some chains this means that, e.g., we stream tokens straight from an LLM to a streaming output parser, and you get back parsed, incremental chunks of output at the same rate as the LLM provider outputs the raw tokens.
  • Optimized concurrent execution: Whenever your LCEL chains have steps that can be executed concurrently (e.g. if you fetch documents from multiple retrievers) we automatically do it for the smallest possible latency.
  • Retries and fallbacks: Configure retries and fallbacks for any part of your LCEL chain. This is a great way to make your chains more reliable at scale.
  • Access intermediate results: For more complex chains it’s often very useful to access the results of intermediate steps even before the final output is produced. This can be used to let end-users know something is happening, or even just to debug your chain.
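
For example, a simple "prompt + model + output parser" chain can be composed with the pipe (|) operator. The following is a minimal sketch assuming the langchain_openai integration and the StringOutputParser from langchain_core:

final openaiApiKey = Platform.environment['OPENAI_API_KEY'];
final model = ChatOpenAI(apiKey: openaiApiKey);
final promptTemplate = ChatPromptTemplate.fromTemplate(
  'Tell me a joke about {topic}',
);

// Each component implements Runnable, so they can be piped together into a chain.
final chain = promptTemplate | model | StringOutputParser();

final res = await chain.invoke({'topic': 'bears'});
print(res);
// e.g. "Why don't bears wear shoes? Because they have bear feet!"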

Runnable interface

To make it as easy as possible to create custom chains, LangChain provides a Runnable interface that most components implement, including chat models, LLMs, output parsers, retrievers, prompt templates, and more.

This is a standard interface, which makes it easy to define custom chains as well as invoke them in a standard way. The standard interface includes:

  • invoke: call the chain on an input and return the output.
  • stream: call the chain on an input and stream the output.
  • batch: call the chain on a list of inputs and return a list of outputs.

The type of the input and output varies by component:

Component                 | Input Type           | Output Type
--------------------------|----------------------|---------------------
PromptTemplate            | Map<String, dynamic> | PromptValue
ChatMessagePromptTemplate | Map<String, dynamic> | PromptValue
LLM                       | PromptValue          | LLMResult
ChatModel                 | PromptValue          | ChatResult
OutputParser              | Any object           | Parser output type
Retriever                 | String               | List<Document>
DocumentTransformer       | List<Document>       | List<Document>
Tool                      | Map<String, dynamic> | String
Chain                     | Map<String, dynamic> | Map<String, dynamic>
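
As a sketch, reusing the chain from the LCEL example above (which maps a Map<String, dynamic> input to a String output), the three methods are used like this:

// invoke: a single input -> a single output.
final joke = await chain.invoke({'topic': 'bears'});

// stream: a single input -> a Stream of partial outputs.
await for (final chunk in chain.stream({'topic': 'bears'})) {
  print(chunk);
}

// batch: a list of inputs -> a list of outputs.
final jokes = await chain.batch([
  {'topic': 'bears'},
  {'topic': 'cats'},
]);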

Components

Chat models

Language models that use a sequence of messages as inputs and return chat messages as outputs (as opposed to using plain text). These are traditionally newer models (older models are generally LLMs, see below). Chat models support the assignment of distinct roles to conversation messages, helping to distinguish messages from the AI, users, and instructions such as system messages.

Although the underlying models are messages in, message out, the LangChain wrappers also allow these models to take a string as input. This means you can easily use chat models in place of LLMs.

When a string is passed in as input, it is converted to a HumanMessage and then passed to the underlying model.

LangChain.dart does not host any Chat Models; rather, we rely on third-party integrations.

We have some standardized parameters when constructing ChatModels:

  • model: the name of the model
  • temperature: the sampling temperature
  • timeout: request timeout
  • maxTokens: max tokens to generate
  • apiKey: API key for the model provider
  • baseUrl: endpoint to send requests to
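
As a sketch, with the langchain_openai integration most of these are passed either to the model's constructor (apiKey, baseUrl) or through its default options; the exact placement can vary per integration:

final chatModel = ChatOpenAI(
  apiKey: Platform.environment['OPENAI_API_KEY'],
  defaultOptions: const ChatOpenAIOptions(
    model: 'gpt-4o-mini',
    temperature: 0,
    maxTokens: 256,
  ),
);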

Some important things to note:

  • standard params only apply to model providers that expose parameters with the intended functionality. For example, some providers do not expose a configuration for maximum output tokens, so maxTokens can't be supported on these.
  • standard params are currently only enforced on integrations that have their own integration packages (e.g. langchain_openai, langchain_anthropic, etc.); they're not enforced on models in langchain_community.

ChatModels also accept other parameters that are specific to that integration. To find all the parameters supported by a ChatModel head to the API reference for that model.

LLMs

caution

Pure text-in/text-out LLMs tend to be older or lower-level. Many popular models are best used as chat completion models, even for non-chat use cases.

You are probably looking for the section above instead.

Language models that take a string as input and return a string. These are traditionally older models (newer models generally are Chat Models, see above).

Although the underlying models are string in, string out, the LangChain wrappers also allow these models to take messages as input. This gives them the same interface as Chat Models. When messages are passed in as input, they will be formatted into a string under the hood before being passed to the underlying model.

LangChain.dart does not host any LLMs; rather, we rely on third-party integrations (see Integrations).

Messages

Some language models take a list of messages as input and return a message.

LangChain provides several objects to easily distinguish between different roles:

HumanChatMessage

This represents a message from the user.

AIChatMessage

This represents a message from the model.

SystemChatMessage

This represents a system message, which tells the model how to behave. Not every model provider supports this.

FunctionChatMessage / ToolChatMessage

These represent a decision from a language model to call a tool. They're a subclass of AIChatMessage. FunctionChatMessage is a legacy message type corresponding to OpenAI's legacy function-calling API.
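
A minimal sketch, assuming the ChatMessage factory constructors (ChatMessage.system, ChatMessage.humanText) and PromptValue.chat:

final chatModel = ChatOpenAI(apiKey: openaiApiKey);

final messages = [
  ChatMessage.system('You are a helpful assistant that translates English to French.'),
  ChatMessage.humanText('I love programming.'),
];

final res = await chatModel.invoke(PromptValue.chat(messages));
print(res.output.content);
// "J'adore programmer."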

Prompt Templates

Most LLM applications do not pass user input directly into an LLM. Usually they will add the user input to a larger piece of text, called a prompt template, that provides additional context on the specific task at hand.

Suppose we want the model to generate a company name from a product description. For our application, it would be great if the user only had to provide the description of a company/product, without having to worry about giving the model instructions.

PromptTemplates help with exactly this! They bundle up all the logic for going from user input into a fully formatted prompt. This can start off very simple - for example, a prompt for this task would just be:

final prompt = PromptTemplate.fromTemplate(
  'What is a good name for a company that makes {product}?',
);
final res = prompt.format({'product': 'colorful socks'});
print(res);
// 'What is a good name for a company that makes colorful socks?'

However, the advantages of using these over raw string formatting are several. You can "partial" out variables - e.g. you can format only some of the variables at a time. You can compose them together, easily combining different templates into a single prompt.

For specifics on how to use prompt templates, see the relevant how-to guides here.

PromptTemplates can also be used to produce a list of messages. In this case, the prompt not only contains information about the content, but also about each message (its role, its position in the list, etc.). Most often, a ChatPromptTemplate is a list of ChatMessagePromptTemplates. Each ChatMessagePromptTemplate contains instructions for how to format that ChatMessage - its role, and then also its content. Let's take a look at this below:

const template = 'You are a helpful assistant that translates {input_language} to {output_language}.';
const humanTemplate = '{text}';

final chatPrompt = ChatPromptTemplate.fromTemplates([
  (ChatMessageType.system, template),
  (ChatMessageType.human, humanTemplate),
]);

final res = chatPrompt.formatMessages({
  'input_language': 'English',
  'output_language': 'French',
  'text': 'I love programming.',
});
print(res);
// [
//   SystemChatMessage(content='You are a helpful assistant that translates English to French.'),
//   HumanChatMessage(content='I love programming.')
// ]

ChatPromptTemplates can also be constructed in other ways; see the relevant how-to guides for the details.

Output parsers

note

The information here refers to parsers that take the text output from a model and try to parse it into a more structured representation. More and more models support function (or tool) calling, which handles this automatically. It is recommended to use function/tool calling rather than output parsing. See the documentation for that here.

OutputParsers convert the raw output of an LLM into a format that can be used downstream. There are a few main types of OutputParsers, including:

  • Convert text from LLM -> structured information (e.g. JSON).
  • Convert a ChatMessage into just a string.
  • Convert the extra information returned from a call besides the message (like OpenAI function invocation) into a string.
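
For example, a StringOutputParser can be piped after a chat model so the chain returns a plain String rather than a ChatResult (a sketch, reusing the promptTemplate and chatModel from the examples above):

final chain = promptTemplate | chatModel | StringOutputParser();
final res = await chain.invoke({'topic': 'bears'});
// `res` is a String with the message content, not a ChatResult.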

For full information on this, see the section on output parsers.

Chat history

Most LLM applications have a conversational interface. An essential component of a conversation is being able to refer to information introduced earlier in the conversation. At bare minimum, a conversational system should be able to access some window of past messages directly.

The concept of ChatHistory refers to a class in LangChain which can be used to wrap an arbitrary chain. This ChatHistory will keep track of inputs and outputs of the underlying chain, and append them as messages to a message database. Future interactions will then load those messages and pass them into the chain as part of the input.
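
A minimal sketch, assuming the ChatMessageHistory class and its add/get methods (exact names may differ across versions):

final history = ChatMessageHistory();
await history.addHumanChatMessage('Hi! My name is Bob.');
await history.addAIChatMessage('Hello Bob! How can I help you today?');

// Later interactions load the stored messages and pass them back into the chain.
final messages = await history.getChatMessages();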

Documents

A Document object in LangChain contains information about some data. It has two attributes:

  • pageContent: String - The content of this document.
  • metadata: Map<String,dynamic> - Arbitrary metadata associated with this document. Can track the document id, file name, etc.
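
For example:

final doc = Document(
  pageContent: 'LangChain.dart brings LangChain to the Dart ecosystem.',
  metadata: {'source': 'example.txt', 'page': 1},
);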

Document loaders

Use document loaders to load data from a source as Document's. For example, there are document loaders for loading a simple .txt file, for loading the text contents of any web page, or even for loading a transcript of a YouTube video.

Document loaders expose two methods:

  • lazyLoad(): returns a Stream of Document's. This is useful for loading large amounts of data, as it allows you to process each Document as it is loaded, rather than waiting for the entire data set to be loaded in memory.
  • load(): returns a list of Document's. Under the hood, this method calls lazyLoad() and collects the results into a list. Use this method only with small data sets.

The simplest loader reads in a file as text and places it all into one Document.


const filePath = 'example.txt';
const loader = TextLoader(filePath);
final docs = await loader.load();

Text splitters

Once you've loaded documents, you'll often want to transform them to better suit your application. The simplest example is that you may want to split a long document into smaller chunks that fit into your model's context window. LangChain has a number of built-in document transformers that make it easy to split, combine, filter, and otherwise manipulate documents.


When you want to deal with long pieces of text, it is necessary to split up that text into chunks. As simple as this sounds, there is a lot of potential complexity here. Ideally, you want to keep the semantically related pieces of text together. What "semantically related" means could depend on the type of text. This tutorial showcases several ways to do that.

At a high level, text splitters work as follows:

  1. Split the text up into small, semantically meaningful chunks (often sentences).
  2. Start combining these small chunks into a larger chunk until you reach a certain size (as measured by some function).
  3. Once you reach that size, make that chunk its own piece of text and then start creating a new chunk of text with some overlap (to keep context between chunks).

That means there are two different axes along which you can customize your text splitter:

  1. How the text is split.
  2. How the chunk size is measured.

The most basic text splitter is the CharacterTextSplitter. This splits based on characters (by default \n\n) and measures chunk length by number of characters.

The default recommended text splitter is the RecursiveCharacterTextSplitter. This text splitter takes a list of characters. It tries to create chunks based on splitting on the first character, but if any chunks are too large it then moves onto the next character, and so forth. By default the characters it tries to split on are ["\n\n", "\n", " ", ""].

In addition to controlling which characters you can split on, you can also control a few other things:

  • lengthFunction: how the length of chunks is calculated. Defaults to just counting number of characters, but it's pretty common to pass a token counter here.
  • chunkSize: the maximum size of your chunks (as measured by the length function).
  • chunkOverlap: the maximum overlap between chunks. It can be nice to have some overlap to maintain some continuity between chunks (e.g. using a sliding window).
  • addStartIndex: whether to include the starting position of each chunk within the original document in the metadata.

const filePath = 'state_of_the_union.txt';
const loader = TextLoader(filePath);
final documents = await loader.load();
const textSplitter = RecursiveCharacterTextSplitter(
  chunkSize: 800,
  chunkOverlap: 0,
);
final docs = textSplitter.splitDocuments(documents);

Embedding models

Embedding models create a vector representation of a piece of text. You can think of a vector as an array of numbers that captures the semantic meaning of the text. By representing the text in this way, you can perform mathematical operations that allow you to do things like search for other pieces of text that are most similar in meaning. These natural-language search capabilities underpin many forms of context retrieval, where we provide an LLM with the relevant data it needs to respond effectively to a query.

The Embeddings class is a class designed for interfacing with text embedding models. There are lots of embedding model providers (OpenAI, Cohere, Hugging Face, etc) - this class is designed to provide a standard interface for all of them.

Embeddings create a vector representation of a piece of text. This is useful because it means we can think about text in the vector space, and do things like semantic search where we look for pieces of text that are most similar in the vector space.

The base Embeddings class in LangChain exposes two methods: one for embedding documents and one for embedding a query. The former takes as input multiple texts, while the latter takes a single text. The reason for having these as two separate methods is that some embedding providers have different embedding methods for documents (to be searched over) vs queries (the search query itself).
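
A minimal sketch, assuming the OpenAIEmbeddings integration from langchain_openai and that embedDocuments accepts a list of Document's:

final embeddings = OpenAIEmbeddings(apiKey: openaiApiKey);

// Embed the documents to be indexed and searched over.
final docVectors = await embeddings.embedDocuments([
  Document(pageContent: 'LangChain.dart supports several vector stores.'),
  Document(pageContent: 'Embeddings map text into a vector space.'),
]);

// Embed the search query itself.
final queryVector = await embeddings.embedQuery('Which vector stores are supported?');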

For specifics on how to use embedding models, see the relevant how-to guides here.

Vector stores

One of the most common ways to store and search over unstructured data is to embed it and store the resulting embedding vectors, and then at query time to embed the unstructured query and retrieve the embedding vectors that are 'most similar' to the embedded query. A vector store takes care of storing embedded data and performing vector search for you.

Most vector stores can also store metadata about embedded vectors and support filtering on that metadata before similarity search, allowing you more control over returned documents.
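
A minimal sketch, assuming the in-memory MemoryVectorStore and the OpenAIEmbeddings integration:

final vectorStore = MemoryVectorStore(
  embeddings: OpenAIEmbeddings(apiKey: openaiApiKey),
);

// Embed and store some documents.
await vectorStore.addDocuments(
  documents: [
    Document(pageContent: 'LangChain was created by Harrison'),
    Document(pageContent: 'LangChain.dart is the Dart port of LangChain'),
  ],
);

// Embed the query and return the most similar documents.
final results = await vectorStore.similaritySearch(query: 'Who created LangChain?');
print(results.first.pageContent);
// 'LangChain was created by Harrison'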

Vector stores can be converted to the retriever interface. A minimal sketch, assuming the asRetriever method on the vector store from the example above:
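
final retriever = vectorStore.asRetriever();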

For specifics on how to use vector stores, see the relevant how-to guides here.

Retrievers

A retriever is an interface that returns documents given an unstructured query. It is more general than a vector store. A retriever does not need to be able to store documents, only to return (or retrieve) them. Vector stores can be used as the backbone of a retriever, but there are other types of retrievers as well.

Retrievers accept a string query as input and return a list of Document's as output.

The public API of the BaseRetriever class in LangChain is as follows:

abstract interface class BaseRetriever {
  Future<List<Document>> getRelevantDocuments(final String query);
}
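
For example, using the vector-store-backed retriever from the previous section:

final retriever = vectorStore.asRetriever();
final docs = await retriever.getRelevantDocuments('Who created LangChain?');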

For specifics on how to use retrievers, see the relevant how-to guides here.

Tools

Tools are utilities designed to be called by a model. Their inputs are designed to be generated by models, and their outputs are designed to be passed back to models. Tools are needed whenever you want a model to control parts of your code or call out to external APIs. A tool consists of:

  1. The name of the tool.
  2. A description of what the tool does.
  3. A JSON schema defining the inputs to the tool.
  4. A function (and, optionally, an async variant of the function).

When a tool is bound to a model, the name, description and JSON schema are provided as context to the model. Given a list of tools and a set of instructions, a model can request to call one or more tools with specific inputs.

To define a tool in Dart, we use the ToolSpec class.

final openaiApiKey = Platform.environment['OPENAI_API_KEY'];
final model = ChatOpenAI(apiKey: openaiApiKey);

final promptTemplate = ChatPromptTemplate.fromTemplate(
  'Tell me a joke about {foo}',
);

const tool = ToolSpec(
  name: 'joke',
  description: 'A joke',
  inputJsonSchema: {
    'type': 'object',
    'properties': {
      'setup': {
        'type': 'string',
        'description': 'The setup for the joke',
      },
      'punchline': {
        'type': 'string',
        'description': 'The punchline for the joke',
      },
    },
    'required': ['setup', 'punchline'],
  },
);

final chain = promptTemplate |
    model.bind(
      ChatOpenAIOptions(
        tools: const [tool],
        toolChoice: ChatToolChoice.forced(name: tool.name),
      ),
    );

final res = await chain.invoke({'foo': 'bears'});
print(res);
// ChatResult{
//   id: chatcmpl-9LBPyaZcFMgjmOvkD0JJKAyA4Cihb,
//   output: AIChatMessage{
//     content: ,
//     toolCalls: [
//       AIChatMessageToolCall{
//         id: call_JIhyfu6jdIXaDHfYzbBwCKdb,
//         name: joke,
//         argumentsRaw: {"setup":"Why don't bears like fast food?","punchline":"Because they can't catch it!"},
//         arguments: {
//           setup: Why don't bears like fast food?,
//           punchline: Because they can't catch it!
//         },
//       }
//     ],
//   },
//   finishReason: FinishReason.stop,
//   metadata: {
//     model: gpt-4o-mini,
//     created: 1714835806,
//     system_fingerprint: fp_3b956da36b
//   },
//   usage: LanguageModelUsage{
//     promptTokens: 77,
//     responseTokens: 24,
//     totalTokens: 101
//   },
//   streaming: false
// }

When designing tools to be used by a model, it is important to keep in mind that:

  • Chat models that have explicit tool-calling APIs will be better at tool calling than non-fine-tuned models.
  • Models will perform better if the tools have well-chosen names, descriptions, and JSON schemas. This is another form of prompt engineering.
  • Simple, narrowly scoped tools are easier for models to use than complex tools.

For specifics on how to use tools, see the tools how-to guides.

To use a pre-built tool, see the tool integration docs.

Agents

By themselves, language models can't take actions - they just output text. A big use case for LangChain is creating agents. Agents are systems that use an LLM as a reasoning engine to determine which actions to take and what the inputs to those actions should be. The results of those actions can then be fed back into the agent, which determines whether more actions are needed or whether it is okay to finish.
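
A minimal sketch, assuming the ToolsAgent and AgentExecutor classes and the pre-built CalculatorTool from langchain_community (class names may vary across versions):

final llm = ChatOpenAI(
  apiKey: openaiApiKey,
  defaultOptions: const ChatOpenAIOptions(temperature: 0),
);
final agent = ToolsAgent.fromLLMAndTools(
  llm: llm,
  tools: [CalculatorTool()],
);
final executor = AgentExecutor(agent: agent);

// The agent decides which tool(s) to call and feeds the results back to the model
// until it can produce a final answer.
final res = await executor.run('What is 40 raised to the power of 0.43?');
print(res);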

Callbacks

TODO:

Techniques

Streaming

Function/tool calling

Structured Output

LLMs are capable of generating arbitrary text. This enables the model to respond appropriately to a wide range of inputs, but for some use-cases, it can be useful to constrain the LLM's output to a specific format or structure. This is referred to as structured output.

Few-shot prompting

Retrieval

Text splitting

Evaluation

Tracing