Summarization
Use case​
Suppose you have a set of documents (PDFs, Notion pages, customer questions, etc.) and you want to summarize the content.
LLMs are a great tool for this given their proficiency in understanding and synthesizing text.
In this walkthrough we'll go over how to perform document summarization using LLMs.
Overview​
A central question for building a summarizer is how to pass your documents into the LLM's context window. Two common approaches for this are:
Stuff
: Simply "stuff" all your documents into a single prompt. This is the simplest approach (see here for more on theStuffDocumentsChains
, which is used for this method).Map-reduce
: Summarize each document on it's own in a "map" step and then "reduce" the summaries into a final summary (see here for more on theMapReduceDocumentsChain
, which is used for this method).
Quickstart​
To give you a sneak preview, either pipeline can be wrapped in a single object: load_summarize_chain
.
Suppose we want to summarize a blog post. We can create this in a few lines of code.
First set environment variables and install packages:
pip install openai tiktoken chromadb langchain
# Set env var OPENAI_API_KEY or load from a .env file
# import dotenv
# dotenv.load_dotenv()
We can use chain_type="stuff"
, especially if using larger context window models such as:
- 16k token OpenAI
gpt-3.5-turbo-16k
- 100k token Anthropic Claude-2
We can also supply chain_type="map_reduce"
or chain_type="refine"
(read more here).
from langchain.chat_models import ChatOpenAI
from langchain.document_loaders import WebBaseLoader
from langchain.chains.summarize import load_summarize_chain
loader = WebBaseLoader("https://lilianweng.github.io/posts/2023-06-23-agent/")
docs = loader.load()
llm = ChatOpenAI(temperature=0, model_name="gpt-3.5-turbo-16k")
chain = load_summarize_chain(llm, chain_type="stuff")
chain.run(docs)
API Reference:
'The article discusses the concept of building autonomous agents powered by large language models (LLMs). It explores the components of such agents, including planning, memory, and tool use. The article provides case studies and proof-of-concept examples of LLM-powered agents in various domains. It also highlights the challenges and limitations of using LLMs in agent systems.'
Option 1. Stuff​
When we use load_summarize_chain
with chain_type="stuff"
, we will use the StuffDocumentsChain.
The chain will take a list of documents, inserts them all into a prompt, and passes that prompt to an LLM:
from langchain.chains.llm import LLMChain
from langchain.prompts import PromptTemplate
from langchain.chains.combine_documents.stuff import StuffDocumentsChain
# Define prompt
prompt_template = """Write a concise summary of the following:
"{text}"
CONCISE SUMMARY:"""
prompt = PromptTemplate.from_template(prompt_template)
# Define LLM chain
llm = ChatOpenAI(temperature=0, model_name="gpt-3.5-turbo-16k")
llm_chain = LLMChain(llm=llm, prompt=prompt)
# Define StuffDocumentsChain
stuff_chain = StuffDocumentsChain(
llm_chain=llm_chain, document_variable_name="text"
)
docs = loader.load()
print(stuff_chain.run(docs))
API Reference:
The article discusses the concept of building autonomous agents powered by large language models (LLMs). It explores the components of such agents, including planning, memory, and tool use. The article provides case studies and examples of proof-of-concept demos, highlighting the challenges and limitations of LLM-powered agents. It also includes references to related research papers and provides a citation for the article.
Great! We can see that we reproduce the earlier result using the load_summarize_chain
.
Go deeper​
- You can easily customize the prompt.
- You can easily try different LLMs, (e.g., Claude) via the
llm
parameter.
Option 2. Map-Reduce​
Let's unpack the map reduce approach. For this, we'll first map each document to an individual summary using an LLMChain
. Then we'll use a ReduceDocumentsChain
to combine those summaries into a single global summary.
First, we specfy the LLMChain to use for mapping each document to an individual summary:
from langchain.chains.mapreduce import MapReduceChain
from langchain.text_splitter import CharacterTextSplitter
from langchain.chains import ReduceDocumentsChain, MapReduceDocumentsChain
llm = ChatOpenAI(temperature=0)
# Map
map_template = """The following is a set of documents
{docs}
Based on this list of docs, please identify the main themes
Helpful Answer:"""
map_prompt = PromptTemplate.from_template(map_template)
map_chain = LLMChain(llm=llm, prompt=map_prompt)
We can also use the Prompt Hub to store and fetch prompts.
This will work with your LangSmith API key.
For example, see the map prompt here.
from langchain import hub
map_prompt = hub.pull("rlm/map-prompt")
map_chain = LLMChain(llm=llm, prompt=map_prompt)
The ReduceDocumentsChain
handles taking the document mapping results and reducing them into a single output. It wraps a generic CombineDocumentsChain
(like StuffDocumentsChain
) but adds the ability to collapse documents before passing it to the CombineDocumentsChain
if their cumulative size exceeds token_max
. In this example, we can actually re-use our chain for combining our docs to also collapse our docs.
So if the cumulative number of tokens in our mapped documents exceeds 4000 tokens, then we'll recursively pass in the documents in batches of < 4000 tokens to our StuffDocumentsChain
to create batched summaries. And once those batched summaries are cumulatively less than 4000 tokens, we'll pass them all one last time to the StuffDocumentsChain
to create the final summary.
# Reduce
reduce_template = """The following is set of summaries:
{doc_summaries}
Take these and distill it into a final, consolidated summary of the main themes.
Helpful Answer:"""
reduce_prompt = PromptTemplate.from_template(reduce_template)
# Note we can also get this from the prompt hub, as noted above
reduce_prompt = hub.pull("rlm/map-prompt")
# Run chain
reduce_chain = LLMChain(llm=llm, prompt=reduce_prompt)
# Takes a list of documents, combines them into a single string, and passes this to an LLMChain
combine_documents_chain = StuffDocumentsChain(
llm_chain=reduce_chain, document_variable_name="doc_summaries"
)
# Combines and iteravely reduces the mapped documents
reduce_documents_chain = ReduceDocumentsChain(
# This is final chain that is called.
combine_documents_chain=combine_documents_chain,
# If documents exceed context for `StuffDocumentsChain`
collapse_documents_chain=combine_documents_chain,
# The maximum number of tokens to group documents into.
token_max=4000,
)
Combining our map and reduce chains into one:
# Combining documents by mapping a chain over them, then combining results
map_reduce_chain = MapReduceDocumentsChain(
# Map chain
llm_chain=map_chain,
# Reduce chain
reduce_documents_chain=reduce_documents_chain,
# The variable name in the llm_chain to put the documents in
document_variable_name="docs",
# Return the results of the map steps in the output
return_intermediate_steps=False,
)
text_splitter = CharacterTextSplitter.from_tiktoken_encoder(
chunk_size=1000, chunk_overlap=0
)
split_docs = text_splitter.split_documents(docs)
Created a chunk of size 1003, which is longer than the specified 1000
print(map_reduce_chain.run(split_docs))
The main themes identified in the provided set of documents are:
1. LLM-powered autonomous agent systems: The documents discuss the concept of building autonomous agents with large language models (LLMs) as the core controller. They explore the potential of LLMs beyond content generation and present them as powerful problem solvers.
2. Components of the agent system: The documents outline the key components of LLM-powered agent systems, including planning, memory, and tool use. Each component is described in detail, highlighting its role in enhancing the agent's capabilities.
3. Planning and task decomposition: The planning component focuses on task decomposition and self-reflection. The agent breaks down complex tasks into smaller subgoals and learns from past actions to improve future results.
4. Memory and learning: The memory component includes short-term memory for in-context learning and long-term memory for retaining and recalling information over extended periods. The use of external vector stores for fast retrieval is also mentioned.
5. Tool use and external APIs: The agent learns to utilize external APIs for accessing additional information, code execution, and proprietary sources. This enhances the agent's knowledge and problem-solving abilities.
6. Case studies and proof-of-concept examples: The documents provide case studies and examples to demonstrate the application of LLM-powered agents in scientific discovery, generative simulations, and other domains. These examples serve as proof-of-concept for the effectiveness of the agent system.
7. Challenges and limitations: The documents mention challenges associated with building LLM-powered autonomous agents, such as the limitations of finite context length, difficulties in long-term planning, and reliability issues with natural language interfaces.
8. Citation and references: The documents include a citation and reference section for acknowledging the sources and inspirations for the concepts discussed.
Overall, the main themes revolve around the development and capabilities of LLM-powered autonomous agent systems, including their components, planning and task decomposition, memory and learning mechanisms, tool use and external APIs, case studies and proof-of-concept examples, challenges and limitations, and the importance of proper citation and references.
Go deeper​
Customization
- As shown above, you can customize the LLMs and prompts for map and reduce stages.
Real-world use-case
- See this blog post case-study on analyzing user interactions (questions about LangChain documentation)!
- The blog post and associated repo also introduce clustering as a means of summarization.
- This opens up a third path beyond the
stuff
ormap-reduce
approaches that is worth considering.
Option 3. Refine​
Refine is similar to map-reduce:
The refine documents chain constructs a response by looping over the input documents and iteratively updating its answer. For each document, it passes all non-document inputs, the current document, and the latest intermediate answer to an LLM chain to get a new answer.
This can be easily run with the chain_type="refine"
specified.
chain = load_summarize_chain(llm, chain_type="refine")
chain.run(split_docs)
'The GPT-Engineer project aims to create a repository of code for specific tasks specified in natural language. It involves breaking down tasks into smaller components and seeking clarification from the user when needed. The project emphasizes the importance of implementing every detail of the architecture as code and provides guidelines for file organization, code structure, and dependencies. However, there are challenges in long-term planning and task decomposition, as well as the reliability of the natural language interface. The system has limited communication bandwidth and struggles to adjust plans when faced with unexpected errors. The reliability of model outputs is questionable, as formatting errors and rebellious behavior can occur. The conversation also includes instructions for writing the code, including laying out the core classes, functions, and methods, and providing the code in a markdown code block format. The user is reminded to ensure that the code is fully functional and follows best practices for file naming, imports, and types. The project is powered by LLM (Large Language Models) and incorporates prompting techniques from various research papers.'
It's also possible to supply a prompt and return intermediate steps.
prompt_template = """Write a concise summary of the following:
{text}
CONCISE SUMMARY:"""
prompt = PromptTemplate.from_template(prompt_template)
refine_template = (
"Your job is to produce a final summary\n"
"We have provided an existing summary up to a certain point: {existing_answer}\n"
"We have the opportunity to refine the existing summary"
"(only if needed) with some more context below.\n"
"------------\n"
"{text}\n"
"------------\n"
"Given the new context, refine the original summary in Italian"
"If the context isn't useful, return the original summary."
)
refine_prompt = PromptTemplate.from_template(refine_template)
chain = load_summarize_chain(
llm=llm,
chain_type="refine",
question_prompt=prompt,
refine_prompt=refine_prompt,
return_intermediate_steps=True,
input_key="input_documents",
output_key="output_text",
)
result = chain({"input_documents": split_docs}, return_only_outputs=True)
print(result["output_text"])
L'articolo discute il concetto di costruire agenti autonomi utilizzando LLM (large language model) come controller principale. Esplora i diversi componenti di un sistema di agenti alimentato da LLM, inclusa la pianificazione, la memoria e l'uso di strumenti. Dimostrazioni di concetto come AutoGPT mostrano la possibilità di creare agenti autonomi con LLM come controller principale. Approcci come Chain of Thought, Tree of Thoughts, LLM+P, ReAct e Reflexion consentono agli agenti autonomi di pianificare, riflettere su se stessi e migliorare iterativamente. Tuttavia, ci sono sfide legate alla lunghezza del contesto, alla pianificazione a lungo termine e alla decomposizione delle attività . Inoltre, l'affidabilità dell'interfaccia di linguaggio naturale tra LLM e componenti esterni come la memoria e gli strumenti è incerta. Nonostante ciò, l'uso di LLM come router per indirizzare le richieste ai moduli esperti più adatti è stato proposto come architettura neuro-simbolica per agenti autonomi nel sistema MRKL. L'articolo fa riferimento a diverse pubblicazioni che approfondiscono l'argomento, tra cui Chain of Thought, Tree of Thoughts, LLM+P, ReAct, Reflexion, e MRKL Systems.
print("\n\n".join(result["intermediate_steps"][:3]))
This article discusses the concept of building autonomous agents using LLM (large language model) as the core controller. The article explores the different components of an LLM-powered agent system, including planning, memory, and tool use. It also provides examples of proof-of-concept demos and highlights the potential of LLM as a general problem solver.
Questo articolo discute del concetto di costruire agenti autonomi utilizzando LLM (large language model) come controller principale. L'articolo esplora i diversi componenti di un sistema di agenti alimentato da LLM, inclusa la pianificazione, la memoria e l'uso degli strumenti. Vengono anche forniti esempi di dimostrazioni di proof-of-concept e si evidenzia il potenziale di LLM come risolutore generale di problemi. Inoltre, vengono presentati approcci come Chain of Thought, Tree of Thoughts, LLM+P, ReAct e Reflexion che consentono agli agenti autonomi di pianificare, riflettere su se stessi e migliorare iterativamente.
Questo articolo discute del concetto di costruire agenti autonomi utilizzando LLM (large language model) come controller principale. L'articolo esplora i diversi componenti di un sistema di agenti alimentato da LLM, inclusa la pianificazione, la memoria e l'uso degli strumenti. Vengono anche forniti esempi di dimostrazioni di proof-of-concept e si evidenzia il potenziale di LLM come risolutore generale di problemi. Inoltre, vengono presentati approcci come Chain of Thought, Tree of Thoughts, LLM+P, ReAct e Reflexion che consentono agli agenti autonomi di pianificare, riflettere su se stessi e migliorare iterativamente. Il nuovo contesto riguarda l'approccio Chain of Hindsight (CoH) che permette al modello di migliorare autonomamente i propri output attraverso un processo di apprendimento supervisionato. Viene anche presentato l'approccio Algorithm Distillation (AD) che applica lo stesso concetto alle traiettorie di apprendimento per compiti di reinforcement learning.