Chat History Memory Module (ChatMemoryModule)¶
The Chat History Memory Module is designed to store and retrieve past conversation rounds, enabling your AI assistant to maintain context across interactions.
Purpose¶
Store previous conversation rounds to maintain context and provide coherent, contextually-aware responses.
Parameters¶
- n_rounds (optional): Maximum number of conversation rounds to retrieve. Defaults to 3 if not specified. Can be overridden during retrieval. Must be a positive integer.
- max_context (optional): Maximum total character count for retrieved conversation history. Defaults to 10,000 characters if not specified. Can be overridden during retrieval. Must be a positive integer.
Example¶
# To customize ChatMemoryModule, define memory_config and include the module under memory_modules.
# Otherwise, omit this block to use the defaults.
memory_config:
  memory_modules:
    - memory_name: chat_history       # Required. Unique identifier for this memory module.
      memory_class: ChatMemoryModule  # Required. Memory module class.
      config:                         # Optional. Per-module configuration.
        n_rounds: 3                   # Optional. Maximum number of conversation rounds. Must be a positive integer. Defaults to 3.
        max_context: 8000             # Optional. Maximum total character count. Must be a positive integer. Defaults to 10,000.
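Because both limits can be overridden at retrieval time, it may help to see how a call-time value takes precedence over the configured one. The sketch below is a hypothetical illustration in plain Python; resolve_limits and its signature are assumptions made for this example, not part of the module's documented API.

# Hypothetical sketch: call-time overrides take precedence over configured values, then built-in defaults.
DEFAULTS = {"n_rounds": 3, "max_context": 10_000}

def resolve_limits(config, n_rounds=None, max_context=None):
    """Return (n_rounds, max_context), preferring call-time overrides over configured values."""
    rounds = n_rounds if n_rounds is not None else config.get("n_rounds", DEFAULTS["n_rounds"])
    context = max_context if max_context is not None else config.get("max_context", DEFAULTS["max_context"])
    if rounds <= 0 or context <= 0:
        raise ValueError("n_rounds and max_context must be positive integers")
    return rounds, context

# With the example config above, an override of n_rounds=5 at retrieval time yields (5, 8000):
print(resolve_limits({"n_rounds": 3, "max_context": 8000}, n_rounds=5))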
Understanding Conversation Rounds¶
A conversation round is a fundamental concept in chat memory management. It represents a complete interaction cycle:
What is a Round?
- One round = One user message + All subsequent agent/assistant responses before the next user message
- Rounds help organize conversation history into logical interaction units
- Each round starts with a user role message and includes all following messages until the next user message
Visual Example:
Round 1:
user: "What is the weather today?"
assistant: "Let me check the weather for you."
weather_agent: "It's sunny and 72°F."
Round 2:
user: "Should I bring an umbrella?"
assistant: "Based on the sunny weather, you won't need an umbrella today."
Round 3:
user: "Thanks!"
assistant: "You're welcome! Have a great day!"
In this example:
- Round 1 contains 3 messages (1 user + 2 agent responses)
- Round 2 contains 2 messages (1 user + 1 agent response)
- Round 3 contains 2 messages (1 user + 1 agent response)
- Total: 3 rounds with 7 messages
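The grouping rule can be expressed in a few lines of Python. This is a minimal sketch of the round concept only, not the module's actual implementation; the message format (dicts with role and content keys) is an assumption for illustration.

# Sketch only: split a flat message list into rounds, one round per user message.
def group_into_rounds(messages):
    rounds = []
    for message in messages:
        if message["role"] == "user" or not rounds:
            rounds.append([])          # a new round starts at every user message
        rounds[-1].append(message)
    return rounds

conversation = [
    {"role": "user", "content": "What is the weather today?"},
    {"role": "assistant", "content": "Let me check the weather for you."},
    {"role": "weather_agent", "content": "It's sunny and 72°F."},
    {"role": "user", "content": "Should I bring an umbrella?"},
    {"role": "assistant", "content": "Based on the sunny weather, you won't need an umbrella today."},
    {"role": "user", "content": "Thanks!"},
    {"role": "assistant", "content": "You're welcome! Have a great day!"},
]

rounds = group_into_rounds(conversation)
print(len(rounds))               # 3 rounds
print([len(r) for r in rounds])  # [3, 2, 2] messages per round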
Why Do Rounds Matter?
- When you set n_rounds=2, you retrieve the last 2 complete interaction cycles (not 2 individual messages); see the sketch after this list
- Rounds preserve the context of multi-agent conversations
- Memory limits like n_rounds=5 mean "keep the last 5 user interactions and all their responses"
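Selecting the most recent rounds is then just a slice over the grouped list. Again, a rough sketch rather than the module's retrieval code; the role-label lists below stand in for full messages.

# Sketch only: with n_rounds=2, the last 2 complete rounds are kept, not the last 2 messages.
def last_n_rounds(rounds, n_rounds=3):
    return [message for round_ in rounds[-n_rounds:] for message in round_]

# Rounds 1-3 from the visual example above, reduced to role labels for brevity:
rounds = [
    ["user", "assistant", "weather_agent"],  # Round 1
    ["user", "assistant"],                   # Round 2
    ["user", "assistant"],                   # Round 3
]

print(last_n_rounds(rounds, n_rounds=2))  # Rounds 2 and 3: ['user', 'assistant', 'user', 'assistant']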
Understanding Character Limits¶
The chat history memory system manages conversation context using character-based limits (not token-based). When retrieving memory, you can control how much history is returned using the n_rounds parameter (limits the number of conversation rounds) and the max_context parameter (limits the total character count). See Parameters for default values and how to override them.
How Chat History Truncation Works¶
When the conversation history exceeds the specified limits, the system automatically manages the content:
- Oldest-First Dropping: When multiple rounds don't fit within the character limit, older conversation rounds are dropped first
- Front Truncation: If even a single round exceeds the character limit, the system keeps the most recent characters from that round, truncating from the beginning
- Truncation Notice: When content is truncated, a notice is automatically prepended: "Notice: Chat history truncated due to maximum context window. "
- Priority: More recent content is always prioritized to maintain the most relevant context (see the sketch after this list)
Handling Single Large Messages¶
When a single message exceeds the max_context limit, special truncation logic applies to preserve the most recent and relevant information:
How It Works:
- The system first reserves space for the truncation notice (~65 characters)
- Calculates the remaining budget: effective_budget = max_context - notice_length
- If multiple messages exist in the round, older messages are dropped first
- If only one message remains and still exceeds the limit, front truncation is applied (sketched below):
  - The beginning of the message is removed
  - The last N characters are kept (where N = effective_budget)
  - The truncation notice is prepended to the kept portion
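Front truncation of a single oversized message can be sketched as follows, under the same character-based assumptions; this is a minimal illustration, not the module's implementation.

# Sketch only: keep the most recent characters of one oversized message.
TRUNCATION_NOTICE = "Notice: Chat history truncated due to maximum context window. "

def truncate_single_message(message, max_context):
    """Assumes max_context is larger than the notice itself (~65 characters)."""
    if len(message) <= max_context:
        return message                                        # fits as-is, nothing to truncate
    effective_budget = max_context - len(TRUNCATION_NOTICE)   # reserve space for the notice
    return TRUNCATION_NOTICE + message[-effective_budget:]    # drop the beginning, keep the tail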
Visual Example:
Suppose you have max_context=1000 and a single message with 2000 characters:
Original message (2000 chars):
"The 2022 FIFA World Cup in Qatar featured 32 teams competing across multiple stages.
[...middle content...]
Argentina ultimately defeated France in a dramatic penalty shootout to claim the title."
After truncation (fits within 1000 chars):
"Notice: Chat history truncated due to maximum context window. ...across multiple stages.
The knockout rounds featured upsets, with Morocco reaching the semi-finals. Argentina
ultimately defeated France in a dramatic penalty shootout to claim the title."
(Truncation notice: ~65 characters; preserved content: the last ~935 characters of the original message.)
The beginning is removed, but the conclusion and outcome are preserved.