Chat History Memory Module (ChatMemoryModule)¶
The Chat History Memory Module is designed to store and retrieve past conversation rounds, enabling your AI assistant to maintain context across interactions.
Purpose¶
Store previous conversation rounds to maintain context and provide coherent, contextually-aware responses.
Parameters¶
- n_rounds (optional): Maximum number of conversation rounds to retrieve. Defaults to 3 if not specified. Can be overridden during retrieval. Must be a positive integer.
- max_context (optional): Maximum total character count for retrieved conversation history. Defaults to 10,000 characters if not specified. Can be overridden during retrieval. Must be a positive integer.
Example¶
# To customize ChatMemoryModule, define memory_config and include the module under memory_modules.
# Otherwise, omit this block to use the defaults.
memory_config:
  memory_modules:
    - memory_name: chat_history       # Required. Unique identifier for this memory module.
      memory_class: ChatMemoryModule  # Required. Memory module class.
      config:                         # Optional. Per-module configuration.
        n_rounds: 3                   # Optional. Maximum number of conversation rounds. Must be a positive integer. Defaults to 3.
        max_context: 8000             # Optional. Maximum total character count. Must be a positive integer. Defaults to 10,000.
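Because both limits can be overridden at retrieval time, it may help to see how a call-time value takes precedence over the configured one. The sketch below is a hypothetical illustration in plain Python; resolve_limits and its signature are assumptions made for this example, not part of the module's documented API.

# Hypothetical sketch: call-time overrides take precedence over configured values, then built-in defaults.
DEFAULTS = {"n_rounds": 3, "max_context": 10_000}

def resolve_limits(config, n_rounds=None, max_context=None):
    """Return (n_rounds, max_context), preferring call-time overrides over configured values."""
    rounds = n_rounds if n_rounds is not None else config.get("n_rounds", DEFAULTS["n_rounds"])
    context = max_context if max_context is not None else config.get("max_context", DEFAULTS["max_context"])
    if rounds <= 0 or context <= 0:
        raise ValueError("n_rounds and max_context must be positive integers")
    return rounds, context

# With the example config above, an override of n_rounds=5 at retrieval time yields (5, 8000):
print(resolve_limits({"n_rounds": 3, "max_context": 8000}, n_rounds=5))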
Understanding Conversation Rounds¶
A conversation round is a fundamental concept in chat memory management. It represents a complete interaction cycle:
What is a Round?
- One round = One user message + All subsequent agent/assistant responses before the next user message
- Rounds help organize conversation history into logical interaction units
- Each round starts with a user role message and includes all following messages until the next user message
Visual Example:
Round 1:
user: "What is the weather today?"
assistant: "Let me check the weather for you."
weather_agent: "It's sunny and 72°F."
Round 2:
user: "Should I bring an umbrella?"
assistant: "Based on the sunny weather, you won't need an umbrella today."
Round 3:
user: "Thanks!"
assistant: "You're welcome! Have a great day!"
In this example:
- Round 1 contains 3 messages (1 user + 2 agent responses)
- Round 2 contains 2 messages (1 user + 1 agent response)
- Round 3 contains 2 messages (1 user + 1 agent response)
- Total: 3 rounds with 7 messages
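The grouping rule can be expressed in a few lines of Python. This is a minimal sketch of the round concept only, not the module's actual implementation; the message format (dicts with role and content keys) is an assumption for illustration.

# Sketch only: split a flat message list into rounds, one round per user message.
def group_into_rounds(messages):
    rounds = []
    for message in messages:
        if message["role"] == "user" or not rounds:
            rounds.append([])          # a new round starts at every user message
        rounds[-1].append(message)
    return rounds

conversation = [
    {"role": "user", "content": "What is the weather today?"},
    {"role": "assistant", "content": "Let me check the weather for you."},
    {"role": "weather_agent", "content": "It's sunny and 72°F."},
    {"role": "user", "content": "Should I bring an umbrella?"},
    {"role": "assistant", "content": "Based on the sunny weather, you won't need an umbrella today."},
    {"role": "user", "content": "Thanks!"},
    {"role": "assistant", "content": "You're welcome! Have a great day!"},
]

rounds = group_into_rounds(conversation)
print(len(rounds))               # 3 rounds
print([len(r) for r in rounds])  # [3, 2, 2] messages per round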
Why Do Rounds Matter?
- When you set n_rounds=2, you retrieve the last 2 complete interaction cycles (not 2 individual messages); see the sketch after this list
- Rounds preserve the context of multi-agent conversations
- Memory limits like n_rounds=5 mean "keep the last 5 user interactions and all their responses"
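Selecting the most recent rounds is then just a slice over the grouped list. Again, a rough sketch rather than the module's retrieval code; the role-label lists below stand in for full messages.

# Sketch only: with n_rounds=2, the last 2 complete rounds are kept, not the last 2 messages.
def last_n_rounds(rounds, n_rounds=3):
    return [message for round_ in rounds[-n_rounds:] for message in round_]

# Rounds 1-3 from the visual example above, reduced to role labels for brevity:
rounds = [
    ["user", "assistant", "weather_agent"],  # Round 1
    ["user", "assistant"],                   # Round 2
    ["user", "assistant"],                   # Round 3
]

print(last_n_rounds(rounds, n_rounds=2))  # Rounds 2 and 3: ['user', 'assistant', 'user', 'assistant']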
Understanding Character Limits¶
The chat history memory system manages conversation context using character-based limits (not token-based). When retrieving memory, you can control how much history is returned using the n_rounds parameter (limits the number of conversation rounds) and the max_context parameter (limits the total character count). See Parameters for default values and how to override them.
How Chat History Truncation Works¶
When the conversation history exceeds the specified limits, the system automatically manages the content:
- Oldest-First Dropping: When multiple rounds don't fit within the character limit, older conversation rounds are dropped first
- Front Truncation: If even a single round exceeds the character limit, the system keeps the most recent characters from that round, truncating from the beginning
- Truncation Notice: When content is truncated, a notice is automatically prepended: "Notice: Chat history truncated due to maximum context window. "
- Priority: More recent content is always prioritized to maintain the most relevant context (see the sketch after this list)
Handling Single Large Messages¶
When a single message exceeds the max_context limit, special truncation logic applies to preserve the most recent and relevant information:
How It Works:
- The system first reserves space for the truncation notice (~65 characters)
- Calculates the remaining budget: effective_budget = max_context - notice_length
- If multiple messages exist in the round, older messages are dropped first
- If only one message remains and still exceeds the limit, front truncation is applied (sketched below):
  - The beginning of the message is removed
  - The last N characters are kept (where N = effective_budget)
  - The truncation notice is prepended to the kept portion
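Front truncation of a single oversized message can be sketched as follows, under the same character-based assumptions; this is a minimal illustration, not the module's implementation.

# Sketch only: keep the most recent characters of one oversized message.
TRUNCATION_NOTICE = "Notice: Chat history truncated due to maximum context window. "

def truncate_single_message(message, max_context):
    """Assumes max_context is larger than the notice itself (~65 characters)."""
    if len(message) <= max_context:
        return message                                        # fits as-is, nothing to truncate
    effective_budget = max_context - len(TRUNCATION_NOTICE)   # reserve space for the notice
    return TRUNCATION_NOTICE + message[-effective_budget:]    # drop the beginning, keep the tail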
Visual Example:
Suppose you have max_context=1000 and a single message with 2000 characters:
Original message (2000 chars):
"The 2022 FIFA World Cup in Qatar featured 32 teams competing across multiple stages.
[...middle content...]
Argentina ultimately defeated France in a dramatic penalty shootout to claim the title."
After truncation (fits within 1000 chars):
"Notice: Chat history truncated due to maximum context window. ...across multiple stages.
The knockout rounds featured upsets, with Morocco reaching the semi-finals. Argentina
ultimately defeated France in a dramatic penalty shootout to claim the title."
(Truncation notice: ~65 characters; preserved content: the last ~935 characters of the original message.)
The beginning is removed, but the conclusion and outcome are preserved.