microsoft/llmlingua-2-bert-base-multilingual-cased-meetingbank
Model Information
The microsoft/llmlingua-2-bert-base-multilingual-cased-meetingbank model is part of the LLMLingua-2 framework and is optimized for prompt compression in meeting summarization and related tasks. It uses token-level importance prediction to preserve critical content while reducing input length by approximately 45%, enabling more efficient use of large language models.
- Model Developer: Microsoft
- Model Release Date: April 2024
- Supported Languages: English, Spanish, German, French, Chinese, Arabic, Russian, Japanese, Korean, Portuguese
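For orientation, below is a minimal usage sketch with the llmlingua Python package (`pip install llmlingua`), which wraps this checkpoint for compression. The sample prompt and the `rate` value are illustrative, not taken from this card.

```python
from llmlingua import PromptCompressor

# Load the LLMLingua-2 compressor backed by this checkpoint.
compressor = PromptCompressor(
    model_name="microsoft/llmlingua-2-bert-base-multilingual-cased-meetingbank",
    use_llmlingua2=True,
)

# Illustrative meeting-transcript snippet.
prompt = (
    "Speaker 1: Good morning everyone, let's begin the budget review. "
    "Speaker 2: The quarterly numbers show a three percent increase in operating costs..."
)

# rate=0.55 keeps roughly 55% of tokens, i.e. about the ~45% reduction cited above.
result = compressor.compress_prompt(prompt, rate=0.55)
print(result["compressed_prompt"])
print(result["origin_tokens"], "->", result["compressed_tokens"])
```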
Model Architecture
- Base Model: BERT-base-multilingual-cased
- Architecture Type: Transformer encoder
- Layers: 12
- Hidden Size: 768
- Attention Heads: 12
- Parameters: ~110M
- Training Objective: Token classification for prompt compression
- Compression Metric: Probability of token preservation (p_preserve), sketched below
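To make the token-classification objective concrete, here is a minimal sketch of scoring tokens with this checkpoint via Hugging Face transformers. It assumes the checkpoint loads as a token-classification model and that label index 1 is the "preserve" class (both assumptions); the sample sentence and the 0.55 keep ratio are illustrative.

```python
import torch
from transformers import AutoModelForTokenClassification, AutoTokenizer

name = "microsoft/llmlingua-2-bert-base-multilingual-cased-meetingbank"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForTokenClassification.from_pretrained(name)
model.eval()

text = "The committee met on Tuesday to review the quarterly budget proposal in detail."
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits  # shape: (1, seq_len, num_labels)

# p_preserve: per-token probability of the "preserve" class
# (assumes label index 1 corresponds to "preserve").
p_preserve = torch.softmax(logits, dim=-1)[0, :, 1]

# Keep the ~55% highest-scoring tokens (~45% reduction),
# restoring original token order before decoding.
n_keep = max(1, int(0.55 * p_preserve.numel()))
keep_idx = p_preserve.topk(n_keep).indices.sort().values
kept_ids = inputs["input_ids"][0, keep_idx]
print(tokenizer.decode(kept_ids, skip_special_tokens=True))
```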
Benchmark Scores
| Task | Metric | Full Prompt | Compressed Prompt |
|---|---|---|---|
| Summarization | ROUGE-L | 43.1 | 42.8 |
| QA | EM / F1 | 67.2 / 81.6 | 66.7 / 81.0 |
| XQuAD (11 langs) | EM average | 70.5 | 70.0 |
| Compression Rate | Token reduction | 0% | ~45% |
| Translation | BLEU | 31.2 | 30.9 |
Benchmarks were evaluated on CNN/DailyMail, HotpotQA, XQuAD, and WMT En-De.