Why SillyTavern’s Summarize Extension Simply Isn’t Enough
When I first started using SillyTavern, I was truly taken aback by the number of options presented to the user compared to the broader market of AI chatbot websites. The flexibility to tweak almost every aspect of your experience is something that very few services offer (one of the closer ones being DreamGen). However, as my stories grew larger and my token counts grew heftier (as will yours!), I began to research ways to cut back on token usage in order to preserve my context. That’s when I switched to the Message Summarize extension by Qvink. It has proven to be an invaluable tool for preserving memory over long, larger-than-life narratives. Below are some brief explanations, as well as an in-depth guide to getting started. Let’s begin.
Where SillyTavern’s Summarizer Falls Short
SillyTavern’s default summarization extension comes with one key caveat: if a key character detail or nuance is established mid-story, the summarizer may drop it entirely. This is because it takes the entirety of your story and compresses it into a single, long summary, which makes it easy to miss specifics such as character nuances or events. Unfortunately, the issue is exacerbated by the “Lost in the Middle” phenomenon: LLMs tend to prioritize information at the beginning and at the end of the prompt. As a result, key story details in the middle are more likely to be missed. (Imagine only eating the buns off a burger!) Luckily, Qvink’s Message Summarize extension takes a different approach to solve this problem.
How Qvink Solves the Problem
The way Qvink’s extension approaches this memory dilemma is different, and much more intuitive. Instead of summarizing the entirety of the story, it summarizes individual messages. This makes it much more accurate at remembering key details and character nuances. In fact, if you’ve used SpicyChat’s proprietary “Semantic Memory”, this will sound very familiar, as it is similar in principle. Memories produced this way are granular and easy to reshape. On top of that, you no longer need to re-summarize your entire story to capture key details; you only have to change the summary of the specific message the events occurred in. You also have control over the frequency at which summarization occurs. Most users opt to summarize every 2-4 messages. Doing it this way increases quality and accuracy while sending fewer API calls.

As you can see above, an entire paragraph was summarized into 1-3 sentences, emphasizing the important takeaways. This reduces token usage and allows context to be allocated to more important things, such as lorebooks. You can also edit this memory directly, adding or removing details as desired.
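To make the per-message idea concrete, here is a minimal Python sketch. It is not the extension's actual code. Every name in it (toy_summarize, Message, Chat, batch_size, keep_recent) is made up for illustration, and the "summarizer" simply keeps the first sentence instead of calling an LLM. It only shows the shape of the approach: each message carries its own short memory, summarization runs once enough unsummarized messages pile up, and older messages enter the context as their summaries while the most recent ones stay in full.

```python
from dataclasses import dataclass, field
from typing import List, Optional


def toy_summarize(text: str) -> str:
    """Stand-in for an LLM call (assumption): keep only the first sentence."""
    first = text.split(". ")[0].strip()
    return first if first.endswith(".") else first + "."


@dataclass
class Message:
    text: str
    summary: Optional[str] = None  # per-message memory, editable by hand


@dataclass
class Chat:
    messages: List[Message] = field(default_factory=list)
    batch_size: int = 3   # hypothetical: summarize once this many messages are pending
    keep_recent: int = 2  # hypothetical: newest messages are sent in full

    def add(self, text: str) -> None:
        self.messages.append(Message(text))
        pending = [m for m in self.messages if m.summary is None]
        if len(pending) >= self.batch_size:
            # Each message gets its own summary; nothing is merged together.
            for m in pending:
                m.summary = toy_summarize(m.text)

    def build_context(self) -> str:
        cutoff = max(len(self.messages) - self.keep_recent, 0)
        lines = []
        for i, m in enumerate(self.messages):
            # Older messages use their short memory when one exists.
            use_summary = i < cutoff and m.summary is not None
            lines.append(m.summary if use_summary else m.text)
        return "\n".join(lines)


if __name__ == "__main__":
    chat = Chat(batch_size=3, keep_recent=1)
    chat.add("Mira hides the silver key in her boot. She tells no one.")
    chat.add("The guard waves them through. He looks bored.")
    chat.add("They reach the vault. It is locked.")
    # Fix a single memory without touching any other message.
    chat.messages[0].summary = "Mira hides the silver key in her left boot."
    print(chat.build_context())
```

Notice that correcting one detail means editing one message's summary. With a single story-wide summary, the same fix would mean regenerating or hand-editing the whole thing.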
Setting up Qvink’s Message Summarize
The setup for Qvink is actually quite straightforward. Let’s get started.
- Download Qvink’s MessageSummarize extension here.