The Curse of Forgetting

Why SillyTavern’s Summarize Extension Simply Isn’t Enough

When I first started using SillyTavern, I was truly taken aback by the number of options it offers compared to the broader market of AI chatbot websites. The flexibility to tweak almost every aspect of your experience is something very few services offer (DreamGen being one of the closer ones). However, as my stories grew longer and my token counts grew heftier (as will yours!), I began to research ways to cut back on token usage in order to preserve my context. That’s when I switched to the Message Summarize extension by qvink. It has proven to be an invaluable tool when it comes to weaving long, larger-than-life narratives.

Where SillyTavern’s Summarizer Falls Short

SillyTavern’s default summarization extension comes with a few caveats. The most prominent one is its tendency to miss important details in your story, which is partly a result of its approach: it takes the entirety of your story and compresses it into a single, long summary. Specific character nuances and events tend to get lost in that compression. This is further exacerbated by the “Lost in the Middle” phenomenon, in which an LLM tends to prioritize information at the beginning and end of the prompt while overlooking details in the middle. Qvink’s Message Summarize takes a different approach.

How Qvink Solves the Problem

Qvink’s extension approaches this memory dilemma differently, and much more intuitively. Instead of summarizing the entire story so far, it summarizes each individual message. This makes it more accurate and less likely to miss important details. The result is memory that is granular and easy to reshape: when you edit or re-summarize a specific message, only that message’s summary changes. This really shines in long narratives! Instead of re-summarizing your entire prompt, you only have to update the event itself. One thing to note is that you don’t have to summarize after every message. Rather, most users opt to summarize every 2-4 messages. This still preserves quality and accuracy while sending fewer API calls, which is especially useful when using a paid LLM.
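
To make the per-message idea concrete, here is a minimal Python sketch of how such a memory could be modeled. This is not the extension's actual code or API: the names PerMessageMemory and stub_summarize are hypothetical, and the "summarizer" is a stand-in that just keeps the first sentence instead of calling an LLM. The point is only to show that editing one message triggers one new summary and leaves every other summary untouched.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


def stub_summarize(text: str) -> str:
    """Hypothetical stand-in for an LLM call: keep only the first sentence."""
    return text.split(".")[0].strip() + "."


@dataclass
class PerMessageMemory:
    """Illustrative model of per-message summaries (not the extension's API)."""

    summarize: Callable[[str], str]
    messages: List[str] = field(default_factory=list)
    summaries: Dict[int, str] = field(default_factory=dict)
    api_calls: int = 0  # counts how often the summarizer was invoked

    def add(self, text: str) -> int:
        """Store a new message and return its index."""
        self.messages.append(text)
        return len(self.messages) - 1

    def summarize_message(self, index: int) -> None:
        """Summarize exactly one message; other summaries are untouched."""
        self.summaries[index] = self.summarize(self.messages[index])
        self.api_calls += 1

    def edit(self, index: int, new_text: str) -> None:
        """Change one message and refresh only that message's summary."""
        self.messages[index] = new_text
        self.summarize_message(index)

    def memory_block(self) -> str:
        """Join the summaries in message order, ready to place in a prompt."""
        return "\n".join(self.summaries[i] for i in sorted(self.summaries))


memory = PerMessageMemory(summarize=stub_summarize)
for text in (
    "Mira finds the map. It is torn at the edges.",
    "The captain refuses to sail. The crew grumbles.",
    "A storm hits the harbor. Two ships are lost.",
):
    memory.summarize_message(memory.add(text))

# Rewriting the second message costs one extra summarizer call, not three.
memory.edit(1, "The captain agrees to sail. The crew cheers.")
print(memory.memory_block())
print("summarizer calls:", memory.api_calls)
```

With a single running summary, the same edit would mean compressing the whole story again. Here the cost of a change stays proportional to the one message you touched.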