==== Typical RAG / “AI + vector DB” does something like: ====
* Split all history into chunks and embed each chunk
* For each query, run a similarity search over those embeddings
* Stuff the top N chunks into the prompt and hope they fit
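
To make the three steps above concrete, here is a minimal sketch of that naive pipeline. It is illustrative only: <code>embed</code> is a toy bag-of-words stand-in for a real embedding model, and the function names and chunk size are assumptions, not part of any particular vector DB.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def chunk(history, size=8):
    """Split the history into fixed-size word windows (hypothetical size)."""
    words = history.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def naive_rag_prompt(history, query, top_n=2):
    """Rank chunks purely by similarity and prepend the top N to the query."""
    q = embed(query)
    ranked = sorted(chunk(history), key=lambda c: cosine(embed(c), q), reverse=True)
    return "\n".join(ranked[:top_n] + [query])
```

Note that nothing in this sketch knows whether a chunk is important, recent, or a duplicate; similarity to the query is the only signal.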

Problems:
* It ranks by similarity alone, so it can’t tell an important memory from one that merely resembles the query.
* It can’t gracefully mix:
** “This user’s identity”
** “The last 3 relevant incidents this year”
** “The exact last 4 lines of this conversation”
* It wastes context on repeated or low-value chunks.
* It still works on raw text, so your context window gets eaten alive.

With LT + SAIQL rolling context:
* The memory is semantic first, not “some text we embedded once.”
* You can query by:
** Topic
** Time window
** Entity
** Decision type
** Risk level
** Any dimension the LoreTokens encode
* You can differentiate between:
** Permanent truths (“Apollo is the creator of LoreTokens”)
** Soft preferences (“usually likes playful tone”)
** Ephemeral state (“we’re currently debugging the trainer script crash”)
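
The idea of querying memory by dimension, and of separating permanent truths from preferences and ephemeral state, can be sketched in plain Python. This is not SAIQL syntax or the LoreToken format; <code>MemoryRecord</code>, <code>Durability</code>, <code>query</code>, and the dates and risk values below are hypothetical names and made-up data chosen for illustration.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum


class Durability(Enum):
    PERMANENT = "permanent"    # stable facts
    PREFERENCE = "preference"  # soft, revisable defaults
    EPHEMERAL = "ephemeral"    # current working state


@dataclass(frozen=True)
class MemoryRecord:
    text: str
    topic: str
    entity: str
    recorded: date
    durability: Durability
    risk: int = 0  # hypothetical 0-10 scale


def query(records, *, topic=None, entity=None, since=None,
          durability=None, min_risk=None):
    """Return records matching every filter that was supplied (None = ignore)."""
    def keep(r):
        return (
            (topic is None or r.topic == topic)
            and (entity is None or r.entity == entity)
            and (since is None or r.recorded >= since)
            and (durability is None or r.durability is durability)
            and (min_risk is None or r.risk >= min_risk)
        )
    return [r for r in records if keep(r)]


# Example records; the dates and risk levels are invented for the sketch.
records = [
    MemoryRecord("Apollo is the creator of LoreTokens", "identity", "Apollo",
                 date(2024, 1, 5), Durability.PERMANENT),
    MemoryRecord("usually likes playful tone", "style", "Apollo",
                 date(2024, 3, 2), Durability.PREFERENCE),
    MemoryRecord("we're currently debugging the trainer script crash", "debugging",
                 "trainer script", date(2024, 6, 1), Durability.EPHEMERAL, risk=4),
]

permanent = query(records, durability=Durability.PERMANENT)
recent_risky = query(records, since=date(2024, 5, 1), min_risk=3)
```

Each filter is an independent dimension, so identity, recent incidents, and current state can be pulled with separate queries and mixed deliberately instead of competing in a single similarity ranking.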

And because the memory is compressed as LoreTokens, you can:
* Hold far more history in the same storage
* Do fast SAIQL queries over that compressed space
* Only expand a tiny slice back into full text for the model

That’s why you get the “infinite-seeming memory on a finite window” effect.
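
The store-compressed, query, then expand-a-slice flow described above can be sketched as follows. One loud assumption: <code>zlib</code> stands in for the compression step purely to show the shape of the flow; it is a byte-level compressor, not LoreToken semantic compression, and <code>store</code> and <code>build_context</code> are hypothetical helper names. The budget is counted in whitespace-separated words as a rough proxy for tokens.

```python
import zlib


def store(topic, text):
    """Keep queryable metadata in the clear and the body compressed."""
    return {"topic": topic, "blob": zlib.compress(text.encode("utf-8"))}


def build_context(memory, topic, budget_words):
    """Expand only the records that match the topic and fit the budget."""
    picked, used = [], 0
    for rec in memory:
        if rec["topic"] != topic:
            continue  # filtered on metadata alone; the blob stays compressed
        text = zlib.decompress(rec["blob"]).decode("utf-8")
        cost = len(text.split())
        if used + cost > budget_words:
            break  # stop before overflowing the context budget
        picked.append(text)
        used += cost
    return picked


memory = [
    store("debug", "trainer script crashed on load"),
    store("tone", "usually likes playful tone"),
    store("debug", "fixed by correcting the data path"),
]

small = build_context(memory, "debug", budget_words=6)
large = build_context(memory, "debug", budget_words=11)
```

With a budget of 6 words only the first matching record is expanded; with 11 both are. The full history never enters the prompt, only the slice that the query and the budget allow.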