When history is cached conversations tend not to be slower, because the LLM can ...

Fazebooking · 2026-01-16T13:12:05 1768569125

Thats only the case with KV Cache and we do not know how and how long providers keep it.

bluegatty · 2026-01-16T22:31:28 1768602688

Prefill is 10x faster than generation without caching, and 100x faster with caching - as a very crude measure. So it's not a matter of 'only the case'. Those are different scenarios. Some hosts are better than others with respect to managing caching, but the better one's provide decent SLA on that.