Usually, when people think about the prompt tokens for a chat model, the initial system prompt is the vast majority of the tokens and it's the same regardless for many usage modes. You might have a slightly different system prompt for code than you have for English or for chatting, but that is 3 prompts which you can permanently put in some sort of persistent KV cache. After that, only your specific request in that mode is uncached.
But if you're posting code, writing drafts, or even small snippets of articles, etc in there it becomes a huge problem.