
Two major ones: you now have to handle all the classic cache problems like invalidation (what happens when you want to upgrade or the model improves?), and you also have to think about security. Given the drastic timing difference between a cache hit and a fresh generation, anyone can probe your cache to figure out what calls have been made and extract anything in the prompts, like passwords or PII (e.g. by going token by token and trying the top possibilities each time).
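To make that probe concrete, here's a rough sketch. It assumes the cache matches on shared prompt prefixes (an exact-match cache isn't probeable this way), and `query` stands in for whatever call hits the cached endpoint:

    import time
    from typing import Callable

    CACHE_HIT_THRESHOLD = 0.2  # seconds; tune to the observed hit/miss gap

    def is_cached(query: Callable[[str], str], prompt: str) -> bool:
        # Guess whether `prompt` is already cached: hits come back in
        # milliseconds, fresh generations take seconds.
        start = time.monotonic()
        query(prompt)
        return time.monotonic() - start < CACHE_HIT_THRESHOLD

    def extend_prefix(query: Callable[[str], str], prefix: str,
                      candidates: list[str]) -> str | None:
        # One step of the token-by-token attack: try each plausible
        # next token and keep whichever extension is still a cache hit.
        # Note a miss populates the cache itself, so probe each string once.
        for token in candidates:
            if is_cached(query, prefix + token):
                return prefix + token
        return None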


> happens when you want to upgrade or the model improves

I was thinking of prefixing the key of each query with model_gpt3.5-1000, or whatever model produced the answer.
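
Something like this sketch of what I had in mind (hashing the prompt so the key stays a fixed length is just one option):

    import hashlib

    def cache_key(model: str, prompt: str) -> str:
        # Namespacing by model means an upgrade automatically misses
        # old entries instead of serving stale answers.
        digest = hashlib.sha256(prompt.encode("utf-8")).hexdigest()
        return f"{model}:{digest}"

    # cache_key("model_gpt3.5-1000", "How many visitors last week?")
    # -> "model_gpt3.5-1000:<sha256 hex>"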

> anyone can probe your cache to figure out what calls have been made

My use case is local-only[0], where each user sends requests from their own machine. I could cache by default, show an indicator that the answer came from the cache, and add a "force regenerate answer" button next to it.

[0]: https://docs.uxwizz.com/guides/ask-ai-new
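
Rough sketch of that cache-by-default flow (`ask_model` is a stand-in for the real LLM call):

    import hashlib
    from typing import Callable

    cache: dict[str, str] = {}

    def ask(ask_model: Callable[[str], str], model: str, prompt: str,
            force: bool = False) -> tuple[str, bool]:
        # Returns (answer, from_cache) so the UI can show the cached
        # indicator next to the "force regenerate answer" button.
        key = f"{model}:{hashlib.sha256(prompt.encode()).hexdigest()}"
        if not force and key in cache:
            return cache[key], True
        answer = ask_model(prompt)
        cache[key] = answer
        return answer, False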
