> but it’s still cheaper than doing llama2 yourself
How did you calculate this? Unless you're factoring in acquiring the hardware, you can usually outsource the fine-tuning of llama2 to rented hardware and then run inference on owned hardware. With enough executions, running llama2 locally should be cheaper in the medium to long term than paying for both the training and the per-call execution of a fine-tuned GPT-3.5.
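To make the trade-off concrete, here's a rough break-even sketch. All the numbers below are made-up placeholders (real GPU rental, electricity, and API prices vary a lot), so treat this as the shape of the calculation, not actual pricing:

```python
# Hypothetical costs -- every figure here is an assumption, not real pricing.
rented_finetune_cost = 500.0   # assumed one-off cost to fine-tune llama2 on rented GPUs
local_cost_per_call = 0.001    # assumed electricity/amortization per local inference
api_finetune_cost = 100.0      # assumed one-off fine-tuning cost for GPT-3.5
api_cost_per_call = 0.01       # assumed per-call price of the fine-tuned API

def total_cost(fixed: float, per_call: float, n_calls: int) -> float:
    """Total spend = one-off fixed cost plus per-inference cost."""
    return fixed + per_call * n_calls

# Number of calls at which the local route becomes cheaper.
break_even = (rented_finetune_cost - api_finetune_cost) / (
    api_cost_per_call - local_cost_per_call
)
print(f"local is cheaper after ~{break_even:,.0f} calls")
```

With these placeholder figures the local setup pays for itself after tens of thousands of calls; at high volume the per-call API cost dominates, which is the "medium to long term" point above.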