
You can quite easily ask it to summarize the result in a sentence or a paragraph. LLMs have no way to compute other than writing text, and the more text they write, the more compute they do. You only care about the final output.


I do. It typically goes on to write a preamble about why it gave a long answer before finally providing a summary. Token stuffing.


This is akin to complaining about the overhead of radix conversion between decimal and binary in early computers.

Unless you figure out how to talk to an LLM directly in latent space, the preamble is the cost you pay for getting the answer back in English.



