> However, skills are different from MCP. Skills have nothing to do with tool calling at all.
Although skills do require that certain tools are available, like basic file-system operations, so the model can read the skill files. Usually this is implemented as an ephemeral "sandbox environment" where the LLM has access to a file system and can also execute Python, run bash commands, etc.
We're doing something similar. We first chunk the documents based on h1, h2, and h3 headings, then prepend the headings to the beginning of each chunk as context. As an imaginary example, instead of one chunk being:
    The usual dose for adults is one or two 200mg tablets or
    capsules 3 times a day.
It is now something like:
    # Fever
    ## Treatment
    ---
    The usual dose for adults is one or two 200mg tablets or
    capsules 3 times a day.
This seems to work pretty well, and doesn't require any LLMs when indexing documents.
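A minimal sketch of that chunking step in Python (regex-based; the function name, the `---` separator, and the exact output format are just how I'd write it, not a fixed spec):

```python
import re

def chunk_by_headings(text: str) -> list[str]:
    """Split a markdown document on h1-h3 headings and prepend the
    current heading path to each chunk, separated by '---'."""
    heading_re = re.compile(r"^(#{1,3})(?!#)\s+(.+)$", re.MULTILINE)
    matches = list(heading_re.finditer(text))
    path: dict[int, str] = {}  # heading level -> heading text
    chunks: list[str] = []
    for i, m in enumerate(matches):
        level = len(m.group(1))
        # drop headings deeper than the current level, then record this one
        path = {lvl: t for lvl, t in path.items() if lvl < level}
        path[level] = m.group(2).strip()
        # the chunk body runs until the next heading (or end of document)
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        body = text[m.end():end].strip()
        if body:
            header = "\n".join(f"{'#' * lvl} {t}" for lvl, t in sorted(path.items()))
            chunks.append(f"{header}\n---\n{body}")
    return chunks
```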
I used to always wonder how LLMs know whether a particular long article or audio transcript was written by, say, Alan Watts. This kind of metadata annotation would be common when preparing training data for the Llama models and so on. It could also be the genesis of the argument that ChatGPT got slower in December: that "date" metadata would "inform" ChatGPT to be unhelpful.
I am working on question answering over long documents / bundles of documents, 100+ pages, and I took a similar approach. I first summarize each page, give it a title, and extract a list of subsections. Then I put all the summaries together and ask the model to produce a hierarchical index, which organizes the whole bundle into a tree. At query time I include the path in the tree as additional context.
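For the query-time side, here's a toy sketch assuming the model has already returned the index as nested dicts with page numbers at the leaves (the tree shape and the `find_path` helper are assumptions for illustration, not a fixed schema):

```python
def find_path(tree: dict, page: int, path: list[str] | None = None) -> list[str] | None:
    """Return the list of section titles leading to `page`, or None."""
    path = path or []
    for title, node in tree.items():
        if isinstance(node, dict):
            found = find_path(node, page, path + [title])
            if found:
                return found
        elif page in node:  # leaf: a list of page numbers
            return path + [title]
    return None

# Toy index over a bundle; the retrieved page's path becomes extra context.
index = {
    "Contracts": {
        "Termination": {"Notice periods": [41, 42], "Penalties": [43]},
    },
}
print(" > ".join(find_path(index, 42)))  # Contracts > Termination > Notice periods
```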
So the regex version still beats the LLM solution, and there's also the risk of hallucinations. I wonder if they tried building an SLM that would rewrite or update the existing regex solution instead of regenerating the whole content. That would mean fewer output tokens, faster inference, and output that couldn't contain hallucinations. Although I'm not sure small language models are capable of writing regex.
We use the term "pre-googling" for this sort of "information retrieval". You might have some concept in your head and want to know the exact term for it; once you get the term you're looking for from the LLM, you move to Google and search for the "facts".
This might be a weird example for native English speakers, but recently I just couldn't remember the term for a graph where you're only allowed to move in one direction and cannot form loops. The LLM gave me the answer (directed acyclic graph, or DAG) right away. Once I had the term I was looking for, I moved on to Google search.
Same "pre-googling" works if you don't know if some concept exits.
I recently started watching Fallout, and it reminded me of a book I read about a future religious order piecing together pre-bomb scientific knowledge. The LLM immediately pointed me to A Canticle for Leibowitz (which is great, btw). Google results will do the same, but the LLM is much faster and more direct.
I find it great for stuff like this, where you know there is an answer and will recognise it as soon as you see it. I genuinely think it can become an extension of my long-term memory, but I'm slightly nervous about the effect it will have on my actual memory if I just don't need to remember stuff like this anymore!
There is no chunking built into the postgres extension yet, but we are working on it.
It does check the context length of the request against the limits of the chat model before sending it, and optionally lets you auto-trim the least relevant documents out of the request so that it fits the model's context window. IMO it's worth spending time getting chunks prepared, sized, and tuned for your use case, though. There are some good conversations above discussing methods for this, such as using a summarization model to create the chunks.
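Not the extension's actual implementation, but the auto-trim idea looks roughly like this, assuming tiktoken for token counting and chunks that already carry relevance scores:

```python
import tiktoken  # assumption: cl100k_base as a stand-in tokenizer

def trim_to_fit(question: str, docs: list[tuple[float, str]],
                budget: int = 8_192, reserve: int = 1_024) -> list[str]:
    """Keep the most relevant (score, text) docs whose combined token count,
    plus the question and a reserve for the answer, fits within `budget`."""
    enc = tiktoken.get_encoding("cl100k_base")
    used = len(enc.encode(question)) + reserve
    kept = []
    # most relevant first, so the least relevant are trimmed away
    for score, text in sorted(docs, key=lambda d: d[0], reverse=True):
        cost = len(enc.encode(text))
        if used + cost > budget:
            continue  # skip docs that would overflow; smaller ones may still fit
        kept.append(text)
        used += cost
    return kept
```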
I commented on another thread about cr-sqlite [0]. In addition to that, I believe Mycelial is VC-funded, whereas Matt has GitHub Sponsors. I hope there's a future where he teams up with, say, Fly.io to make cr-sqlite sustainable.