I really hope to see more research like this. I feel like LLMs have insanely high pedagogic value, and that's because I've been using them to teach myself difficult subjects and review my understanding.
The issue with the kind of system that the author is proposing for curious self-driven learning is that the SRS is optimized for a given curriculum.
Many people, including myself, use flashcards to guide (and sort of "create") their own curriculum. Flashcards with SRS are really good for many things, but it's difficult to generalize them for this use case.
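For context, the scheduling side of SRS is simple and curriculum-agnostic; the hard part really is creating and maintaining the cards. A minimal sketch of one review step of the classic SM-2 algorithm (which Anki's scheduler descends from; the function name is mine):

```python
def sm2_review(interval_days, ease, quality):
    """One SM-2 review step.

    quality: 0-5 self-rating of recall; returns (next_interval_days, new_ease).
    """
    if quality < 3:
        return 1, ease  # lapse: restart the interval, keep the ease factor
    # Standard SM-2 ease-factor update, floored at 1.3
    new_ease = max(1.3, ease + 0.1 - (5 - quality) * (0.08 + (5 - quality) * 0.02))
    if interval_days == 0:
        return 1, new_ease   # first successful review
    if interval_days == 1:
        return 6, new_ease   # second successful review
    return round(interval_days * new_ease), new_ease
```

Note that nothing here knows what the card is about, which is why the algorithm transfers across curricula while the card content doesn't.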
I'd really like to see prototypes where people integrate LLM intelligence into creating and adjusting cards on the fly. It's clearly something LLMs excel at, especially the larger closed-source ones like Claude 3.5 Sonnet.
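As a sketch of what that could look like: ask the model for cards in a structured format, then validate before anything enters the deck. The prompt and JSON card shape here are my own assumptions, and `call_llm` stands in for whatever model API you'd actually use:

```python
import json

# Hypothetical prompt template; real prototypes would iterate on this heavily.
CARD_PROMPT = """From the following notes, produce flashcards as a JSON list of
objects with "front" and "back" fields. Keep each card atomic (one fact each).

Notes:
{notes}"""

def make_cards(notes, call_llm):
    """Draft flashcards from free-form notes via an LLM, then validate the output.

    call_llm: any function mapping a prompt string to a model response string.
    """
    raw = call_llm(CARD_PROMPT.format(notes=notes))
    cards = json.loads(raw)
    # Defensive validation: model output is untrusted, so check the shape.
    return [c for c in cards
            if isinstance(c, dict) and c.get("front") and c.get("back")]

# Stubbed model response for illustration; in practice call_llm would hit an API.
fake_response = '[{"front": "What does SRS stand for?", "back": "Spaced repetition system"}]'
cards = make_cards("SRS = spaced repetition system ...", lambda prompt: fake_response)
```

The validation step matters because a hallucinated or malformed card silently poisons the deck, which is exactly the failure mode the skeptics worry about.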
For mathy subjects you usually have to verify things yourself, so you'd detect any mistakes as you try to work things out or make sense of the information. You can't learn math, physics, or engineering theory by remembering and believing things. It would be different for "bag of random facts" subjects, though.
I don't use an LLM in isolation. I use it alongside reading material, books, papers etc.
Since I'm using it in involved studying sessions, it is totally sensible to cross-check most of what the LLM outputs, since that's part of the learning process anyway!
In my experience learning design, some math, and basic Android dev with local and private models, they are very reliable if you use them "in the loop" (i.e., as part of a process).
When programming, it's extremely easy to verify that what the LLM says is correct by running the code.
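That verification loop is usually a few lines. For instance, if an LLM tells you Python's built-in `sorted` is stable (items that compare equal keep their original order, which is true), you can check the claim directly instead of taking it on faith:

```python
# Claim to verify: sorted() is stable, i.e. items that compare
# equal keep their relative input order.
pairs = [("b", 2), ("a", 1), ("b", 1), ("a", 2)]
by_key = sorted(pairs, key=lambda p: p[0])

# Within each group of equal keys, the original input order survives:
# the "b" items stay as ("b", 2) then ("b", 1).
assert by_key == [("a", 1), ("a", 2), ("b", 2), ("b", 1)]
```

The same habit scales up: run the snippet, tweak an input, and see whether the model's explanation still holds.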
For math, this is a bit more difficult, but so far it's only been surface-level math. I know that LLMs are good at teaching concepts and ideas in math, but not at _doing_ math. I don't fully trust an LLM to teach me more advanced math because I have no way of verifying that what I learn is
* Correct
* Relevant
* The "right" way of learning it
For design, which I'm currently studying a lot for work and personal projects, it does well. I'm using Claude to help solidify my understanding by asking it to critique my summaries and quiz me on topics. I can tell when Claude is correct because I'm using it to reinforce my grasp of how topics interrelate, not to learn new topics from scratch. I already sort of "intuitively understand" these topics, but am training myself to go a bit deeper.
They work really well, and I really do understand the skepticism, but in practice it's unwarranted.
Hallucination isn’t really a problem when you can verify the ground truth — for example, when you’re using it as a companion to a Khan Academy math problem that gives immediate feedback.
This fear that LLMs are hallucinating constantly, and therefore entirely unreliable, is unfounded.
Sure, that'll happen if you're pushing into the extreme limits of an area of knowledge — but if you're dipping your foot into the superficial waters of well-known areas, it's reasonably fine (and perhaps even more sound than the hotter areas of Wikipedia).
E.g. if you're programming in one of the most well-known languages (say Python), and you're only asking beginner questions (which is where the high pedagogic value lies), it's unlikely that the model will hallucinate deeply enough to mislead you. Yes, you won't know up front, but simple and mid-level questions are easy to check.
You shouldn't of course head to LLMs for the most arcane scientific/mathematical knowledge and conclusions; if you do that, that's on you, because offering expertise is not what LLMs are designed for, and the disclaimers all warn against it. But you shouldn't throw out the baby with the bathwater.