I don't want to be dismissive; it's a fun project, but this has been done a lot already - maybe not with llama3, but the architecture is basically the same as llama2. Look at the big list of from-scratch implementations on Karpathy's llama2.c page.
Is there something particularly different about this one?
Well, given the fast pace of AI, it should not be a surprise that this is similar to llama2 and that we’re seeing the (n + 1)th toy implementation, which likely has bugs or leaks in the background.
You might as well look at llama.cpp for a serious, production-grade implementation to learn from. Otherwise, nothing to see here.
> Is there something particularly different about this one?
Other than the immature lowercase, anime BS, etc., then…
Is there something particularly different about this one?
Edit - guess not?