I'm trying to build the world's largest encyclopedia of magazines, fanzines, journals and newsletters. Most of the world's magazines are dead and buried, languishing as copyright orphans with no known owner. Yet they probably contain more knowledge than all the books combined.
We should probably think about training the next LLM on all the world's magazines as well as all the books.
It is less than half-baked. I asked GPT and it said the project was "barely even kneaded", which is perfect.
I hack on it every day. I have hundreds of thousands of magazines to upload and there are many millions more hiding in obscure parts of the Internet, already scanned and waiting to be found. Plus all the amazing ones which haven't even been scanned. I intend to set up a non-profit to scan all the ones I can get my hands on.
If there's a specific magazine from days of yore that you're looking for, I might have it nestled away somewhere, so just drop me a line here and I'll try to find it for you:
hello@magazedia.wiki
I love this one! It has occurred to me that while there is a wealth of information online, there is perhaps even more information in print that takes a staggering amount of work to digitize and organize sensibly. It's very interesting to come across someone actively working on a project like this.
A vaguely related idea I had recently: it would be really cool to put together a massive product information catalog/database. Maybe it could be maintained the way Wikipedia is, in terms of editing, review, and adding new content, the idea being that it should be as objective, unbiased, and complete as possible. The impetus for this thought was of course that I hate advertising and on some level think it shouldn't even exist, so I was thinking about what an alternative might look like.
I am actually working on that at the same time. I'm using the same wiki engine I'm building for Magazedia to build the bigger site which collates every product ever made. Which is a large task, let me tell you...
What! That is an incredible coincidence to run across your comment in the same week I've been thinking about that. I can only imagine it is a monumental undertaking - I personally filed it under "cool ideas I'm hopelessly incapable of attempting for now," as I'm currently just an analyst/half-ass analytics engineer (pivoter) with far more enthusiasm than skill at programming.
Would be glad to hear more about it - your basic approach, some of the biggest challenges, whatever it might be. Shoot me a message if the mood strikes!
EDIT: just created my account recently and did not realize there is no PM functionality on HN, doh.
Really cool. When I was younger I regularly got one of those bundles of pages (not sure of the name) that you tear out and arrange in binders. It was all about aircraft, including cross sections, and I've always wanted to scan and OCR them.
Yes, these things were huge in the 80s and early 90s. "Part works" is the official term for them. I'm trying to track down some of the ones I loved from my childhood too. I wonder how well some of them hold up?
Absolutely. I see the IA as more of a bulk storage area, with little community, and my site will have more commentary and community built around it. But everything that gets scanned for my site will also be available for the IA too.
This was the expiry date, so I'm not sure what's going on because that is in UTC and it is not that time yet: (of course I should have set up an alert for myself to do it a week ago)
https://en.magazedia.wiki/byte-volume-1-issue-1-september-19...
(registration is broken, so don't try that lol)