Hacker News | jkarneges's comments

The HN/Firebase API doesn't make this easy. For https://hnstream.com I ended up crawling items to find the article.


Any tips on respectfully crawling HN so you don’t get throttled? I had an application idea that could not be served by the API (need karma values) so I started to write code to scrape but got rate limited pretty quickly.


I've had no trouble hitting the Firebase API at the speed items are created, with a 5 second delay between retries.

For scraping HN directly, in my experience you have to go extremely slow, like 1 minute between fetching items. And if you get blocked, it may be better to wait a long time (minutes) before trying again rather than exponential backoff, in order to get out of the penalty box. You'll need a cache for sure.
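A rough sketch of that pacing in Python, in case it helps. Everything here is an assumption for illustration: `fetch_politely` is a made-up helper, `fetch` is whatever function you supply that returns the page body or `None` when blocked, and the 60 second delay and 10 minute cooldown are just guesses in the spirit of the numbers above, not known HN limits.

```python
import time
from typing import Callable, Optional


def fetch_politely(
    fetch: Callable[[str], Optional[str]],
    url: str,
    cache: dict,
    delay_s: float = 60.0,      # assumed gap between successful fetches
    cooldown_s: float = 600.0,  # assumed fixed wait after being blocked
    max_attempts: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Return the body for url, checking the cache first and pacing requests.

    fetch(url) is caller-supplied and should return None when the request
    was blocked or rate limited.
    """
    if url in cache:
        return cache[url]  # a cache hit costs no request at all
    for _ in range(max_attempts):
        body = fetch(url)
        if body is not None:
            cache[url] = body
            sleep(delay_s)  # go slow before the caller's next request
            return body
        # Blocked: wait a fixed long period instead of backing off exponentially.
        sleep(cooldown_s)
    return None
```

Passing `sleep` in as a parameter is only there so the pacing can be exercised without actually waiting.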


The comments don't even have a thread ID?


Comment items look like https://hacker-news.firebaseio.com/v0/item/45533616.json?pri...:

  {
    "by" : "jkarneges",
    "id" : 45533018,
    "kids" : [ 45533616 ],
    "parent" : 45532549,
    "text" : "The HN&#x2F;Firebase API doesn&#x27;t make this easy. For <a href=\"https:&#x2F;&#x2F;hnstream.com\" rel=\"nofollow\">https:&#x2F;&#x2F;hnstream.com</a> I ended up crawling items to find the article.",
    "time" : 1760043552,
    "type" : "comment"
  }

"parent" can be either the actual parent comment or the parent article, depending on where in the comment chain you are.


Perhaps @kogir, who was active on https://github.com/HackerNews/API, could add the thread id.



As does hnstream.com from the sourced sample comment itself. Both just traverse the parent id until it's the root (article). It takes more queries, but the API is not rate limited.
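For anyone curious, the parent traversal could look something like this minimal Python sketch. The item URL pattern is the one from the sample above; `fetch_item` and `find_root` are names I made up, and it assumes the root article is simply the first item with no "parent" field. It is not the code either site actually runs.

```python
import json
from typing import Callable, Optional
from urllib.request import urlopen

ITEM_URL = "https://hacker-news.firebaseio.com/v0/item/{}.json"


def fetch_item(item_id: int) -> Optional[dict]:
    """Fetch one item from the HN Firebase API (one network request per call)."""
    with urlopen(ITEM_URL.format(item_id)) as resp:
        return json.load(resp)


def find_root(
    item_id: int,
    fetch: Callable[[int], Optional[dict]] = fetch_item,
) -> Optional[dict]:
    """Follow "parent" links upward until reaching an item without one."""
    item = fetch(item_id)
    # Stop on a missing item (None) or on the first item that has no parent.
    while item is not None and "parent" in item:
        item = fetch(item["parent"])
    return item
```

Calling `find_root(45533018)` would issue one request per level of nesting, which is the "more queries" cost mentioned above.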


It wouldn't take more queries if the comments were cached. It could probably be done entirely in memory; HN's entire corpus can't be that large.

If one were to start at the page endpoints (e.g. /topstories), one could record the origin (article) ID for each comment while preloading, which would probably cover the IDs most likely to be referenced and make traversal up the tree more efficient.
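Roughly what I have in mind, as an in-memory Python sketch. The names `find_root_cached`, `preload`, and `root_of` are hypothetical, `fetch` is any function that returns an item dict for an ID, and it assumes items carry the "parent" and "kids" fields shown in the sample comment.

```python
from typing import Callable, Dict


def find_root_cached(
    item_id: int,
    fetch: Callable[[int], dict],
    root_of: Dict[int, int],
) -> int:
    """Return the root article id for item_id, caching it for every item visited."""
    path = []
    current = item_id
    while current not in root_of:
        item = fetch(current)
        path.append(current)
        if "parent" not in item:
            root_of[current] = current  # no parent, so this is the article
            break
        current = item["parent"]
    root = root_of[current]
    for visited in path:
        root_of[visited] = root  # later lookups for these ids need no fetch
    return root


def preload(story_id: int, fetch: Callable[[int], dict], root_of: Dict[int, int]) -> None:
    """Walk down from a story through "kids", tagging each item with the story id."""
    stack = [story_id]
    while stack:
        current = stack.pop()
        root_of[current] = story_id
        stack.extend(fetch(current).get("kids", []))
```

After `preload` has run for the stories on a page endpoint, `find_root_cached` on any of their comments returns without fetching anything.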


Congrats on the project! You may be right. There are other SSE services, but I can't think of one that allows clients to subscribe without authentication.

Not requiring client auth certainly makes things simple. It can even work for private data if the topics are sufficiently unguessable.
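As a small illustration of "sufficiently unguessable", a topic name could be generated from a cryptographically secure random token. This is only a sketch: `new_private_topic` and the `private` prefix are made up for the example, and 32 random bytes is an assumed size, not something any particular service requires.

```python
import secrets


def new_private_topic(prefix: str = "private", nbytes: int = 32) -> str:
    """Return a topic name ending in a URL-safe random token of nbytes bytes."""
    # secrets draws from the OS's secure random source, unlike the random module.
    return f"{prefix}-{secrets.token_urlsafe(nbytes)}"
```

Anyone who learns the topic name can subscribe, so it has to be treated like a secret and shared only over a channel you trust.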


> the kids who grew up in those homes are writing things that take place there

This is kind of like how trench coats are associated with detectives, because they were regular clothing for anyone around the time of early detective films.


I agree it has some problems. For now, it is mostly a UX proof-of-concept and probably not how an official poll should be conducted.

> What problem is this envisioned as solving?

Its core mission is to legitimize all candidates on the ballot. This is something caucuses and ranked-choice voting can do, but since our general elections don't work this way, I wonder if the voting experience could be augmented from the private sector. (Of course, efforts to change how our actual elections work are still worthwhile and can be pursued in parallel.)

Basically, if enough people (millions) were to use an app like this to meta-vote before committing to a single actual vote, we could simulate alternative voting processes without government involvement.


It's a proof-of-concept for a potential startup idea, not endorsed by my employer. I simply picked tech within arm's reach.

Maybe in 2028 this could be a real thing, regardless of where it is running. An edge cloud does seem ideal for national live events though.


> An edge cloud does seem ideal for national live events though

OK, you almost had me believing you but "edge cloud" is 100% Fastly marketing speech. Any time I see somebody mention using Fastly, they turn out to be a Fastly employee. You've posted this demo three times already, give it a rest.


Hmm. Yes, using Fastly for that. Can you go to https://ileantoward.com/geotest and see if anything looks fishy? Notably country_code and region (region should have a state code if country is "US").


Says I'm in El Salvador (Country Code: SV)

I don't have much confidence in Fastly's WAF if it can't do basic geolocation in 2024


> Does Fastly support WebSockets yet?

It does! Can be served at the edge or passed through to origin.


According to the docs, WebSockets are still incompatible with WAF and Origin Shield...

https://docs.fastly.com/products/websockets

If I'm not using Origin Shield, and I can't use WAF, why would I bother with Fastly when AWS CloudFront is a fraction of the cost (and has WebSockets): https://docs.aws.amazon.com/AmazonCloudFront/latest/Develope...

I just don't understand what Fastly has been doing for the past few years. It seems like they're just selling bandwidth these days.


I'm not sure there'd be much benefit to using shielding with WebSockets, since the traffic wouldn't be cached/collapsed. You can still shield HTTP traffic on the same domain being used for WebSockets.

WAF could be nice though.


> almost no one on their death bed wishes they had spent more time working / on their computer

And old age isn’t needed to figure this out. Even middle age will do. When I reflect on my life, I almost never think about past work.

(Yet, I remain mostly working.)


facepalm


Does HN skew Democratic? Maybe so. Still interesting to see the breakdown.


Was that a serious question? If so, a quick perusal of the archives will show you where the flag hangs. Just look for all the greyed-out posts and you'll know where the Overton window is to be found.


lol, dude. If you don’t immediately see it, that is a clue of where you are yourself. In case you were wondering.


"Does the Pope poop in the woods?"

Wait... I think I mixed something up with that rejoinder...

