
If the content is literally the same, the crawler should be able to use If-Modified-Since, right? It still has to make an HTTP request, but it doesn't have to parse or index anything.
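For illustration, here is a rough sketch of what that conditional request could look like from the crawler's side, using only Python's standard library. The function names `build_conditional_request` and `fetch_if_modified` are made up for this example, and it assumes the crawler saved the `Last-Modified` value from its previous visit.

```python
import urllib.error
import urllib.request


def build_conditional_request(url, last_modified=None):
    """Build a GET request that carries If-Modified-Since when we have a date.

    last_modified is the Last-Modified header value saved from an earlier
    crawl (an HTTP-date string), or None on the first visit.
    """
    request = urllib.request.Request(url)
    if last_modified:
        request.add_header("If-Modified-Since", last_modified)
    return request


def fetch_if_modified(url, last_modified=None):
    """Return (status, body, last_modified) for a conditional GET (hypothetical helper)."""
    request = build_conditional_request(url, last_modified)
    try:
        with urllib.request.urlopen(request) as response:
            # 200: first visit or the page changed, so the body needs re-indexing.
            return response.status, response.read(), response.headers.get("Last-Modified")
    except urllib.error.HTTPError as err:
        if err.code == 304:
            # 304 Not Modified: no body was sent, nothing to parse or index.
            return 304, None, last_modified
        raise
```

The request is still made either way; the saving is that a 304 response has no body to download, parse, or index.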


If the content is dynamic (e.g. a list of popular articles in a sidebar has changed), then the page will be considered "updated".


This is not correct. It's up to the server, as controlled by the application, to send that or other headers, similar to sending a <title> tag. The headers take priority and, similar to what another person said, crawlers will do a HEAD request first and not bother with a GET request for the content.
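To make the "up to the application" point concrete, here is a minimal, framework-free sketch of the server-side decision. The `respond` function and its arguments are hypothetical, not any real framework's API; the key assumption is that the application chooses what `last_modified` means, so it can ignore a changed sidebar if it wants to.

```python
from datetime import datetime
from email.utils import format_datetime, parsedate_to_datetime


def respond(request_headers, last_modified: datetime, body: bytes):
    """Return (status, headers, body) for one resource (hypothetical helper).

    last_modified must be a timezone-aware UTC datetime picked by the
    application, e.g. the time the article text itself last changed.
    """
    # HTTP dates have one-second resolution, so drop microseconds before comparing.
    last_modified = last_modified.replace(microsecond=0)
    headers = {"Last-Modified": format_datetime(last_modified, usegmt=True)}

    since = request_headers.get("If-Modified-Since")
    if since:
        try:
            if last_modified <= parsedate_to_datetime(since):
                # Nothing the application cares about has changed: no body.
                return 304, headers, b""
        except (TypeError, ValueError):
            pass  # unparseable or zone-less date: ignore it and send everything
    return 200, headers, body
```

A HEAD request could return the same status and headers without the body, so the client can decide from the headers alone whether a GET is worth it.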



