
IIRC the IA no longer cares about robots.txt after it kept getting abused [1] to take down older pages. You can still request to take down pages, but it needs a form and a reason. [2]

(Remember, robots.txt is not a privacy measure, it's supposed to be something that prevents crawlers from getting stuck in tar pits!)

[1] https://blog.archive.org/2017/04/17/robots-txt-meant-for-sea...

[2] https://help.archive.org/help/how-do-i-request-to-remove-som...



Useful to know. My more general position, which apparently is not much shared here, is that removing one's site from the internet has historically meant that the site stops being accessible, stops being indexed, and stops being findable with a simple search. If, going forward, we're going to revise that norm, IMO it would be polite at least to respect it retroactively.


That seems in conflict with the idea that once something's been released, it can't ever truly be unreleased.



