
Uh, the author greatly underestimates the headache of filtering out bot traffic. It's bad enough that some of the fancier comment spam bots load JavaScript, but going through the server logs would be nuts. The "Contact Us" form would show up as the most popular page, since it's constantly being hammered by automated botnet-based attacks.


True, bots can be a real PITA. Still, with log analysis you can remove (at least some of) them, whereas with JS-based stats you can't add back the ones that never ran the script. The same applies to non-page files.
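
For what it's worth, here is a rough sketch of the log-side filtering, assuming the Apache/nginx "combined" log format. The marker list, the extension list, and the function names are made up for illustration, and a bot that spoofs a browser user agent will sail right through, which is why it only catches some of them.

```python
import re
from collections import Counter

# Combined Log Format:
# host ident user [time] "request" status size "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Hypothetical, deliberately short lists; tune them for your own traffic.
BOT_MARKERS = ("bot", "crawler", "spider", "slurp")
ASSET_EXTENSIONS = (".png", ".jpg", ".gif", ".css", ".js", ".ico")


def parse(line):
    """Return the named fields of one log line, or None if it doesn't match."""
    match = LOG_LINE.match(line)
    return match.groupdict() if match else None


def is_bot(hit):
    """Guess from the user agent alone; an empty agent is treated as a bot."""
    agent = hit["agent"].lower()
    return agent in ("", "-") or any(marker in agent for marker in BOT_MARKERS)


def is_page(hit):
    """True unless the path (query string stripped) looks like a static asset."""
    path = hit["path"].split("?", 1)[0].lower()
    return not path.endswith(ASSET_EXTENSIONS)


def page_views(lines):
    """Count page hits per path, skipping likely bots and asset requests."""
    counts = Counter()
    for line in lines:
        hit = parse(line)
        if hit and not is_bot(hit) and is_page(hit):
            counts[hit["path"].split("?", 1)[0]] += 1
    return counts
```

Feed `page_views` an open log file (it just iterates over lines) and you get a `Counter` of page paths. Drop the `is_bot` or `is_page` check and you get the bot and image hits back, which is the part JS-based stats can't give you.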


Removing them from the logs sounds difficult. And adding them to JS-based stats doesn't sound very useful. I don't particularly care whether Google indexes my site at 10am or 11am so long as it gets indexed. And I certainly don't care about the comment spam botnets.


Of course the hour of a bot's visit is not important, but it could be important to see whether it comes every day or not. Not to mention hits on images and other files.


Google's Webmaster Tools show how many pages Googlebot grabs daily and how long requests take, so you can monitor that and rest easy at night.



