I've had to do a lot of scraping recently, and something that really helps is requests-cache (https://pypi.org/project/requests-cache/). It's a drop-in replacement for the requests library, but it caches all the responses to a SQLite database.

It really helps if you need to tweak your script and you're being rate limited by the sites you're scraping, because reruns are served from the local cache instead of hitting the site again.


