I'm convinced there is a gold mine sitting right in front of us ready to be picked by someone who can intelligently combine web scraping knowledge with LLMs e.g. scrape data, feed it into LLMs do get insights in an automated fashion. I don't know exactly what the final manifestation looks like but its there and will be super obvious when someone does it.
I feel that the more immediate and impactful opportunity that people are doing is instead of scraping to get/understand content. LLM agents can just interactively navigate websites and perform actions. Parsing/Scraping can be brittle with changes, but an LLM agent to perform an action can just follow steps to search, click on results, and navigate like a human would
Are you aware of any projects for this? I began to build my own but quickly saw that the context window is not large enough to hold the DOM of many websites. I began to strip unnecessary things from the DOM but it became a bit of a slog. L