
E.g. Can you completely parse HTML with regex?


You can't parse [X]HTML with regex. Because HTML can't be parsed by regex. Regex is not a tool that can be used to correctly parse HTML. As I have answered in HTML-and-regex questions here so many times before, the use of regex will not allow you to consume HTML. Regular expressions are a tool that is insufficiently sophisticated to understand the constructs employed by HTML.

etc. https://stackoverflow.com/a/1732454


If by "parse" you mean "match", the answer is yes, because PCRE is not limited to regular languages: its recursive patterns let you express a context-free language.
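A rough sketch of why recursion matters here, assuming Python's standard `re` module, which (unlike PCRE) has no recursive patterns. A fixed pattern only matches nesting to the depth it spells out, while a depth counter handles any depth. The `<b>` tag and the names `TWO_LEVELS` and `balanced` are made up for illustration; in PCRE something like `^(<b>(?1)*</b>)$` should do the same job in a single pattern, but that is not run here.

```python
import re

# Fixed-depth pattern: accepts <b>...</b> nested at most two levels deep.
TWO_LEVELS = re.compile(r"<b>(?:<b></b>)*</b>")


def balanced(s: str) -> bool:
    """Check that <b> and </b> tags nest correctly at any depth."""
    depth = 0
    for tag in re.findall(r"</?b>", s):
        depth += 1 if tag == "<b>" else -1
        if depth < 0:  # a closing tag appeared before its opener
            return False
    return depth == 0
```

The regex here only tokenizes; the counter is what tracks nesting, which is the part a truly regular expression cannot do for unbounded depth.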

If you really mean "parse", then it's probably annoying in the way all parser generators are: they give bad error messages when the input has invalid syntax.


Is this true, in practice, given the lenient parsing requirements of the real world?


Technically, no

Practically, yes
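To make "technically no, practically yes" concrete, here is a minimal sketch, assuming the task is just pulling `href` values out of sloppy markup. It compares a lenient regex with Python's standard `html.parser`. The helper names (`hrefs_regex`, `hrefs_parser`) and the pattern are illustrative assumptions, not anything proposed in the thread.

```python
import re
from html.parser import HTMLParser

# Lenient pattern: any <a ...> tag with a quoted href, case-insensitive.
HREF = re.compile(r'<a\s[^>]*?href\s*=\s*["\']([^"\']*)["\']', re.IGNORECASE)


def hrefs_regex(html: str) -> list[str]:
    """Extract href values by pattern matching; knows nothing about comments."""
    return HREF.findall(html)


class _Links(HTMLParser):
    """Collect href attributes from real <a> start tags only."""

    def __init__(self) -> None:
        super().__init__()
        self.found: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.found.extend(v for k, v in attrs if k == "href" and v is not None)


def hrefs_parser(html: str) -> list[str]:
    """Extract href values with a tokenizing parser that skips comments."""
    parser = _Links()
    parser.feed(html)
    parser.close()
    return parser.found
```

On ordinary links the two agree. They diverge on edge cases such as a link inside an HTML comment, which the regex reports and the parser skips.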



