
o11c 5 months ago | on: Diffsitter – A Tree-sitter based AST difftool to g...

That fundamentally misunderstands the problem in multiple ways:

* this is still during lexing; it has not yet reached parsing
* there are multiple valid token sequences that differ only by a single character at the start of the file. This is very common with Python multi-line strings in particular, since they are widely used as docstrings.

One could fold lexing into parsing and do error-cost minimization over both.
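
To make the multi-line string point concrete, here is a minimal sketch using Python's standard `tokenize` module. The file contents (`BODY`) and the helper `code_names` are made up for illustration; they are not from any real project. The two inputs differ only by one extra quote character at the very start, yet both lex without error and disagree about which lines are code.

```python
import io
import tokenize

# Hypothetical file contents, constructed only for this illustration.
BODY = '""\nx = 1\n"""\ny = 2\n"""\nz = 3\n#"""\n'
SHIFTED = '"' + BODY  # the same text with one extra quote at the very start


def code_names(source):
    """Return the identifiers the lexer treats as code (NAME tokens)."""
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    return [tok.string for tok in tokens if tok.type == tokenize.NAME]


print(code_names(BODY))     # x and z are code; y sits inside a string
print(code_names(SHIFTED))  # y is code; x and z sit inside strings
```

Since neither input produces a lexer error, a lexer on its own has nothing to prefer one reading over the other, which is presumably why the cost would have to be judged with parsing included.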