Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

Don't you think people would have figured it out by themselves the moment programmers started writing parsers? I'm not sure his contribution was particularly needed.




Lots of things get invented / discovered multiple times when it's in the air. But just because Newton (or Leibnitz) existed, doesn't mean Leibnitz (or Newton) were any less visionary.

For your very specific question: have a look at the sorry state of what's called 'regular expressions' many programming languages and libraries to see what programmers left loose can do. (Most of these 'regular expressions' add things like back-references etc that make matching their franken-'xpressions take exponential time in the worst case; but they neglect to put in stuff like intersection or complement of expressions, which are matchable in linear time.


Just checked after reading your comment. Surprisingly to me, AFAs (Alternating Finite Automatons) do let you introduce the Complement operation into Regex while preserving the O(mn) complexity of running NFAs.

That's really subtle, because deciding Regex universality (i.e. whether a regex accepts every input) is PSPACE-COMPLETE. And since NFAs make it efficient to decide whether a regex matches NO inputs, any attempts to combine NFAs with regex Complement would trip on a massive landmine.


Actually, I've now checked more thoroughly, and RE+complement matching is O(n^2 m), which is not that good.

See https://en.wikipedia.org/wiki/Regular_language#Closure_prope...

The complement of a regular language is a regular language, and for any given regular language we can check whether a string is a member of that language in O(length of the string) time.

Yes, depending on how you represent your regular language, the complement operator might not work play nicely with that representation. But eg it's fairly trivial for finite state machines or when matching via Brzozowski derivatives. See https://en.wikipedia.org/wiki/Brzozowski_derivative

See also https://github.com/google/redgrep

Regular languages have a bunch of closure properties. In practice, intersection and complement are really useful. So you can say things like:

[regular_expression_for_matching_email_addresses] & ![my_list_of_naughty_words]

Exercise for the reader: expand the expression above to allow the town of 'Scunthorpe'.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: