I think it's a great idea... if you already know regex. Effectively it's just a ...

hansjorg · on Nov 25, 2020

Rust version of the same library:

https://github.com/VerbalExpressions/RustVerbalExpressions

Implementations for 36 different languages:

http://verbalexpressions.github.io/

dehrmann · on Nov 25, 2020

> I think it's a great idea... if you already know regex

It's actually a bad idea in this case because regex is mostly the same in every modern language, so if you know it, you know it everywhere. What you don't know is this.

I agree with the common complaint that regex is effectively write-only, but the terse syntax is only half the reason. A pattern can be pretty complex on its own, and complex things are hard to understand. Imagine what code matching the behavior of a complex regex would look like.
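
To make that thought experiment concrete, here is a minimal sketch in Python. The pattern (an optionally signed decimal number) and the function name `matches_decimal` are made up for illustration and do not come from the thread. It only suggests how much hand-written code even a short pattern expands into.

```python
import re

# Illustrative pattern: optional sign, digits, optional fractional part.
DECIMAL = re.compile(r"[+-]?[0-9]+(\.[0-9]+)?")


def matches_decimal(text: str) -> bool:
    """Hand-written equivalent of DECIMAL.fullmatch(text) (hypothetical helper)."""
    digits = "0123456789"
    i = 0
    # [+-]?
    if i < len(text) and text[i] in "+-":
        i += 1
    # [0-9]+
    start = i
    while i < len(text) and text[i] in digits:
        i += 1
    if i == start:
        return False
    # (\.[0-9]+)?
    if i < len(text) and text[i] == ".":
        i += 1
        start = i
        while i < len(text) and text[i] in digits:
            i += 1
        if i == start:
            return False
    # fullmatch semantics: nothing may be left over.
    return i == len(text)
```

The one-line pattern and the twenty-odd lines of loop code accept the same strings, which supports the point that part of the difficulty lies in the matching logic itself rather than in the notation.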

simias · on Nov 25, 2020

> It's actually a bad idea in this case because regex is mostly the same in every modern language, so if you know it, you know it everywhere. What you don't know is this.

I disagree; at least in my experience there are significant differences between the regex engines I use regularly. In no particular order: are parens and other operators treated literally by default or do they need to be escaped? Are character classes like '[:alpha:]' understood, or do I need to write them out explicitly? Similarly, do I have access to \w \W \s and friends? Can I use + to mean {1,} ? Can I use '?' to match 0 or 1 (common) or do I have to use = (vim)? Or maybe just {0,1}? But then should I escape the braces? Do I have recursion? Do I have named captures?

Those are not theoretical concerns; that's stuff I routinely end up getting wrong because I forget that a feature that works in pcre does not work in vim, or works differently in sed, etc.
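
A few of these questions can be probed directly for one engine. The sketch below checks Python's standard `re` module only; the helper name `supports` is hypothetical, and the answers would differ for pcre, vim or sed.

```python
import re


def supports(pattern: str) -> bool:
    """Return True if Python's `re` module accepts the pattern syntax."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


# Named captures: Python spells them (?P<name>...).
print(supports(r"(?P<year>[0-9]{4})"))
# The (?<name>...) spelling used by several other engines.
print(supports(r"(?<year>[0-9]{4})"))
# PCRE-style whole-pattern recursion.
print(supports(r"\((?R)*\)"))
# Shorthand classes such as \w, \s and \W.
print(supports(r"\w+\s\W"))
```

On the CPython releases I am aware of, the second and third patterns are rejected while the first and last compile, so even named captures are not portable without checking the spelling.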

dehrmann · on Nov 25, 2020

> are parens and other operators treated literally by default or do they need to be escaped?

> Can I use + to mean {1,} ? Can I use '?' to match 0 or 1 (common) or do I have to use = (vim)? Or maybe just {0,1}? But then should I escape the braces?

I think that's just older tools like vi and sed. Perl, Python, Java, and JavaScript use a similar modern version where + and ? work, and parentheses and braces don't need to be escaped.
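
As a quick check of that claim for one of the listed languages, this sketch uses Python's standard `re` module. It says nothing about vi or sed, and the sample strings are arbitrary.

```python
import re

# + and {1,} are interchangeable, with no escaping needed.
assert re.fullmatch(r"a+", "aaa") and re.fullmatch(r"a{1,}", "aaa")

# ? and {0,1} both mean "zero or one".
for word in ("color", "colour"):
    assert re.fullmatch(r"colou?r", word)
    assert re.fullmatch(r"colou{0,1}r", word)

# Bare parentheses group; escaped ones match literal characters.
assert re.fullmatch(r"(ab)+", "abab").group(1) == "ab"
assert re.fullmatch(r"\(ab\)", "(ab)")
```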

lucb1e · on Nov 25, 2020

> if you know it, you know it everywhere. What you don't know is this.

Right, one language might have anythingBut(" ").endofline() and the next language might have a different . operator like anythingBut(" ")->endofline() or it might even require nesting calls. None of these things are a significant hurdle and if we standardize the names (endofline, anythingBut, ...) then you can make the same argument. It's a chicken and egg argument: just use regex because that works everywhere -> it's not universally implemented -> it won't work everywhere.
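
For illustration, a fluent builder along these lines could look roughly like the Python sketch below. The class `Verbal` and its snake_case method names are hypothetical; this is not the API of any VerbalExpressions port, just a guess at the shape such a wrapper takes on top of the standard `re` module.

```python
import re


class Verbal:
    """Hypothetical fluent builder; not the real VerbalExpressions API."""

    def __init__(self) -> None:
        self._parts = []

    def start_of_line(self) -> "Verbal":
        self._parts.append("^")
        return self

    def then(self, literal: str) -> "Verbal":
        # Escape so the caller never has to think about metacharacters.
        self._parts.append(re.escape(literal))
        return self

    def anything_but(self, chars: str) -> "Verbal":
        # Zero or more characters that are not in `chars` (must be non-empty).
        self._parts.append("[^" + re.escape(chars) + "]*")
        return self

    def end_of_line(self) -> "Verbal":
        self._parts.append("$")
        return self

    def source(self) -> str:
        return "".join(self._parts)

    def compile(self) -> "re.Pattern[str]":
        return re.compile(self.source())


# Usage: a line that contains no spaces.
no_spaces = Verbal().start_of_line().anything_but(" ").end_of_line().compile()
print(bool(no_spaces.match("no_spaces")))
print(bool(no_spaces.match("has spaces")))
```

The builder still produces an ordinary regex string underneath, so the method names are the only part that would need standardizing across languages.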

And aside from that, I have a similar experience to the sibling comment: in some command line tool (I forget which; is it sed? Vim?) the default is that \( is a capture group, whereas in normal regex ( is a capture group. Grep offers you three regex variants to choose from. I have to look up regex syntax or do trial and error every time I use a language other than the one I use daily. And I don't know all of regex to begin with. I just know everything I ever needed, but people posted examples here with (?:x), which I don't know. I once read about it and remembered it for a few days, I think... so anyway, consistent and descriptive method names seem a lot easier, especially when you consider autocompleting IDEs.
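
For reference, (?:x) is a non-capturing group in Perl-style engines: it groups like ( ) but does not record what it matched. A short sketch using Python's standard `re` module, with an arbitrary sample string:

```python
import re

# (x) groups and captures; (?:x) groups without capturing.
capturing = re.fullmatch(r"(ab)+(cd)", "ababcd")
non_capturing = re.fullmatch(r"(?:ab)+(cd)", "ababcd")

print(capturing.groups())      # ('ab', 'cd')
print(non_capturing.groups())  # ('cd',)
```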