Regular expression modes & syntax

Leo · June 13, 2011, 7:37am

I'm open to discussion but so far I don't find any of the arguments compelling. If I disagree with something it's not because I'm close-minded. Far from it. It's because you haven't convinced me.

You can't just assert that one method is easier than the other; you need to back it up with evidence. The evidence you've given so far has backfired, at least in my eyes.

In the other thread you jumped on my regexp not doing something as an example of how the current regexp system was over-complicated. It was only an example of me, in my haste, not considering a case that was implied but not explicitly mentioned in the question. You gave your own version, which didn't work either, and then fixed it with something far more complex than the simple amendment needed to make my original regexp handle the extra case. My example worked using the basic regexp building blocks, while yours required people to use and understand an advanced, additional concept and the syntax for it. If we're really talking about simplifying things for the average user then any argument involving the phrase "negative lookbehind assertions" is not helping your case. That gave me the impression you were asking for what you're used to, and trying to find evidence to back it up whether or not the evidence was really there, rather than looking at the situation objectively.

There's nothing wrong with asking for something because you are used to it, of course. We don't set out to create a program that people find alien. It's just less likely to gain traction unless lots of other people also ask for the same thing; enough people to justify either changing things or adding complexity (to development, testing and end-users) by having two modes for the same thing. And remember that the way it works now is what a lot of other people are used to (not just from Opus but from other programs), if they are used to regexps at all.

I'm also not sure why we have conflated the syntax issue (wanting to be able to use /find/replace/) with the issue of whether or not the expression has to match the whole name. While, so far, I am not personally convinced there is value in changing either of them, they are independent and one could change without the other.

SED-style syntax:

I disagree with the assertion that one method is much easier than the other. The two are about equally easy, and in some ways the SED-style method you're advocating is more complicated, not less. Even in the situations where it is less complicated, it isn't a lot less complicated. I've yet to see an example that justifies adding complexity to the UI by having two methods for the same thing, and adding confusion for regexp learners. That applies especially to the people who feel intimidated by what looks, at first, like a load of random symbols; they aren't helped by joining the find & replace parts with yet more symbols (which also make it hard to see where the separation is, if you even know the expression is meant to have two parts to begin with). If you want to deny that those people exist then you haven't read these forums for very long. We've taught a bunch of people regexps over the years, and also had a bunch who were too intimidated to even try to learn them.

If we can make them less intimidating, great. Or if we can make them significantly more powerful, also great. But all SED-style syntax seems to do is let you push the slash key (three times) instead of the tab key (once), while making the result harder to parse and more error-prone (e.g. if you get the slashes wrong or don't escape things properly, parts of your Find and Replace expressions end up in the other expression).
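To illustrate the escaping problem, here is a rough Python sketch. It is not how Opus or sed parses anything; `naive_split` and `split_sed_style` are hypothetical helpers written just for this post, and I'm assuming the /find/replace/ form with a backslash as the escape character.

```python
def naive_split(expr):
    """Split on every slash and take the first two parts, ignoring escapes."""
    parts = expr.split("/")
    return parts[1], parts[2]


def split_sed_style(expr, delim="/"):
    """Split '/find/replace/' into (find, replace), honouring escaped delimiters."""
    if not expr.startswith(delim):
        raise ValueError("expression must start with the delimiter")
    parts, current = [], []
    i = 1
    while i < len(expr):
        ch = expr[i]
        if ch == "\\" and i + 1 < len(expr):
            nxt = expr[i + 1]
            # An escaped delimiter becomes a literal delimiter;
            # any other escape (e.g. \d) is passed through unchanged.
            current.append(nxt if nxt == delim else ch + nxt)
            i += 2
            continue
        if ch == delim:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
        i += 1
    if current or len(parts) != 2:
        raise ValueError("expected exactly /find/replace/")
    return parts[0], parts[1]


# Intended: find "a/b", replace with "c", but the slash was not escaped.
print(naive_split("/a/b/c/"))         # ('a', 'b'): half of Find landed in Replace
print(split_sed_style(r"/a\/b/c/"))   # ('a/b', 'c') once the slash is escaped
```

With two separate fields there is nothing to split, so this class of mistake can't happen in the first place.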

Matching the whole name:

My argument against changing this boils down to similar themes. It doesn't make things much easier. People who understand regexps can easily add (.*) at the start and end when needed. People who are learning regexps aren't going to be helped by adding complexity, in the form of an additional mode/checkbox that has to be set the right way, to the UI and examples. It also makes that one mode work contrary to how filename wildcards usually work, and makes it easy for people to create an expression that seems to work on some files, and then ends up dropping half the filename on others.

IMO, it is a benefit, not a deficiency, that regexp renaming skips filenames that the 'find' expression does not match entirely. Explicitly typing (.*) every so often is a small price to pay to avoid unexpected mishaps that mangle filenames in a batch rename.
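A small sketch of the difference, using Python's `re` module as a stand-in (Opus's regexp engine and replacement syntax are not identical, and `rename_whole` and `rename_partial` are hypothetical names used only for illustration):

```python
import re


def rename_whole(pattern, replacement, name):
    """Return the new name, or None if the pattern does not match the entire name."""
    m = re.fullmatch(pattern, name)
    return m.expand(replacement) if m else None


def rename_partial(pattern, replacement, name):
    """Substitute wherever the pattern matches; unmatched text is left as it was."""
    return re.sub(pattern, replacement, name)


# Whole-name matching: the extension is not covered, so the file is skipped.
print(rename_whole(r"report_(\d+)", r"\1_report", "report_2011.txt"))

# Adding (.*) at the end opts in to "anything may follow".
print(rename_whole(r"report_(\d+)(.*)", r"\1_report\2", "report_2011.txt"))

# A file the expression was never written for is still skipped...
print(rename_whole(r"report_(\d+)(.*)", r"\1_report\2", "old_report_2011.txt"))

# ...whereas partial matching renames it anyway.
print(rename_partial(r"report_(\d+)", r"\1_report", "old_report_2011.txt"))
```

The last call produces old_2011_report.txt. That may or may not be what you wanted, but in a batch rename you wouldn't find out until after it had happened; with whole-name matching the file is simply left alone.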