Regex - library order title fix

penguinaka · July 28, 2011, 1:49am

Can a regular expression be written to recognize whether something is an article (a, an, the), do the swap if it is, and leave the title alone if it isn't? Or is that not possible?

In the instance of... for example a title ending like:

George RR Martin - Ice 01 - Five Foxes, A Fox Story.epub <--ignored

or other kinds of titles with commas in them

penguinaka · July 28, 2011, 1:52am

Library order swaps apply only to titles that end with:

, a
, an
, the

It might be too complicated, though, because in the example above there is an article after the comma, although it's not the kind that's supposed to be swapped.
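The rule described above ("only swap when the name ends with , a / , an / , the") can be sketched as a single end-anchored test. This is only an illustration in Python, not Opus syntax; the function name `ends_with_article` and the second sample title are made up for the example, and it assumes the article sits directly before the file extension.

```python
import re

# Assumption: the article must be the very last thing in the title,
# directly before the file extension, in any letter case.
TRAILING_ARTICLE = re.compile(r", *(the|an?)\.[^.]+$", re.IGNORECASE)


def ends_with_article(filename: str) -> bool:
    """Return True only when ', a' / ', an' / ', the' closes the title."""
    return TRAILING_ARTICLE.search(filename) is not None


if __name__ == "__main__":
    samples = [
        # From the post above: article follows the comma but does not end the title.
        "George RR Martin - Ice 01 - Five Foxes, A Fox Story.epub",
        # Hypothetical title that does end with an article.
        "George RR Martin - Ice 01 - Five Foxes, The.epub",
    ]
    for name in samples:
        print(ends_with_article(name), name)
```

Because the article has to be followed immediately by the dot of the extension, a title such as "Five Foxes, A Fox Story" is left alone.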

MrC · July 28, 2011, 2:50am

Here's the rub. You have to be able to define precisely, exactly, what is the pattern you want to deal with. Computers don't know "articles", or the entire list of what you consider an article. Computers don't know "library order" - that is a less than well defined set of rules you've learned over time. If you cannot precisely list this complete set of rules, it would be impossible to realize the results you want. So you have to specify the entire list of rules. The less precise you are, the less accurate your results.

Is an article only one sequence of letter characters?
Can it ever be more than one word?
Is its first letter always upper case?
Are there other characters that have to be avoided, or considered?

So, while I might be able to write an expression, what is the point if it is so complicated you cannot modify or understand it?

Again, I'll stress, having looked through your script, and as others have alluded, your approach to this is faulty. You may recall I mentioned in another post how we could reduce about 650 lines down to 1 line with less than 40 characters. I also found you could eliminate 90% of the script easily, and it would be more productive for you in the long run... because you would understand it.

Regular expressions are amazing. They have their strengths and weaknesses. One of their weaknesses is that they cannot read your mind.

michaelkenward · July 28, 2011, 8:31am

[quote="MrC"]Here's a trick that might help.

Build in pieces...[/quote]

Can someone please put this wonderful message somewhere that people will find it in future?

Anyone who has grappled with regex will be immensely grateful to the people who come up here with instant solutions to problems. But without any explanations it is hard to learn what is going on from individual examples.

This message provides a great way of learning and of trying out ideas. It should be cast in stone and put where people can find it easily. And with a subject that will make it easier to find with a search of the forum's archive.

Leo · July 28, 2011, 8:42am

That's what the Rename Scripting forum (and Tutorials forum for non-renaming stuff) is for. Anyone can post there.

penguinaka · July 28, 2011, 2:59pm

I agree, the explanations have helped out tremendously in the learning process. Thanks again for the help!

MrC · July 28, 2011, 6:07pm

Perhaps you might create a link in one of your other threads in the Rename forum to this thread so others may find it.

Leo · July 28, 2011, 6:14pm

I don't think the whole thread is useful, just one or two posts within it.

You can add them (or links to them if you prefer) to new threads (or an existing thread if you prefer) yourself, and then you own the post and can add to or edit it if needed.

penguinaka · July 29, 2011, 6:16am

The solution is:

(?i)(- *)([^-\n]*), *(The|An?)((?: *\([^)\n]*\))?\.)
\1\3 \2\4
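For anyone who wants to try the expression outside Opus, here is a rough sketch that runs the same pattern and replacement through Python's `re` module. Python is only a stand-in here: Directory Opus uses its own regex engine, so behaviour there may differ. The helper name `library_to_natural` and the sample titles other than the "Five Foxes, A Fox Story" one are invented for the example.

```python
import re

# Pattern and replacement copied from the post above.
PATTERN = re.compile(r"(?i)(- *)([^-\n]*), *(The|An?)((?: *\([^)\n]*\))?\.)")
REPLACEMENT = r"\1\3 \2\4"


def library_to_natural(filename: str) -> str:
    """Move a trailing ', The' / ', A' / ', An' in front of the title."""
    return PATTERN.sub(REPLACEMENT, filename, count=1)


if __name__ == "__main__":
    for name in (
        "George RR Martin - Ice 01 - Five Foxes, The.epub",
        "George RR Martin - Ice 01 - Hour, An (2011).epub",
        "George RR Martin - Ice 01 - Five Foxes, A Fox Story.epub",
    ):
        print(library_to_natural(name))
```

The last sample is left untouched because the article is not followed by the extension dot (or by a bracketed group and then the dot).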

thanks for all the help earlier...

Cheers!

penguinaka · July 29, 2011, 7:49am

Actually, the prior one won't work in Opus.

This one will, and it takes into account multiple sets of round brackets:

(.* - )([^-\n]*), *(The|An?)((?: *\([^)\n]*\))*?\.)(\w+)
\1\3 \2\4\5\6

MrC · July 29, 2011, 7:21pm

When will you ever have newlines in the file names?

penguinaka · July 29, 2011, 7:39pm

Explain... not sure what you're getting at?

MrC · July 29, 2011, 8:10pm

The \n items are newline escapes.

(.* - )([^-\n]*), *(The|An?)((?: *\([^)\n]*\))*?\.)(\w+)

I don't believe you'll ever see these in file names. You can remove those from your RE.
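A quick way to convince yourself is to run both versions side by side on a few names. The sketch below uses Python's `re` module as a stand-in for Opus, so treat it as illustrative only. The sample names (other than the "Five Foxes, A Fox Story" one) and the `swap` helper are made up, and the replacement stops at \5 because the pattern defines only five groups and Python rejects references to groups that don't exist.

```python
import re

# The expression as posted, and the same expression with the \n escapes removed.
WITH_NEWLINES = r"(.* - )([^-\n]*), *(The|An?)((?: *\([^)\n]*\))*?\.)(\w+)"
WITHOUT_NEWLINES = r"(.* - )([^-]*), *(The|An?)((?: *\([^)]*\))*?\.)(\w+)"
REPLACEMENT = r"\1\3 \2\4\5"


def swap(pattern: str, filename: str) -> str:
    """Apply one pattern/replacement pair to a single file name."""
    return re.sub(pattern, REPLACEMENT, filename)


if __name__ == "__main__":
    samples = [
        "George RR Martin - Ice 01 - Five Foxes, The (v2) (retail).epub",
        "George RR Martin - Ice 01 - Five Foxes, A Fox Story.epub",
    ]
    for name in samples:
        # File names contain no newlines, so both patterns give the same result.
        assert swap(WITH_NEWLINES, name) == swap(WITHOUT_NEWLINES, name)
        print(swap(WITHOUT_NEWLINES, name))
```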

penguinaka · July 29, 2011, 11:31pm

[quote="MrC"]The \n items are newline escapes.

(.* - )([^-\n]*), *(The|An?)((?: *\([^)\n]*\))*?\.)(\w+)

I don't believe you'll ever see these in file names. You can remove those from your RE.[/quote]

Ah, I see... I'm a copy-paste-tweak kind of guy right now with this stuff, so I just kept throwing different pieces in till it worked, lol.

thanks

krkaem · September 11, 2011, 1:45pm

Is there a way to address more than 9 indices?
I tried to give new names via
\5\6\7\8\3\4\1\2\9\10\11
\9 was accepted, but the higher numbers were not. They were interpreted as 0001.

Leo · September 11, 2011, 2:01pm

I may be wrong, but not as far as I know.

If you are doing something that complex, I'd recommend using the rename script mode to split it up into multiple steps using VBScript / Perl / JavaScript. There's an example in the Rename Scripting forum.
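To give a rough idea of what "multiple steps" could look like, here is a sketch of the same reordering done in code instead of with \1 to \11. It is written in Python purely for illustration and does not use the Opus rename script interface. The " - " separator, the eleven-field layout and the `reorder` function are assumptions, since the actual file names and pattern weren't posted.

```python
# Target order taken from the replacement string \5\6\7\8\3\4\1\2\9\10\11.
ORDER = [5, 6, 7, 8, 3, 4, 1, 2, 9, 10, 11]


def reorder(filename: str, sep: str = " - ") -> str:
    """Split the name into fields, rearrange them, and join them back."""
    # Step 1: separate the extension from the rest of the name.
    stem, dot, ext = filename.rpartition(".")
    # Step 2: split the stem into fields (assumed separator).
    fields = stem.split(sep)
    if len(fields) != len(ORDER):
        return filename  # Unexpected layout: leave the name untouched.
    # Step 3: pick the fields in the new order (ORDER is 1-based).
    return sep.join(fields[i - 1] for i in ORDER) + dot + ext


if __name__ == "__main__":
    name = " - ".join(f"f{i}" for i in range(1, 12)) + ".txt"
    print(reorder(name))
```

Working with a list of fields avoids the single-digit limit on backreferences altogether, and each step can be checked on its own.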