Swap Title, Series, Author.. Leave Version Info

Hi, thanks for the help with the previous problem. Here is another one that I'm working on. I'm working with hundreds of ebook files, and they have to be named in a particular sequence in order to import properly into calibre (an ebook library management tool). I have found it easier to prepare the files before import rather than sort, separate and change the import script for each group.

The correct sequence I'm looking for is:

George RR Martin - Game of Thrones.epub
Martin, George RR - Game of Thrones.epub

Often the title and author are backwards.

Game of Thrones - George RR Martin.epub
Game of Thrones - Martin, George RR.epub

I have created a button and I use this script, which works:

Rename REGEXP PATTERN="(.*) - (.*)(\..*)" TO="\2 - \1\3" AUTORENAME
@nodeselect

MY PROBLEM: often some ebooks have version information which I want to remain in place...

Game of Thrones - George RR Martin (v1.0).epub
or
Game of Thrones - Martin, George RR (v5.1).epub

When I use this script it does this:

George RR Martin (v1.0) - Game of Thrones.epub
or
Martin, George RR (v5.1) - Game of Thrones.epub

whereas this is the result I actually want:

George RR Martin - Game of Thrones (v1.0).epub
or
Martin, George RR - Game of Thrones (v5.1).epub

Sometimes the ebooks also have series information that needs to stay in place.

For those I use this script, which also moves the version info, which I don't want:
(.*) - (.*) - (.*)(\..*)
\3 - \1 - \2\4

What I'm looking for:
From:
Game of Thrones - Ice & Fire 01 - George RR Martin (v1.01).epub
To:
George RR Martin - Ice & Fire 01 - Game of Thrones (v1.01).epub

Any help would be appreciated.

Here's what I came up with. Looks more complex than it is because it's dealing with brackets. :)

Old name: (.*) - ([^(]*)( \(.*\))?(\.[^(.]*)
New name: \2 - \1\3\4

Let's break it down:

[ul][li](.*) - Same as before. Anything up to a dash, to capture the book title.

[/li]
[li]([^(]*) Captures the author name, like before, but instead of matching anything it matches anything until the first (.

[/li]
[li]( \(.*\))? This is easier to read if we replace the literal brackets \( and \) with a and b:

( a.*b)? That matches a space followed by anything surrounded by a and b, and the ? on the end means it's optional. Put the brackets back in and, of course, this is matching stuff like (v1.0) in the names, and since the ? is there it's also happy if that stuff isn't there at all.

[/li]
[li](\.[^(.]*) Finally, this is matching the extension, like you were doing, with the only change that it ensures there is no ( or . in the extension. That way you ensure that ".epub" is the extension, not ".0).epub". This doesn't seem to be needed -- it seems to work okay if you use (\..*) for the extension instead -- but I like to be careful as it avoids surprises when you feed in lots of other filenames that may have unexpected formats. :) [/li][/ul]
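If you want to try the pattern outside of Opus before running it on hundreds of files, here is a minimal Python sketch. It is not how Opus itself works internally; it assumes the rename regex has to match the whole file name (hence `fullmatch`) and that Python's `re` syntax is close enough for this particular pattern. The function name `swap_title_author` is made up for the example.

```python
import re

# The "Old name" pattern, with the asterisks and backslashes written out in full.
PATTERN = re.compile(r"(.*) - ([^(]*)( \(.*\))?(\.[^(.]*)")
# The "New name" string: author, separator, title, optional version, extension.
REPLACEMENT = r"\2 - \1\3\4"


def swap_title_author(name: str) -> str:
    """Return the renamed file name, or the name unchanged if it does not match."""
    match = PATTERN.fullmatch(name)  # assumption: the whole name must match
    if match is None:
        return name
    # An unmatched optional group (no version present) expands to an empty string.
    return match.expand(REPLACEMENT)


if __name__ == "__main__":
    for sample in (
        "Game of Thrones - George RR Martin.epub",
        "Game of Thrones - George RR Martin (v1.0).epub",
        "Game of Thrones - Martin, George RR (v5.1).epub",
    ):
        print(sample, "->", swap_title_author(sample))
```

Names without a " - " separator are returned unchanged, so they are easy to spot in the output.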


You're my new hero, it works like a charm. I've been going bonkers over this for weeks.

I tried it out on a file name with an extra field (series) and I'm not quite clever enough to modify it to handle this situation as well:

Sometimes the ebooks have series information that needs to stay in place:

What I'm looking for:

From:
Game of Thrones - Ice & Fire 01 - George RR Martin (v1.01).epub
To:
George RR Martin - Ice & Fire 01 - Game of Thrones (v1.01).epub

Do you have something for this as well, or a script that can handle both the above and this?

You want the series in the middle, not kept after the title?

i.e. you want this:

George RR Martin - Ice & Fire 01 - Game of Thrones (v5.0).epub

not this:

George RR Martin - Game of Thrones - Ice & Fire 01 (v5.0).epub

?

[quote="leo"]You want the series in the middle, not kept after the title?

i.e. you want this:

George RR Martin - Ice & Fire 01 - Game of Thrones (v5.0).epub

not this:

George RR Martin - Game of Thrones - Ice & Fire 01 (v5.0).epub

?[/quote]

The way I import books, they need to be in one of the following formats:

author - title
author - title (version)
author - series # - title
author - series # - title (version)

Sometimes they are like this and need to be fixed:

title - author
title - author (version)
title - series # - author
title - series # - author (version)

So basically the author and title are swapped, and the version and series info stay in the same place.

I also noticed in your screenshot that you have a command called "dots to spaces except in number". When I'm cleaning up my books I remove all dots and then individually go back and replace the dots in the different types of version strings. I do it in about 25 steps because I had to take into account all the different ways the version can appear. Would that command be applicable to what I'm working with? If so, might I get a copy?

An example: George.R.R.Martin.-.Game.of.thrones.(v1.0).epub

Dots to spaces except in number is here:

Haven't had time to make the regexp handle the series #; maybe someone else reading can step in.
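For anyone who wants to see the idea without the preset: the following is only a rough Python sketch of "dots to spaces except in numbers", not the Opus command referred to above. It assumes a dot should survive only when it sits directly between two digits (as in v1.0) and that the last dot in the name starts the extension. The name `dots_to_spaces` is made up for the example.

```python
import re

# Matches a dot that is NOT between two digits:
# either no digit before it, or no digit after it.
DOT_NOT_IN_NUMBER = re.compile(r"(?<!\d)\.|\.(?!\d)")


def dots_to_spaces(name: str) -> str:
    """Replace dots with spaces, keeping dots inside numbers and the extension dot."""
    stem, dot, ext = name.rpartition(".")
    if not dot:
        return name  # no dot at all, nothing to do
    return DOT_NOT_IN_NUMBER.sub(" ", stem) + dot + ext


if __name__ == "__main__":
    print(dots_to_spaces("George.R.R.Martin.-.Game.of.thrones.(v1.0).epub"))
```

On the example name this gives "George R R Martin - Game of thrones (v1.0).epub", so initials come out separated by spaces; joining them back into "RR" would need a separate step.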

So using Leo's break-down of each of the elements of the regex... you basically just want to apply the same sort of conditional regex he showed for catching the version info (using the ? char at the end of that particular pattern in the expression) to catch it if it is present in the name, but not require it to be there in order for the rest of the regex to be applicable.

So, for a sampling of filenames using different combinations of what you've shared so far... like:

Game of Thrones - George RR Martin.epub
Game of Thrones - George RR Martin (v1.01).epub
Game of Thrones - Ice & Fire 01 - George RR Martin.epub
Game of Thrones - Ice & Fire 01 - George RR Martin (v1.01).epub

This worked:

Old name: ([^-]*) - ([^-]* - )?([^(]*)( \(.*\))?(\.[^(.]*)
New name: \3 - \2\1\4\5

Expanding on Leo's very thorough breakdown:

[ul][li]Title: \1 == [b]([^-]*)[/b] : A little different from before, with the "possibility" of having more than one " - " separator in the name along with this first element still being the book title. This first element should now be made to match specifically up to the first occurrence of "-"... otherwise, you could end up with the previous and less specific version (of [b](.*)[/b]) matching more of the filename than you intend.[/li]
[li]Series #: \2 == ([^-]* - )? : This is a new pattern to catch the possibility of there being a second string with a trailing " - " at the end of it, representing the "Series", before the Author appears... it's using the same conditional ? char that Leo used to catch your possibility of the Version string being there without requiring it to be there, but also includes in the brackets the actual trailing " - " occurrence. If you put that " - " outside of the pattern matched "conditionally" inside the brackets, then whenever you had a name that did NOT include that second "-" char, the results would be skewed and it wouldn't work properly.[/li]
[li]Author: \3 == ([^(]*) : Same as before - just in a different position, it captures the author name, like before, but instead of matching anything it matches anything until the first (.[/li]
[li]Version: \4 == [b]( \(.*\))?[/b] : Same as before, this is easier to read if we replace the literal brackets \( and \) with a and b:
( a.*b)? That matches a space followed by anything surrounded by a and b, and the ? on the end means it's optional. Put the brackets back in and, of course, this is matching stuff like (v1.0) in the names, and since the ? is there it's also happy if that stuff isn't there at all.[/li]
[li]Extension: \5 == [b](\.[^(.]*)[/b] : Finally, (same as before) this is matching the extension, like you were doing, with the only change that it ensures there is no ( or . in the extension. That way you ensure that ".epub" is the extension, not ".0).epub". This doesn't seem to be needed -- it seems to work okay if you use (\..*) for the extension instead -- but I like to be careful as it avoids surprises when you feed in lots of other filenames that may have unexpected formats.[/li][/ul]
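To check the series-aware pattern against the four sample names before letting it loose on a whole library, a small Python sketch like this can help. It is an approximation of what the rename does, under the assumption that the regex has to match the whole file name; `swap_keep_series` is a made-up name for the example.

```python
import re

# Title, optional "Series - ", author, optional " (version)", extension.
PATTERN = re.compile(r"([^-]*) - ([^-]* - )?([^(]*)( \(.*\))?(\.[^(.]*)")
# Author first, then the series (if any), then title, version and extension.
REPLACEMENT = r"\3 - \2\1\4\5"


def swap_keep_series(name: str) -> str:
    """Swap title and author, leaving series and version where they are."""
    match = PATTERN.fullmatch(name)  # assumption: the whole name must match
    if match is None:
        return name  # left unchanged when the pattern does not apply
    # Optional groups that did not take part in the match expand to "".
    return match.expand(REPLACEMENT)


if __name__ == "__main__":
    for sample in (
        "Game of Thrones - George RR Martin.epub",
        "Game of Thrones - George RR Martin (v1.01).epub",
        "Game of Thrones - Ice & Fire 01 - George RR Martin.epub",
        "Game of Thrones - Ice & Fire 01 - George RR Martin (v1.01).epub",
    ):
        print(sample, "->", swap_keep_series(sample))
```

One thing this makes easy to see: because the title and series parts are written as [^-]*, a name with a hyphen inside the title or series (for example "Spider-Man - Some Author.epub") does not match at all and comes back unchanged.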


I love you guys' explanations... I'm still a noob when it comes to regex, but the clear and concise explanations of each step have shed a whole new light on it for me. Thank you for that, and for helping me with the problem.