Power Renaming Files

Hey folks ... This is part Dopus, part regex ...

Here is the scenario. You have dozens of files with COMPLETELY arbitrary names (don't ask - this is for science :slight_smile:) ...

  1. Model.jpg
  2. Training.jpg
  3. Bus.jpg
  4. Family.jpg
  5. Dog.jpg
  6. Ferret.jpg
    ...

Let's chalk it up to someone that doesn't know about database indexing and numbered their pictures in the filename :slight_smile:

Now, what would be desired is to RENAME these guys to take out the goofy numbering and just get back to the filename. So, we'd end up with:

Model.jpg
Training.jpg
Bus.jpg
Family.jpg
Dog.jpg
Ferret.jpg
...

Now, I've used several utilities in the past (namely Magic File Renamer) that would solve this right quick - TRIM LEFT 4 basically. I could even write a little program to do it - but, since I have a Dopus logo tattooed on my arse (ok, that's not true - but I'm one of the big cheerleaders for the product) I should be able to get Dopus' advanced rename tool to do this.

So I set to work at it. I know enough about regex to get myself into trouble, so that was an option too.

In the old Amiga days, we could use ? for character flags - so to do this would be:

Old Filename: ????#.#
NewFilename: #.#

... that would essentially strip off the first 4 chars and give us the desired results.

Now with regex, I can select the files in question using regex, but I cannot seem to reserve the rest of the filename - I could RENAME them to IMAGE [#].JPG, but I'd lose the descriptive filename.

So - I'm at a loss here; I'm not sure I can accomplish this task with Dopus. I don't like it when I can't make Dopus do what I want :slight_smile:

Anyway, any help or direction you guys might have - I'd appreciate it.


:smiley:

See, I knew Dopus could handle that .. THANK YOU!

Now, let's see what I've learned ...

I understand the period matches ANY character so that takes care of the first four - then the .+ indicates 'match any number of characters' - so anything after the first four.

I think it's the grouping that threw me off before.

Does (....) = \1 &
(.+) = \2 ?

So I can select PARTS of filenames to group together? Using the parameters?

This is very cool - thanks for this.
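For anyone who wants to poke at the grouping outside Opus, here is a minimal sketch in Python. It assumes Python's re module groups things the same way for a pattern this simple, and it assumes a four-character prefix such as "01. " (the exact prefix on the original files is a guess). strip_first_four is a made-up helper name, and fullmatch is used on the assumption that the pattern has to cover the whole file name.

```python
import re

def strip_first_four(name):
    # (....) is group 1: exactly four characters of any kind.
    # (.+)   is group 2: everything after them (at least one character).
    m = re.fullmatch(r"(....)(.+)", name)
    if m is None:
        return name        # too short to match, so leave it alone
    return m.group(2)      # the equivalent of \2 in the new-name field

print(strip_first_four("01. Model.jpg"))
```

So yes: each pair of parentheses captures one part of the name, and the new name can reuse whichever parts it wants.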

Here is another RegExp way to skin the same cat and this one has a bit more flexibility in that if your files are mixed in with other files that should not be changed, you can select all the files anyway and this will only modify the file names that qualify.

Old name:
^([0-9]+)\. (.*)

New Name:
\2

Now how it works. Your existing file name pattern begins with a series of digits, followed by a period, followed by a space, followed by the rest of the file name. The above RegExp is based upon that pattern being true for all the files you wish to rename.

^([0-9]+)
Tells DOpus RegExp that starting at the very beginning of each selected file name, look for a series of digits. No alphabet characters, no punctuation, no spaces, only digits.

\.
Tells DOpus that the above found digits must be followed by a period. In RegExp a period normally has a special meaning, so when you want to find an actual period you should escape it with a \

Also notice that right after \. is a space; that is because your examples have a space between the first period and the rest of the file name.

(.*)
Means anything. In other words, it matches anything which follows the space that follows the first period.

The () in the above (.*) make the match a tagged match. A tagged match can be reused as part of the new name. From left to right in the old name box the first (tagged match) is represented by \1 the second (tagged match) by \2 and so on. Because I used two sets of () in the old name, to only use the second tagged match in the new name I used \2 just as Nudel did.

John
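A rough way to try the same idea outside Opus, sketched in Python. The assumptions: Python's re module stands in for the Opus RegExp engine (the two flavours differ, but not for this pattern), the period is escaped as described above, and the file names are made up for illustration. new_name is a hypothetical helper, not an Opus function.

```python
import re

# Digits, an escaped period, a space, then the rest of the name.
OLD_NAME = re.compile(r"^([0-9]+)\. (.*)")

def new_name(name):
    m = OLD_NAME.fullmatch(name)
    if m is None:
        return None        # does not qualify, so it would be skipped
    return m.group(2)      # \2: the second tagged match

for name in ["12. Model.jpg", "Notes.txt", "7.Bus.jpg"]:
    print(name, "->", new_name(name))
```

Only the first name qualifies. The other two either lack the leading digits or lack the space after the period, so they are left alone.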

Another 'old name' pattern:

(.*) (.*)

This will strip everything up to and including the space (the last one, if a name contains several, since .* grabs as much as it can). Also, it has the advantage of looking "slightly" rude :slight_smile:

It's probably worth clarifying the difference between * and +

. means match any character exactly once.

* means match the previous character zero or more times.

Combine the two and you get:

.* means match zero or more characters

In contrast, + means match the previous character one or more times, giving:

.+ means match one or more characters

Some examples:

1234 will fit (....)(.*) but not (....)(.+)

12345, 123456, abcdefg, aaaaaaaa will all match both (....)(.*) and (....)(.+)
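The difference is easy to check by hand. A small sketch in Python, assuming the two patterns being compared are (....)(.*) and (....)(.+), and assuming Python's re module treats * and + the same way Opus does. fits is a made-up helper name.

```python
import re

star = re.compile(r"(....)(.*)")   # four characters, then zero or more
plus = re.compile(r"(....)(.+)")   # four characters, then one or more

def fits(pattern, name):
    # True only if the pattern covers the whole name.
    return pattern.fullmatch(name) is not None

for name in ["1234", "12345", "abcdefg"]:
    print(name, fits(star, name), fits(plus, name))
```

With exactly four characters there is nothing left over for the second group, which is fine for * but not for +.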

Also, John's pattern, which makes sure the filenames start with digits followed by a period and a space, instead of just four arbitrary characters like mine, is the safer choice if you want to avoid renaming other files by mistake.

(Files which don't match the input pattern will just be skipped, but also left selected, so you can apply a different pattern to any that don't fit, until none remain, which can be useful!)

I've got another renaming question: is it possible to do something like this with a single regular expression command?

I've got a filename called [YA1515]File_Name[1T4C16A].zip and what I want to do is a) convert _ to space and b) remove the brackets at the start and end.

At the moment I'm using two commands to do this. The first one is a "Find And Replace" for _ -> space. The second one is for removing the brackets and it goes like this:
"Regular Expressions" (\[.+]) (.+) (\[.+])(\..+) -> \2\4

After these two I end up with File Name.zip.

Now, can I combine these into one command somehow?

Just put both onto the same button and it'll work fine.

rename PATTERN="_" TO=" " FINDREP
rename PATTERN="(\[.+]) (.+) (\[.+])(\..+)" TO="\2\4" REGEXP
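The two steps run one after the other on the same name, which can be sketched in Python like this. Treat it as an illustration only: Python's re module stands in for the Opus engine, two_step_rename is a made-up helper, and the sample name is an assumption. In Python at least, the pattern wants a space right after the first bracket group and right before the second, so the sketch uses a name with underscores in those positions.

```python
import re

def two_step_rename(name):
    # Step 1: plain find-and-replace, "_" becomes a space.
    name = name.replace("_", " ")
    # Step 2: the regex from the post; keep groups 2 and 4 only.
    m = re.fullmatch(r"(\[.+]) (.+) (\[.+])(\..+)", name)
    if m is None:
        return name        # pattern did not match, nothing more to do
    return m.group(2) + m.group(4)

print(two_step_rename("[YA1515]_File_Name_[1T4C16A].zip"))
```

If the regex step does not match, the name keeps the result of the first step and the brackets stay where they are.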

Going back to the original problem, how would you modify John's regex of

Old name:
^([0-9]+)\. (.*)

New Name:
\2

so that it will only match 3 digits at the beginning ?

Also, what if you only wanted to match between 3 and 5 digits at the beginning ?

RegExp does not give you a way (that I know of) to specify a minimum and maximum match value. Therefore you need to make multiple passes with it to remove in this order: 5 then 4 then 3 leading digits.

To match exactly one digit, simply remove the + from the following

[0-9]+

As the + means one or more occurrences of the single thing that precedes it. Because 0-9 is enclosed within square brackets, everything between those brackets is considered to be "one thing" (in this case a range of digits from 0 to 9). So exactly three digits at the beginning of a file name would be represented by

^[0-9][0-9][0-9]

At the end of this message is a button that should do what you want. Note that as requested in the original message of this thread, all the leading digits must be followed by a period and a space for this to work. So the following file names:

  1. Cherries.txt
  2. Raisins.txt
  3. Apples.jpg
  4. Carrots.jpg
  5. Peaches.otl
  6. Pears.bak
  7. Bananas.html

after renaming would become:

  1. Cherries.txt
  2. Raisins.txt
  3. Pears.bak
  4. Bananas.html
    Apples.jpg
    Carrots.jpg
    Peaches.otl

A side note. Interestingly enough, when doing a multiple-pass search and replace with RegExp it does not appear to be necessary to reselect the files between rename passes. Preliminary tests suggest that we can do one "good old fashioned rename" followed by multiple RegExp renames without having to reselect the files.

John

[DOpus.ButtonInfo]
Name=Strip 3 to 5 leading digits
Icon=84,9999999
Flags=6,0,0
Color=0,a0a0a0
Tooltip=Digits must be followed by a . and a space
Func1=Rename PATTERN "^([0-9][0-9][0-9][0-9][0-9])\. (.*)" TO "\2" REGEXP
Func2=Rename PATTERN "^([0-9][0-9][0-9][0-9])\. (.*)" TO "\2" REGEXP
Func3=Rename PATTERN "^([0-9][0-9][0-9])\. (.*)" TO "\2" REGEXP
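The multi-pass idea behind the button can be tried outside Opus too. A minimal sketch in Python, assuming its re module behaves like the Opus engine for these patterns and assuming the period is escaped. The file names in the usage line are invented, and strip_3_to_5_digits is a hypothetical helper.

```python
import re

# Longest first: five digits, then four, then three.
PASSES = [
    r"^([0-9][0-9][0-9][0-9][0-9])\. (.*)",
    r"^([0-9][0-9][0-9][0-9])\. (.*)",
    r"^([0-9][0-9][0-9])\. (.*)",
]

def strip_3_to_5_digits(name):
    # Each pass sees the name as the previous pass left it.
    for pattern in PASSES:
        m = re.fullmatch(pattern, name)
        if m is not None:
            name = m.group(2)   # \2: the part after "digits. "
    return name

for name in ["12. Raisins.txt", "123. Apples.jpg", "123456. Pears.bak"]:
    print(name, "->", strip_3_to_5_digits(name))
```

Names with fewer than three or more than five leading digits fall through all three passes unchanged, because each pattern insists on the period directly after its fixed run of digits.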

Thanks for the explanation John.

[quote]
JohnZeman wrote:
RegExp does not give you a way (that I know of) to specify a minimum and maximum match value. [/quote]

Sure it does. e.g. [0123456789]{1,3}

That will match any one to three digit string. It should match

4
13
146

etc..
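As a quick illustration of the {min,max} syntax, here is a sketch in Python, whose re module does support it. Whether a given program accepts the same syntax depends on its RegExp flavour, so this says nothing about Opus itself. strip_bounded is a made-up helper and the file names are invented.

```python
import re

# {1,3}: between one and three of the preceding item.
one_to_three = re.compile(r"[0123456789]{1,3}")
print([bool(one_to_three.fullmatch(s)) for s in ["4", "13", "146", "1460"]])

# The same idea applied to the rename: 3 to 5 leading digits in one pass.
BOUNDED = re.compile(r"^([0-9]{3,5})\. (.*)")

def strip_bounded(name):
    m = BOUNDED.fullmatch(name)
    return m.group(2) if m else name

print(strip_bounded("1234. Carrots.jpg"))
```

Where the syntax is available, the single bounded pattern replaces the three separate passes.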

I don't think Opus supports the [0123456789]{1,3} syntax; at least, it doesn't seem to work for me.

One problem with RegExps is there's about 50 flavours of them, all slightly different with little extras people have added.

I think Opus uses the popular Henry Spencer regexp implementation. There's a table listing the syntax here (no {} feature):

gpsoft.com.au/manual/WebHelp ... Syntax.htm

I agree. And Ray's suggestion won't work for me either although I expect it works perfectly fine in whatever application he uses which supports that particular RegExp.

It can be a bit of a headache trying to remember the RegExp syntax for each program I use (that supports RegExp). Besides DOpus I use regular expressions in three other programs and each one has its own syntax. I finally had to create a little cheat sheet for myself to remind me which syntax works in which program.

John