Intelligent Renaming Using AI

jimerb · February 9, 2024, 4:36pm

I'd like to make a feature request for Opus to study a set of filenames using AI and rename them to fit a target pattern.

For example, Target Pattern:

**SERIES.yy.mm.dd.Description**

Then a list of messy filenames like this:

Cars.24.02.09.Toyota.jpg
Cars.02.23.24.Honda.jpg
Cars.05 07 2024 GM.jpg
new.24.30.09.Cars.GMC.jpg
new.cars.24.09.31.Honda.jpg
Cars.2024.Feb.7.Honda.jpg

They should get converted to this:

Cars.24.02.09.Toyota.jpg --> Cars.24.02.09.Toyota.jpg
Cars.02.23.24.Honda.jpg --> Cars.24.02.23.Honda.jpg
Cars.05 07 2024 GM.jpg --> Cars.24.05.07.GM.jpg
new.24.30.09.Cars.GMC.jpg --> Cars.24.09.30.GMC.jpg
new.cars.24.10.31.Honda.jpg --> Cars.24.10.31.Honda.jpg
Cars.2024.Feb.7.Honda.jpg --> Cars.24.02.07.Honda.jpg

When I give ChatGPT guidance on how to rename these and then start feeding it filenames, it does a surprisingly good job of understanding what to do. You could use the ChatGPT API so that ChatGPT Pro users could add their API key to Opus and use it to assist with AI-based renames.

INSTRUCTIONS GIVEN TO ChatGPT

I will give you some filenames and I want you to convert them into a format like this:

SERIES.yy.mm.dd.Description.extension

*Be on the lookout for improper dates in the source filenames. For example, if you encounter the number 15 you cannot assume that's a month. When determining the series name, look at the set of filenames as a whole instead of looking at one filename at a time. When you have a hard time understanding which number is the year, choose the year that is more recent and assume the month is the older number. If you see elements in the filename that do not seem to describe what is contained in the file, remove them. For example, the words New or updated.*
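The "15 cannot be a month" rule is mechanical enough to sketch in plain code. This is only a minimal sketch, assuming the year has already been identified and two numbers remain. `orderMonthDay` and `pad2` are hypothetical helpers, not part of Opus, and defaulting to month-first in the ambiguous case is an assumption.

```javascript
// Hypothetical helper: given two numbers that follow a known year,
// decide which one is the month and which one is the day.
function orderMonthDay(a, b) {
  var aCanBeMonth = a >= 1 && a <= 12;
  var bCanBeMonth = b >= 1 && b <= 12;
  if (aCanBeMonth && !bCanBeMonth) return { month: a, day: b };
  if (bCanBeMonth && !aCanBeMonth) return { month: b, day: a };
  // Both fit: assume month-first, but flag it so a human (or the model) can review.
  if (aCanBeMonth && bCanBeMonth) return { month: a, day: b, ambiguous: true };
  return null; // neither value can be a month
}

// Zero-pad a number to two digits.
function pad2(n) {
  return (n < 10 ? '0' : '') + n;
}

// "new.24.30.09.Cars.GMC.jpg": 30 cannot be a month, so it must be the day.
var r = orderMonthDay(30, 9);
var stamp = '24.' + pad2(r.month) + '.' + pad2(r.day); // "24.09.30"
```

The ambiguous case (both numbers 12 or lower) is where a fixed rule runs out and looking at the whole set of names becomes necessary.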

The file rename dialog box would come up with text box for instructions guiding ChatGPT on what to do. These can be saved as favorites. You would see a preview like you do now.

Remember that the series names can be quite messy, so an intelligent agent can take the renames further than a pre-canned script.

Hope you will consider this. Could be a major feature.

Leo · February 9, 2024, 4:52pm

You could use the Rename dialog's Clipboard button/menu (under the preview, at the bottom) to get a list of names, and then take the list ChatGPT generates and feed them back into it via the same thing.

Automating things further is probably possible via a rename script that calls ChatGPT, but I've never looked into that myself.

jimerb · February 9, 2024, 5:23pm

The problem would be that if I get a list back, it would be tricky to link it to the source names. I'm sure with an API you could control it with a key.

It also would be much smoother staying within the interface with a list of common (favorite) rename instructions that you could pick from.

I would think that you could just expand the rename box to have a different mode with a freeform text box. Then you would feed that text, along with a list of the filenames, to ChatGPT, get a list back with the id and new name, and offer it in the preview box. From there, Opus could pick up where it left off and do the rename. If the user puts lame instructions in, they will get lame results.
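The round trip described here could look roughly like the following. It is a minimal sketch: the numbered `id|name` line format and both function names are assumptions for illustration, not an existing Opus or ChatGPT protocol.

```javascript
// Build the text sent to the model: the user's instructions followed by
// one "id|filename" line per file, so each reply line can be traced back.
function buildRenamePrompt(instructions, names) {
  var lines = [instructions, '', 'Reply with one line per file as id|newname.', ''];
  for (var i = 0; i < names.length; i++) {
    lines.push((i + 1) + '|' + names[i]);
  }
  return lines.join('\n');
}

// Parse the reply into a map from original name to proposed name.
// Lines that do not look like "id|name", or whose id is out of range, are skipped.
function parseRenameReply(reply, names) {
  var map = {};
  var lines = reply.split(/\r?\n/);
  for (var i = 0; i < lines.length; i++) {
    var m = /^\s*(\d+)\s*\|\s*(.+?)\s*$/.exec(lines[i]);
    if (!m) continue;
    var idx = parseInt(m[1], 10) - 1;
    if (idx >= 0 && idx < names.length) {
      map[names[idx]] = m[2];
    }
  }
  return map;
}
```

Because the ids travel with the names, the reply can come back in any order, or with extra chatter around it, and still be matched to the source files.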

ChatGPT has also just introduced custom GPTs that are remarkably easy to set up. Even a regular user can make them. But companies are making "companion" GPTs that interface with their applications, so the heavy lifting happens in the cloud rather than having to be figured out by the application. So, for example, you could have a custom GPT that accepts rename instructions and a list of files and returns a structured list in the form the application expects. This way you don't run the risk of getting junk back from a user who puts bad requests in. You would control the protocol.
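Whatever produces the structured list, the application could still check it before showing a preview. A minimal sketch, assuming the proposals arrive as a plain map of original name to new name. `findRenameProblems` and `extOf` are hypothetical names, and the three checks (characters Windows forbids in filenames, changed extension, duplicate targets) are just one reasonable choice.

```javascript
// Return a list of human-readable problems found in a proposed rename map
// (original name -> new name). An empty list means nothing was flagged.
function findRenameProblems(map) {
  var problems = [];
  var seen = {};
  for (var from in map) {
    if (!Object.prototype.hasOwnProperty.call(map, from)) continue;
    var to = map[from];
    // Characters Windows does not allow in a filename.
    if (/[\\\/:*?"<>|]/.test(to)) {
      problems.push(from + ': illegal character in "' + to + '"');
    }
    if (extOf(from) !== extOf(to)) {
      problems.push(from + ': extension changed in "' + to + '"');
    }
    // Compare case-insensitively, since Windows filenames normally are.
    var key = to.toLowerCase();
    if (Object.prototype.hasOwnProperty.call(seen, key)) {
      problems.push(from + ': duplicate target "' + to + '"');
    }
    seen[key] = true;
  }
  return problems;
}

// Lower-cased extension including the dot, or '' if there is none.
function extOf(name) {
  var i = name.lastIndexOf('.');
  return i < 0 ? '' : name.slice(i).toLowerCase();
}
```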

You could imagine that if you have ChatGPT doing this, that it is "aware" that the series name is CARS and Honda is a type of car.

Or if it were music you were renaming it would know that Bruce Springsteen is the artist, no matter where it found those words in the filename.

Leo · February 9, 2024, 5:25pm

It should be in the same order it was originally, which lets you apply it over the existing names, all at once.
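As a rough illustration of relying on order alone, here is a minimal sketch. `pairByOrder` is a hypothetical helper, and refusing to pair when the counts differ is an assumption about what a safe default would be.

```javascript
// Pair original names with returned names purely by position.
// Returns null when the counts differ, since the pairing would be unreliable.
function pairByOrder(oldNames, newNames) {
  if (oldNames.length !== newNames.length) return null;
  var pairs = [];
  for (var i = 0; i < oldNames.length; i++) {
    pairs.push({ from: oldNames[i], to: newNames[i] });
  }
  return pairs;
}
```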

cyilmaz · February 9, 2024, 5:33pm

As much as I'd love to see something fancy like this, and I use ChatGPT every day, I think this would bloat DOpus unnecessarily, putting freeform text fields into renaming, and then probably other areas.

This is, however, right in the territory of scripts, as Leo said. It is doable today via a custom script dialog, with the API key and other options maintained in the script settings and whatnot. Calling URLs via scripting is also possible (e.g. there is an Imgur uploader somewhere on the forum). And it would still be entirely transparent to the user, since OnGetNewName() works wonders already. The main advantage of scripting would be maximum freedom, without waiting for the main DOpus program to be updated. But the final decision is up to the dev team, obviously.

jimerb · February 9, 2024, 5:46pm

cyilmaz

Carefully study the list of files above. Those are easy ones. In reality it's much worse. Trying to make a script that will cover all scenarios is impossible with traditional coding.

It's certainly way, way beyond a typical Opus user.

The beauty here is the app just has to rely on ChatGPT to do the magic you've been experiencing when using it.

It has the uncanny ability to look at the set of filenames "as a whole" and understand the way it should rename like you would do manually.

Leo · February 9, 2024, 5:47pm

We’re not talking about that. We’re saying you can already make a script that talks to ChatGPT.

jimerb · February 9, 2024, 5:51pm

I am the world's biggest fan of Opus. I use it like crazy. And I will always support you by using the latest version.

But a huge amount of the capability of Opus is "over my head." (And my setup is pretty complex.)

That's why bringing these types of things into the interface allows users who aren't experts in all of the under-the-hood functionality to benefit from them. If you can already make a script to talk to ChatGPT, then please do and let us benefit!

Plus, if you're looking for a "Marketing Headline" in 2024 -- saying that you can do AI renames is bound to draw sales.

cyilmaz · February 9, 2024, 6:02pm

Theoretically, with scripting and the API one can do whatever can be done via the web interface, silently in the background. One can open a script dialog with a big text entry field and one output area. Since we have access to an HTTP client (new ActiveXObject('Msxml2.XMLHTTP.6.0')) in JScript, one could feed in any input with as many round trips as needed, then press a "take over to DOpus" button when done, which would in turn pass the results to DOpus within OnGetNewName(). Admittedly, this is not average-level scripting, but it is still doable.

I might give it a shot some time. Unfortunately my Pro/v4 access is from a company team account, so I only have the free $5 limit, which is probably not enough for development. I could write it for another website and let others adjust it to the OpenAI API, or maybe one of the many other fine coders on the forum will take up the challenge. It definitely looks fun!

EDIT: It just occurred to me that I can have ChatGPT write a client for its own API. Maybe I wouldn't need to use up all of my $5 limit. Unfortunately, I am very pressed for time, but here's the basic ES3-compatible script; obviously it needs to be tweaked and extended into a full script:
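The hand-over at the end could be as small as a lookup. A minimal sketch, assuming the object Opus passes to OnGetNewName() exposes `item.name` and `newname` as I understand the scripting documentation to describe. `proposedNames` is a hypothetical map that the dialog step would have filled in beforehand.

```javascript
// Filled in beforehand, e.g. from the model's reply (original name -> new name).
var proposedNames = {
  'Cars.02.23.24.Honda.jpg': 'Cars.24.02.23.Honda.jpg'
};

// Opus calls this once per file during a script rename.
function OnGetNewName(getNewNameData) {
  var original = getNewNameData.item.name;
  if (Object.prototype.hasOwnProperty.call(proposedNames, original)) {
    return proposedNames[original];
  }
  // No proposal for this file: fall back to the name Opus would use anyway.
  return getNewNameData.newname;
}
```

The preview list in the Rename dialog would then show the proposals exactly like any other rename, which is what keeps it transparent to the user.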

// WARNING: Do not embed your API keys directly in client-side code in production applications.
// This example is for educational purposes only.

var apiKey = 'YOUR_API_KEY_HERE'; // Place your OpenAI API key here
var endpoint = 'https://api.openai.com/v1/completions';
var data = JSON.stringify({
  model: "text-davinci-003", // Specify the model you are using
  prompt: "Hello, world!", // Your prompt to the model
  temperature: 0.7,
  max_tokens: 150
});

var xhr = new XMLHttpRequest();
xhr.open('POST', endpoint, true);
xhr.setRequestHeader('Content-Type', 'application/json');
xhr.setRequestHeader('Authorization', 'Bearer ' + apiKey);

xhr.onreadystatechange = function() {
  if (xhr.readyState === 4) { // XMLHttpRequest.DONE
    if (xhr.status === 200) {
      console.log(JSON.parse(xhr.responseText));
    } else {
      console.error('Error:', xhr.statusText);
    }
  }
};

xhr.send(data);

In the provided example, which uses XMLHttpRequest to call the OpenAI API, the data object contains several fields that define how the API request is structured and what kind of response you are expecting. Here's a breakdown of each field within the data object:

model: This specifies the model you want to use for generating text. OpenAI provides a variety of models with different capabilities, sizes, and costs. For example, "text-davinci-003" refers to a specific version of the Davinci model, which is designed to provide more nuanced and comprehensive responses. It's important to choose the model that best fits your use case and budget.
prompt: The prompt field contains the text you want to send to the model. The model will generate a response based on this input. It can be a question, a statement, or any piece of text to which you want the model to respond. The quality and relevance of the model's response can depend significantly on how the prompt is formulated.
temperature: This parameter controls the randomness of the output. A lower temperature (e.g., 0) makes the model's responses more deterministic and predictable, while a higher temperature (e.g., 1) increases the diversity of the responses, making them more random and creative. The temperature setting allows you to balance between coherence and creativity in the model's responses.
max_tokens: This defines the maximum length of the model's response, measured in tokens. A token can be a word or part of a word, so the actual text length can vary. Setting a limit helps control the size of the output and can also affect the cost of the API call, as pricing is often based on the number of tokens generated.
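For renaming specifically, these fields might be set along the following lines. This is a sketch under assumptions: the function names are made up, the `max_tokens` value is a guess, and the response is assumed to carry its text in `choices[0].text`, as the completions endpoint used in the script above does.

```javascript
// Build the JSON body for the request, reusing the fields described above.
function buildRequestBody(prompt) {
  return JSON.stringify({
    model: 'text-davinci-003', // same model as the script above; substitute as needed
    prompt: prompt,
    temperature: 0,            // renames should be repeatable, not creative
    max_tokens: 500            // assumed budget; scale with the number of files
  });
}

// Pull the generated text out of a completions-style response,
// returning null if the shape is not what we expect.
function extractCompletionText(responseText) {
  var parsed;
  try {
    parsed = JSON.parse(responseText);
  } catch (e) {
    return null; // not valid JSON
  }
  if (!parsed || !parsed.choices || !parsed.choices.length) return null;
  var text = parsed.choices[0].text;
  return typeof text === 'string' ? text : null;
}
```

A temperature of 0 is the main design choice here: the same list of filenames should produce the same proposals each time the preview is refreshed.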