URL Decode (convert Percent Encoded names)

Download:

Overview:

This rename preset converts filenames which have been URL Encoded back to their proper names.

URL Encoding, also called Percent Encoding, replaces certain characters (usually ones which are illegal to use in a URL) with %xx escape sequences, where xx is the hex value of the character. This is most often seen with space characters being turned into %20.

For example, this is a URL Encoded string:

A%20Question%3F.txt

If you decode it you get this:

A Question?.txt
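
As a quick illustration, the same decoding can be reproduced with decodeURIComponent, the function the script itself relies on. This is only a minimal sketch: the variable names are made up for the example, and the second call assumes the name was encoded as UTF-8, which is what decodeURIComponent expects.

```javascript
// Decode the example name from above.
var encoded = "A%20Question%3F.txt";
var decoded = decodeURIComponent(encoded); // "A Question?.txt"

// Non-ASCII characters are encoded as one %xx sequence per UTF-8 byte.
var accented = decodeURIComponent("caf%C3%A9.txt"); // "café.txt"
```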

Handling of illegal names:

The example above raises another issue: The decoded string may not be a valid filename. In this case, it contains a ? character, which is not allowed by the Windows file naming rules.

Aside: Windows file naming rules.

Microsoft's developer resources define the Windows file naming rules.

Microsoft frequently break links to their web pages, as they are still getting to grips with how this newfangled World Wide Web thing works, so here is a screenshot of the most important parts of the linked page, in case the link stops working:


To handle decoded names which break the rules, the script replaces illegal characters with underscores, and adds an underscore to reserved names, to keep them legal.

(This can result in more than one file getting the same name, of course, but the Rename dialog gives you tools for handling that, and it's better than generating names which cannot be created or which are a security risk. That's an inherent issue with URL Encoding anyway, as it allows multiple ways to represent the same string.)
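
To make the collision issue concrete, here is a small sketch. The replaceIllegal helper is hypothetical; it only copies the character-replacement step from the script further down so that the example runs on its own.

```javascript
// Hypothetical helper: just the illegal-character step of makeNameLegal.
function replaceIllegal(name)
{
	return name.replace(/[\x00-\x1F<>:"\/\\|?*]/g, '_');
}

// Two different encoded names end up with the same legal name,
// because ? and * are both replaced with an underscore.
var fromQuestion = replaceIllegal(decodeURIComponent("A%3F.txt")); // "A_.txt"
var fromAsterisk = replaceIllegal(decodeURIComponent("A%2A.txt")); // "A_.txt"

// URL Encoding itself allows more than one spelling of the same name:
// %41 is just another way of writing A.
var escapedA = decodeURIComponent("%41.txt"); // "A.txt"
var plainA = decodeURIComponent("A.txt");     // "A.txt"
```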

The script also takes care of some potentially dangerous situations. For example, a URL Encoded name could decode into something like C:\Windows\notepad.exe or ..\..\Directory Opus\dopus.exe. The Rename tool allows you to move files around using full paths, which is very useful at times, but it's unlikely you would want that to happen based on a URL Encoded filename.

A URL Encoded name containing path characters was probably never intended to be turned into a Windows file path. It may, for example, be a string that decodes to a full URL, such as http://www.example.com. Another possibility is a malicious name from someone trying to trick you into moving a file into a place where it could then be executed or change your system's configuration. The script takes care of both.
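
As a rough sketch of why this works: once the path separators and colons are replaced with underscores, the decoded name can no longer point anywhere outside the current folder. The neutralize helper below is hypothetical and only reproduces the decode and character-replacement steps, not the full script.

```javascript
// Hypothetical helper: decode, then replace illegal characters
// (including \ / and :) the same way the script does.
function neutralize(encodedName)
{
	var decoded = decodeURIComponent(encodedName);
	return decoded.replace(/[\x00-\x1F<>:"\/\\|?*]/g, '_');
}

// Decodes to C:\Windows\notepad.exe
var absolutePath = neutralize("C%3A%5CWindows%5Cnotepad.exe");
// absolutePath is "C__Windows_notepad.exe"

// Decodes to ..\..\Directory Opus\dopus.exe
var relativePath = neutralize("..%5C..%5CDirectory%20Opus%5Cdopus.exe");
// relativePath is ".._.._Directory Opus_dopus.exe"

// Decodes to http://www.example.com
var fullUrl = neutralize("http%3A%2F%2Fwww.example.com");
// fullUrl is "http___www.example.com"
```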

Script code:

If you just want to use the script, use the download above. The code is reproduced here to help people browsing the forum for scripting techniques.

The makeNameLegal function makes up the bulk of the script, and is something other scripts may want to re-use. (If you make any improvements to it, please pass them back.)

The actual URL Decoding is handled by decodeURIComponent, which is built into JScript.

function OnGetNewName(getNewNameData)
{
	try
	{
		var dec = decodeURIComponent(getNewNameData.newname);
		return makeNameLegal(dec);
	}
	catch(e)
	{
		return false;
	}
}

function makeNameLegal(name)
{
	// Replace illegal characters with underscores.
	// " is only there twice to keep syntax highlighting happy.
	name = name.replace(/[\x00-\x1F<>:""/\\|?*]/g, '_');

	// If the name ends in a space or a dot, convert it to an underscore.
	// This also handles "." and ".."
	name = name.replace(/[ .]$/, '_');

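	// Add underscore to end of reserved names, including if before an extension.
	// Everything from the first dot onwards is treated as the extension, since
	// Windows treats names like NUL.tar.gz the same as NUL.
	var stem = name.replace(/^([^.]+)\..*$/,'$1');
	var ext = (stem === name) ? "" : name.substr(stem.length);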
	switch(stem.toUpperCase())
	{
	default:
		break;
	case "CON":  case "PRN":  case "AUX":  case "NUL":
	case "COM1": case "COM2": case "COM3": case "COM4": case "COM5":
	case "COM6": case "COM7": case "COM8": case "COM9": case "LPT1":
	case "LPT2": case "LPT3": case "LPT4": case "LPT5": case "LPT6":
	case "LPT7": case "LPT8": case "LPT9":
		name = stem + "_" + ext;
		break;
	}

	// No need to check for empty string; returning "" is like false and cancels the rename.
	return name;
}
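
The try/catch in OnGetNewName is there because decodeURIComponent throws a URIError when a name contains a % that is not part of a valid escape sequence. The sketch below shows that behaviour in isolation; tryDecode is a hypothetical helper written for this example and is not part of the script.

```javascript
// Hypothetical helper mirroring the try/catch in OnGetNewName.
function tryDecode(name)
{
	try
	{
		return decodeURIComponent(name);
	}
	catch(e)
	{
		// Malformed escape sequence: report failure instead of a new name.
		return false;
	}
}

var wellFormed = tryDecode("Report%202017.txt"); // "Report 2017.txt"
var strayPercent = tryDecode("100%.txt");        // false: % is not followed by two hex digits
var badUtf8 = tryDecode("caf%E9.txt");           // false: %E9 on its own is not valid UTF-8
```

Names which already contain a literal % for unrelated reasons therefore fail to decode rather than being mangled.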

History:

  • 02/Sep/2017:
    • Initial version.