EditBOM - Command to add, remove or toggle UTF-8 byte-order-mark

Overview:

This script add-in provides a new EditBOM command which you can use in buttons and hotkeys.

The EditBOM command can add, remove or toggle the UTF-8 BOM (byte order mark) at the start of all selected files, or of a particular file that you specify on the command line.

Important Notes:

  • The script does not check that the file contents are UTF-8 text, or even text at all. It simply adds or removes the 3-byte marker (0xEF, 0xBB, 0xBF) at the start of the file.

    If you run the script on a JPEG, you'll corrupt the JPEG. If you run the script on a (rare) UTF-16 file, you'll get a nonsense file. (The script API provides methods for converting to and from UTF-16 if you need them, but this script does not use those APIs right now.)

    But don't worry too much, as you can just run the script again to undo the change.

  • Files are overwritten in-place. This should usually be fine for what the script does, but it's always possible something will go wrong.

    If the file is important, make a backup first, especially if the file is on a USB drive that is dangling from a frayed USB cable, only attached by the USB connector, out the window of an airplane, on fire, and could disconnect at any moment.

    The script could be expanded to create backups automatically, or to write to temp-files and then rename them into place, but it did not seem necessary given what it does.

  • Maximum file size is 10MB, since working with larger data would be better suited to a more complex script with a progress dialog. (That's possible, but a lot more work and did not seem necessary.)

    You can modify the max size in the script, but do not set it higher than 2147483644 bytes (2147483647 - 3, just under 2GB), due to JScript limitations.
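
To illustrate the first note, here is a minimal sketch of the BOM check and the add/remove steps on plain byte arrays. It is not the script's own code (which works on Opus Blob objects); the names UTF8_BOM, hasUtf8Bom, addBom and removeBom are made up for this example.

```javascript
// The three bytes of the UTF-8 BOM, as listed in the note above.
var UTF8_BOM = [0xEF, 0xBB, 0xBF];

// Hypothetical helper: true if the byte array starts with the UTF-8 BOM.
function hasUtf8Bom(bytes) {
	if (bytes.length < UTF8_BOM.length) {
		return false;
	}
	for (var i = 0; i < UTF8_BOM.length; i++) {
		if (bytes[i] !== UTF8_BOM[i]) {
			return false;
		}
	}
	return true;
}

// Hypothetical helpers: return a new array with the BOM added or removed.
// Neither one looks at what the rest of the data actually is.
function addBom(bytes) {
	return UTF8_BOM.concat(bytes);
}

function removeBom(bytes) {
	return bytes.slice(UTF8_BOM.length);
}

// A JPEG starts with 0xFF 0xD8 0xFF. Adding a BOM pushes that signature
// three bytes along, which is why image viewers will reject the file.
var jpegStart = [0xFF, 0xD8, 0xFF, 0xE0];
var damaged = addBom(jpegStart);
var restored = removeBom(damaged); // Same bytes as jpegStart again.
```

Removing the BOM again gives back the original bytes, which is why running the script a second time undoes the damage.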

Other Notes:

  • The script won't add a BOM if one is already there, nor try to remove one if none exists in the first place.

    So you can use the ADD and REMOVE modes on a bunch of files to get them into a known state without worrying about the states they started in.

  • When working on selected files, successfully processed files are de-selected. Files which fail in some way are left selected.

  • UAC is supported if elevation is needed to overwrite the file.

  • It takes a couple of seconds for the new file size to be reported in the file display. Don't panic if you see it as 0 bytes for a moment.

  • If you turn on the General > Description column, it will say UTF-8 if the file starts with a UTF-8 BOM and (for most other text files) be blank otherwise. (For other types, it will display other data, such as image dimensions and shortcut targets.)
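
The first note above (ADD and REMOVE leave files alone when they are already in the requested state) boils down to a small decision table. The sketch below shows that table as a standalone function; planAction and its "add" / "remove" / "none" results are names invented for this example, not something the script exposes.

```javascript
// Hypothetical helper: decide what to do with one file, given whether
// it currently starts with a BOM and which mode was requested.
// Returns "add", "remove" or "none".
function planAction(hasBom, mode) {
	if (mode === "ADD") {
		return hasBom ? "none" : "add";
	}
	if (mode === "REMOVE") {
		return hasBom ? "remove" : "none";
	}
	if (mode === "TOGGLE") {
		return hasBom ? "remove" : "add";
	}
	throw new Error("Mode must be ADD, REMOVE or TOGGLE.");
}
```

ADD and REMOVE only ever move a file towards one state, so running either of them twice is harmless. TOGGLE always changes the file.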


Installation:

Requires Directory Opus Pro 12.9.2 or above.

  • Download: EditBOM.js.txt (5.7 KB)
  • Open Settings > Preferences / Toolbars / Scripts.
  • Drag EditBOM.js.txt to the list of scripts.

After doing that, you can use the EditBOM command in buttons, hotkeys, menu items, and so on.

Pre-Made Buttons:

Here are the three buttons you are most likely to want.

See How to use buttons and scripts from this forum for how to use the commands or .dcf files. You only need to use one or the other.

  1. Add BOMs to all selected files:
    Command: EditBOM ADD
    Button: Add BOM.dcf (298 Bytes)

  2. Remove BOMs from all selected files:
    Command: EditBOM REMOVE
    Button: Remove BOM.dcf (311 Bytes)

  3. Toggle BOMs in all selected files:
    Command: EditBOM TOGGLE
    Button: Toggle BOM.dcf (309 Bytes)


Command Arguments:

FILE (keyword, default)

  • Specify the path to a file to modify. If you do this, the command works on that file and ignores what's selected in the current folder tab.

    If you do not use the FILE argument, the command works on the selected files in the current folder tab.

    Example 1: EditBOM FILE="C:\My Text File.txt" TOGGLE

    Since FILE is the default argument, you do not need to explicitly use its name; it will pick up any arguments which are not recognized as other command-line switches and keywords.

    Example 2: EditBOM "C:\My Text File.txt" TOGGLE

ADD (switch)

  • The command will add UTF-8 byte order marks to files which don't already have them.

    Example 3: EditBOM ADD

REMOVE (switch)

  • The command will remove UTF-8 byte order marks from files which have them.

    Example 4: EditBOM REMOVE

TOGGLE (switch)

  • The command will add or remove UTF-8 byte order marks, reversing what each file currently has. You can use this if you only want to have one button to toggle back and forth.

    Example 5: EditBOM TOGGLE
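
If you want to run the command from another script rather than a button, you need to build a command line like the ones in the examples above. Here is a rough sketch of a helper that does so. The name buildEditBomCommand is made up for this example, and it assumes the path contains no double-quote characters (which Windows does not allow in file names anyway).

```javascript
// Hypothetical helper: build an EditBOM command line for a single file.
// mode must be "ADD", "REMOVE" or "TOGGLE".
function buildEditBomCommand(filePath, mode) {
	if (mode !== "ADD" && mode !== "REMOVE" && mode !== "TOGGLE") {
		throw new Error("Mode must be ADD, REMOVE or TOGGLE.");
	}
	// Quote the path so that spaces do not split it into several arguments.
	return 'EditBOM FILE="' + filePath + '" ' + mode;
}

// Builds the same line as Example 1:
//   EditBOM FILE="C:\My Text File.txt" TOGGLE
var line = buildEditBomCommand("C:\\My Text File.txt", "TOGGLE");
```

Inside an Opus script, the resulting string could then be passed to a Command object's RunCommand method, in the same way the script code further down runs the Select command.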


History:

  • v1.0 (31/Aug/2018): Initial version.

Script Code:

If you just want to use the script, grab the download from the Installation section above.

The script code is reproduced below so people browsing the forum can find scripting techniques without having to download every script:

function OnInit(initData)
{
	initData.name = "EditBOM";
	initData.version = "1.0";
	initData.copyright = "(c) 2018 Leo Davidson";
	initData.url = "https://resource.dopus.com/t/editbom-command-to-add-remove-or-toggle-utf-8-byte-order-mark/29786";
	initData.desc = "Add or remove the UTF-8 byte-order-mark from files.";
	initData.default_enable = true;
	initData.min_version = "12.9.2";

	var cmd = initData.AddCommand();
	cmd.name = "EditBOM";
	cmd.method = "OnEditBOM";
	cmd.desc = "Add or remove the UTF-8 byte-order-mark from files.";
	cmd.label = "EditBOM";
	cmd.template = "FILE,ADD/S,REMOVE/S,TOGGLE/S";
	cmd.hide = false;
	cmd.icon = "edit";
}

// Helper for nicer loops over Opus collections.
function forEach(col, func)
{
	for (var e = new Enumerator(col); !e.atEnd(); e.moveNext())
	{
		if (!func(e.item()))
		{
			break;
		}
	}
}

// Implement the EditBOM command
function OnEditBOM(scriptCmdData)
{
	var cmd = scriptCmdData.func.Command;
	cmd.deselect = false;

	var filePath = scriptCmdData.func.args.file;
	var isAborted = false;

	if (!scriptCmdData.func.args.add
	&&  !scriptCmdData.func.args.remove
	&&  !scriptCmdData.func.args.toggle)
	{
		scriptCmdData.func.dlg.Request("You must specify ADD, REMOVE or TOGGLE.", "OK", "EditBOM");
		isAborted = true;
	}
	else
	{
		if (filePath)
		{
			if (!ProcessFile(scriptCmdData.func, filePath, false))
			{
				isAborted = true;
			}
		}
		else
		{
			forEach(scriptCmdData.func.sourcetab.selected_files, function(file)
			{
				if (!ProcessFile(scriptCmdData.func, file.RealPath, true))
				{
					isAborted = true;
					return false; // Break forEach loop.
				}
				return true; // Continue forEach loop.
			});
		}
	}

	// Results of Command functions are reversed from normal, because returning nothing counts as success.
	// True here means the command failed, and Opus should not run further commands.
	// False (or no return at all) means the command succeeded, and Opus should continue with any other commands.
	if (isAborted)
		return true;
	return false;
}

function ProcessFile(func, filePath, doDeselect)
{
	if (!DOpus.FSUtil.Exists(filePath))
	{
		return (func.dlg.Request("File does not exist:\n" + filePath, "&Skip|&Abort", "EditBOM") == 1);
	}

	var blobBOM = DOpus.Create.Blob(0xEF,0xBB,0xBF);
	var blobExisting = DOpus.Create.Blob();

	var fileRead = DOpus.FSUtil.OpenFile(filePath, "r", func.sourcetab);

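	// Note whether the open worked before reading, so we know whether there is a handle to close.
	var isOpen = (fileRead.error == 0);
	var readOk = (isOpen
	&&	(fileRead.size.Compare(0) == 0 || fileRead.Read(blobExisting) > 0));

	// Close the file whether or not the read worked, so a failed read does not leave it open.
	if (isOpen)
	{
		fileRead.Close();
	}

	if (!readOk)
	{
		return (func.dlg.Request("Could not read file:\n" + filePath, "&Skip|&Abort", "EditBOM") == 1);
	}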

	// ActiveScripting limits the normal integer size to 2147483647 (2GB).
	// The Opus FileSize objects can handle more, but it makes life difficult since we cannot
	// easily pass things to them to compare sizes. Since a 2GB text file is ridiculous, we
	// simply reject such things here.
	//
	//		var maxSize = 2147483647 - 3; // -3 because we need space to add the BOM.
	//
	// In fact, we'll set the maximum much smaller, as the script as-is does not show a progress dialog
	// etc. which you would really want for converting larger files. Adding a progress dialog is
	// totally possible, but then you'll probably also want the ability to skip/abort in the middle
	// of writing a large file, in which case it would make sense to use temp-files instead.
	// (The script as it is simply overwrites the old files, which should be fine for small ones
	// but is less sensible for large amounts of data.)

	var maxSize = 10485760; // 10MB max size, for now.

	if (blobExisting.size.Compare(maxSize) > 0)
	{
		return (func.dlg.Request("File is too large:\n" + filePath, "&Skip|&Abort", "EditBOM") == 1);
	}

	var hasBOM = false;

	if (blobExisting.size.Compare( blobBOM.size ) >= 0
	&&	blobExisting.Compare( blobBOM, 0, 0, blobBOM.size ) == 0)
	{
		hasBOM = true;
	}

	if (( hasBOM && func.args.add)
	||  (!hasBOM && func.args.remove))
	{
		// No-Op. Just de-select it.
		if (doDeselect)
		{
			DeselectFile(func, filePath);
		}
		return true;
	}

	var addMode = false;

	if (func.args.add
	||	(func.args.toggle && !hasBOM))
	{
		addMode = true;
	}

	var errMsg = null;
	var fileWrite = null;

	// Retry Loop.
	while (true)
	{
		if (fileWrite)
		{
			fileWrite.Close();
			fileWrite = null;
		}

		if (errMsg)
		{
			var dlgRes = func.dlg.Request(errMsg, "&Retry|&Skip|&Abort", "EditBOM");
			if (dlgRes == 0)
				return false; // Abort
			if (dlgRes == 2)
				return true; // Skip
			// Else, dlgRes == 1, retry.
		}

		fileWrite = DOpus.FSUtil.OpenFile(filePath, "wa", func.sourcetab);
		if (fileWrite.error != 0)
		{
			errMsg = "Error opening file for writing (" + fileWrite.error + "):\n" + filePath;
			continue;
		}

		var writtenSize;

		if (addMode)
		{
			writtenSize = fileWrite.Write(blobBOM);
			if (blobBOM.size.Compare( writtenSize ) != 0)
			{
				errMsg = "Error writing to file:\n" + filePath;
				continue;
			}
		}

		var offsetBytes = (addMode ? 0 : 3);

		// Skip the write entirely if there's nothing to write.
		// (Faster, but also avoids an issue in Opus 12.9.x.)
		if (blobExisting.size.Compare(offsetBytes) > 0)
		{
			writtenSize = fileWrite.Write(blobExisting, offsetBytes);

			if (blobExisting.size.Compare( writtenSize + offsetBytes ) != 0)
			{
				errMsg = "Error writing to file:\n" + filePath;
				continue;
			}
		}

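		// Close the file explicitly instead of leaving it open until the script object is released.
		fileWrite.Close();
		fileWrite = null;

		if (doDeselect)
		{
			DeselectFile(func, filePath);
		}
		return true; // Success for this file.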
	}
}

function DeselectFile(func, filePath)
{
	var cmd = func.Command;
	cmd.ClearFiles(); // Just in case.
	cmd.AddFile(filePath);
	cmd.RunCommand("Select DESELECT FROMSCRIPT");
	cmd.ClearFiles();
}