MSDos type function env-var desperation

Good morning! o)

I'm struggling to get this simple msdos type button to work as expected.

@externalonly 
@leavedoswindowopen 
//@nofilenamequoting 
//@noexpandenv 

set TESSDATA_PREFIX=D:\bin\gfx\Tesseract-OCR
"%TESSDATA_PREFIX%\tesseract.exe" "<path>" blabla
echo "%TESSDATA_PREFIX%\tesseract.exe"

The batch file DO creates looks like this and it is obviously faulty in certain spots.

..
set TESSDATA_PREFIX=D:\bin\gfx\Tesseract-OCR
%%TESSDATA_PREFIX%%\tesseract.exe "<path>" blabla
echo "%TESSDATA_PREFIX%\tesseract.exe"

Why are there two % signs around the env TESSDATA_PREFIX I set?
And why does it only happen at the first occurrence of this env var and not on the line after?
Where have the double quotes gone that I put around the path to the executable?

I tried different modifiers to get the result I want, but none of them seems to help; at best they bring up other problems. This is meant for a context menu entry in a filetype group. I put nearly an hour of trial and error into this and am giving up now, please enlighten me.

Thank you! o)

ps:
Who needs to OCR text from an image anyway, you might ask? Well, I have a screenshot of a command editor containing code which got lost because DO crashed while customizing (no crash dump). We have not been good friends these last days, DO and me, but it is going to get better again, I'm sure. o)

(Note: None of this applies to normal buttons; only DOS Batch type buttons.)

% symbols have to be doubled-up in batch files to stop them being interpreted as arguments to the batch file.

Unfortunately, % is also a valid character in file paths, and is also used for expanding env-vars, and MS-DOS was a ridiculous mess which haunts us to this day. :)

When Opus writes the batch file it automatically expands env-vars and doubles % characters for program paths (the first quoted thing on each line), but not for arguments. (That won't work with env-vars which are set in the batch file itself, however, as in your case.)
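To make the doubling rule concrete, here is a minimal JavaScript sketch. It is not Opus's actual code: `doublePercents` and `escapeProgramPath` are hypothetical names, and "the first quoted thing on each line" is simplified to "a quoted string at the very start of the line". The env-var expansion step is left out.

```javascript
// Hypothetical helper: double every % so cmd.exe treats it as a literal
// character inside a .bat file instead of the start of %1 or %VAR%.
function doublePercents(text) {
  return text.replace(/%/g, "%%");
}

// Hypothetical helper: apply the doubling only to a leading quoted
// program path and leave the rest of the line (the arguments) untouched.
function escapeProgramPath(line) {
  var match = /^"([^"]*)"(.*)$/.exec(line);
  if (!match) return line; // no leading quoted path: leave the line alone
  return '"' + doublePercents(match[1]) + '"' + match[2];
}
```

With this sketch, a line starting with a quoted program path gets its % signs doubled, while a line starting with `echo` is left as it was, which matches the difference between the second and third lines of the generated batch file above.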

We used to double % characters in arguments as well, but stopped doing that for some reason, which we're looking into. It may be because of exactly what you're seeing, which would normally only affect arguments, as program paths rarely use self-set env-vars. (OTOH, it looks like the change to not double % characters in arguments has broken commands like notepad.exe {filepath} when run from DOS Batch type buttons against file paths with % in their names.)

Sometimes doubling % in .bat files fixes things; sometimes it breaks things. (The real fix is to avoid .bat files entirely, to be honest. The whole thing is a mess and poorly designed even for something made in the 70s. ^ characters also introduce more complications.)

For what you're doing there, if you want to avoid repeating the same path in lots of places, using an alias and {alias|xyz} may be a good alternative.

n.b. Looking into this some more, cmd.exe itself has similar problems if you try to set and use an env-var in commands sent to it via the /K or /C arguments. It has a "delayed expansion" mode to help there, but that just makes things even more complicated and introduces problems with ! characters instead.
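The reason is the order of expansion: %VAR% is replaced when the line is parsed, before any command on it has run, while !VAR! (with delayed expansion on) is replaced just before each command executes. The following JavaScript toy model only mimics that ordering. `expand` and `runLine` are hypothetical names, it understands nothing but `set` and `echo` joined by `&`, and it is in no way a real cmd.exe parser.

```javascript
// Replace marker-delimited names (%X% or !X!) that exist in env.
// Unknown names are left as typed, as on the cmd.exe command line.
function expand(text, env, marker) {
  var pattern = new RegExp(marker + "([^" + marker + "]+)" + marker, "g");
  return text.replace(pattern, function (whole, name) {
    return env.hasOwnProperty(name) ? env[name] : whole;
  });
}

// Toy model of running one command line; returns what "echo" would print.
function runLine(line, env, delayed) {
  var output = [];
  // Phase 1: %VAR% is expanded once for the whole line, before anything runs.
  var commands = expand(line, env, "%").split("&");
  for (var i = 0; i < commands.length; i++) {
    // Phase 2: !VAR! is expanded per command, right before it executes.
    var command = delayed ? expand(commands[i], env, "!") : commands[i];
    var setMatch = /^set ([^=]+)=(.*)$/.exec(command);
    if (setMatch) env[setMatch[1]] = setMatch[2];
    else if (command.indexOf("echo ") === 0) output.push(command.substring(5));
  }
  return output;
}
```

In this model, `runLine("set X=1&echo %X%", {}, false)` yields `["%X%"]` because X did not exist yet when the line was expanded, whereas `runLine("set X=1&echo !X!", {}, true)` yields `["1"]`.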

^ and & characters can also cause problems, and the way to escape ^ % & ! characters (all of which are valid in file paths) as well as whether they need escaping varies by context. Sometimes there actually isn't a way to escape them.
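Since escaping is so context-dependent, about the only safe thing a script can do is detect these characters before handing a path to a batch file. A minimal JavaScript sketch; `findRiskyBatchChars` is a hypothetical name, and the list only covers the four characters mentioned above, not everything cmd.exe treats specially.

```javascript
// Hypothetical helper: return the distinct characters in a path that are
// known to cause trouble in batch files, in order of first appearance.
function findRiskyBatchChars(path) {
  var risky = "^%&!";
  var found = "";
  for (var i = 0; i < path.length; i++) {
    var ch = path.charAt(i);
    // Record each risky character only once.
    if (risky.indexOf(ch) !== -1 && found.indexOf(ch) === -1) found += ch;
  }
  return found;
}
```

An empty result means none of the four characters is present; anything else is a hint to avoid the batch route for that file or to warn the user.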

All the stuff left over from MS-DOS is still quite a mess in Windows. My advice is to avoid DOS-Batch as much as possible. In terms of Opus, using normal (non-DOS-Batch) buttons and aliases, or script buttons if needed, is where I would head.

Thanks, I eventually remembered the dos command "setlocal enabledelayedexpansion", which enables the use of "!" to expand environment variables. It's not about avoiding repeating a path here, I just cannot leave out setting the environment variable since tesseract.exe requires it to run. Unfortunately, the "!" workaround still does not help against the dquotes getting removed. So I tricked DO another time and also put the path to the executable in another environment variable. The whole hack looks like this:

REM TESSDATA_PREFIX is required for tesseract to find its folder
set TESSDATA_PREFIX=D:\bin\gfx\Tesseract-OCR
REM putting the executable path in another var to prevent DO from stripping the dquotes
set RUNTHIS="!TESSDATA_PREFIX!\tesseract.exe"
!RUNTHIS! "<path>" blabla
echo 1 !RUNTHIS!
echo 2 "%TESSDATA_PREFIX%\tesseract.exe"

Output:

Tesseract Open Source OCR Engine v3.05.00dev with Leptonica
1 "D:\bin\gfx\Tesseract-OCR\tesseract.exe"
2 "D:\bin\gfx\Tesseract-OCR\tesseract.exe"

A batch file itself has its weirdness, for sure, but what DO does here does not really seem to help. Maybe throw in another modifier, some kind of "@donottouchthebatch", so we can rely on the things we know should work, without any magic mixed in that is not always required.
Sigh! Thanks for explaining nonetheless. o)

Seems I missed your last post while writing mine.

Yes, of course. But the batch and old-school buttons of DO, with the handy auto-resolving and |noext modifiers, make it really quick and easy to accomplish certain things. Look at this script I wrote just to merge selected PDF files with pdftk.exe and also merge the txt files with the same basename.
What a beast for such a simple task when done in a script, which does not offer {allfilepath}. New difficulties come up here as well, so whenever the older buttons do the job, I'd try them first. Looking forward to your announcement about running PowerShell directly from a button; not sure what kind of placeholders will be supported, if any, but that could actually fill a gap. Not that I am a PowerShell fanboy, no way, but you can have some bright moments with it. o)

function OnClick(data) {
	var cmd = data.func.command, dlg = DOpus.Dlg, tab = data.func.sourcetab, selFiles = tab.selected_files;
	cmd.deselect = false; cmd.SetModifier('runmode', 'hide'); var shell = new ActiveXObject("WScript.Shell");
	if (selFiles.count<2) return;
	
	var input = dlg.GetString("Enter new *basename* for merged PDF file (will be overwritten if exists):", selFiles(0).name_stem+"_merged", 255,
	"Ok|Cancel", "Merge PDF", data.func.sourcetab.lister);
	if (!input) return;

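	// Strip one trailing slash or backslash (a string pattern would be matched literally, so use a regex)
	var path = (selFiles(0).path+"").replace(/[\/\\]$/, "");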
	var pdftkPath = DOpus.FSUtil.Resolve('/pdftoolkit\\bin\\pdftk.exe');
	var pdfFilePaths = PrepItems(selFiles, "realpath").join(" ");
	cmd.RunCommand('"'+pdftkPath+'" '+pdfFilePaths+' cat output "'+path+'\\'+input+'.pdf"');

	var mergeTextFiles = true;
	var textFilePaths = PrepItems(selFiles, "realpath", "", "", false);
	
	for(var i=0;i<textFilePaths.length;i++){
		textFilePaths[i] = textFilePaths[i].replace(/\.pdf$/i,".txt"); // regex literal keeps the dot escaped
		if (!DOpus.FSUtil.Exists(textFilePaths[i])) { mergeTextFiles = false; break;}
		textFilePaths[i] = "'"+textFilePaths[i]+"'";
		DOpus.Output(textFilePaths[i]);
	}

	var cmdLine = "powershell.exe -noexit -command \"&{$f = @("+textFilePaths.join(",")+"); $dest='"+path+"\\"+input+".txt'; $v='';"+
		"for($i=0;$i -lt $f.count;$i++) {$v+=gc -lit $f[$i] -en utf8 -raw;$v+=[char]13+[char]10;} sc -lit $dest -val $v -enc utf8;}\"";
	if (mergeTextFiles) shell.Run(cmdLine, 0, true);
	else DOpus.Output("Skipping merge of txt sidecar files (no full set found).");
}

function PrepItems(items, property, prefix, suffix, addQuotes){
	function PrepParam(p, defValue){ return typeof p==='undefined'?defValue:p;}
	property = PrepParam(property,"name"); prefix = PrepParam(prefix,"");
	suffix = PrepParam(suffix,""); addQuotes = PrepParam(addQuotes,true);
	for(var i=0,names=[];i<items.count;i++)
		names[names.length] = (addQuotes?'"':'')+prefix+items(i)[property]+suffix+(addQuotes?'"':'');
	return names;
}