[BUG] dopusrt /col /import ignores lines in UTF8 without BOM

Importing a UTF-8 file without a BOM (using dopusrt /col /import ...) adds only the lines that contain no non-ASCII characters to the collection; lines containing Unicode characters are skipped.
Manually adding the UTF-8 BOM (0xEF 0xBB 0xBF) at the beginning of the input file fixes this, and every file is then added properly.
The Windows console (%comspec%) does not add a BOM when piping the output of a CLI application, so this is an inconvenience: it requires an extra step of adding an otherwise useless 3-byte header with a third-party application. The encoding switches work only for export, not for import.
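
Until import handles this itself, the extra step can be scripted instead of using a third-party application. The following is only a minimal sketch of the workaround described above: it prepends the 3-byte UTF-8 BOM to a file if it is not already there. The function name `ensure_utf8_bom` and the file name in the usage note are made up for illustration and are not part of Opus.

```python
import codecs
from pathlib import Path


def ensure_utf8_bom(path):
    """Prepend the UTF-8 BOM (EF BB BF) to the file unless it already has one.

    Returns True if the file was modified, False if a BOM was already present.
    The content is treated as raw bytes and is assumed to be UTF-8 already;
    nothing is re-encoded.
    """
    p = Path(path)
    data = p.read_bytes()
    if data.startswith(codecs.BOM_UTF8):
        return False
    p.write_bytes(codecs.BOM_UTF8 + data)
    return True
```

For example, calling `ensure_utf8_bom("filelist.txt")` (a hypothetical file name) on the piped output before passing that file to dopusrt /col /import ... should have the same effect as adding the BOM by hand.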

This could be easily resolved if the dopus parser would either:

  1. assume by default that any file is UTF-8 and decode it accordingly, without requiring an explicit byte-order mark
  2. allow encoding switches for /col /import, such as /utf8 and /utf8bom
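
To make option 1 concrete, here is a rough sketch of what BOM-less detection could look like. It is purely illustrative and says nothing about how the Opus parser actually works: strip a BOM if present, otherwise try strict UTF-8, and only fall back to a legacy codepage if that fails. The function name `decode_list_file` and the `cp1252` fallback are assumptions chosen for the example.

```python
import codecs


def decode_list_file(raw, fallback="cp1252"):
    """Decode the raw bytes of a file list without requiring a BOM.

    1. If a UTF-8 BOM is present, strip it and decode as UTF-8.
    2. Otherwise try strict UTF-8, which rejects most non-UTF-8 input.
    3. Otherwise fall back to a legacy codepage (assumed here: cp1252).
    """
    if raw.startswith(codecs.BOM_UTF8):
        return raw[len(codecs.BOM_UTF8):].decode("utf-8")
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Bytes that are undefined in the fallback codepage are replaced
        # rather than raising a second error.
        return raw.decode(fallback, errors="replace")
```

The weakness of this approach is that some legacy-codepage files happen to be valid UTF-8 as well and would be misread, which is one reason an explicit switch (option 2) is less ambiguous.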

Opus version: 10.0.3.1.4402.x64

I see what you mean.

Option 2 seems best to me, since text files on Windows are in the OEM/ANSI codepage by default, and since we already have similar arguments for the export command.

We've written & tested option 2.

Not sure if it will be in the next release, or in the beta after that, but you should be able to use it fairly soon.

I'm glad it will be addressed. Thank you leo and GP-s for the swift reaction. Looking forward to the upcoming release.