Hello diegocr
It seemed that my previous posts were not enough to explain the issue, so I picked a detailed response to another reporter from my email archive (I have received quite a few "bug reports" on this topic).
So the example I mentioned with the COM marker is not related to your image; it was another case.
The other examples in the previous posts here were Adobe-generated files, and I first had to use JPEGsnoop to find the positions (offsets) of the SOF and SOS markers, because there was plenty of Adobe marker data to skip. With these offsets I could go to the hex editor and make the edit.
So yes, it is good if all the marker baggage is stripped off; then you can easily find the essential markers at the beginning of the file, as in your case.
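For readers without JPEGsnoop at hand, the offset lookup can be sketched in a few lines of C. This is only a minimal illustration, not what JPEGsnoop does internally: `find_marker` is a hypothetical helper that is not part of libjpeg, and it assumes every segment between SOI and SOS carries a two-byte length field (standalone markers such as RSTn are not handled).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of libjpeg or JPEGsnoop.
 * Walks the marker segments of a JPEG byte buffer and returns the
 * offset of the first marker with code `wanted` (e.g. 0xC0 for SOF0,
 * 0xDA for SOS), or -1 if it is not found in the header part. */
long find_marker(const uint8_t *buf, size_t len, uint8_t wanted)
{
    size_t pos = 2;                       /* first byte after SOI (FF D8) */

    if (len < 2 || buf[0] != 0xFF || buf[1] != 0xD8)
        return -1;                        /* not a JPEG stream */

    while (pos + 4 <= len) {
        if (buf[pos] != 0xFF)
            return -1;                    /* lost marker synchronization */

        uint8_t code = buf[pos + 1];
        if (code == 0xFF) {               /* fill byte before a marker */
            pos++;
            continue;
        }
        if (code == wanted)
            return (long)pos;
        if (code == 0xDA)
            return -1;                    /* entropy-coded data follows SOS */

        /* Segment length is big-endian and counts its own two bytes. */
        size_t seglen = ((size_t)buf[pos + 2] << 8) | buf[pos + 3];
        if (seglen < 2)
            return -1;                    /* corrupt length field */
        pos += 2 + seglen;                /* skip marker plus segment */
    }
    return -1;
}
```

Calling it with 0xC0 and then 0xDA on the file contents gives the two offsets to jump to in the hex editor; with a lot of APPn data in front, this skipping is exactly the tedious part.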
I didn't check your image at first because you said it came from Python.
I'm not familiar with Python, but I had reports regarding this issue from Python users.
So I learned that Python has integrated libjpeg and can even use different libjpeg versions.
The reported problem was with reading externally generated files using newer libjpeg versions, which is understandable given the explanation above. There were no problems reading JPEG files generated by Python itself.
I have now checked your file and it looks perfectly OK to me.
As mentioned, any libjpeg version, old and new, should write correct CMYK files.
I still have Opus 13.5.1 running, because I hesitate to install 13.5.2, which they say "fixes" something in this regard. As I said, I see nothing that could be fixed here; the file is OK as it is. Again: There is nothing to fix here. Any "fix" here could only make it worse. Only Adobe can fix it.
Here is your file with djpeg:
F:\>djpeg -v 00031-3170241409.jpg nul
Independent JPEG Group's DJPEG, version 9f 14-Jan-2024
Copyright (C) 2024, Thomas G. Lane, Guido Vollbeding
Start of Image
Adobe APP14 marker: version 100, flags 0x0000 0x0000, transform 0
Define Quantization Table 0 precision 0
Define Quantization Table 1 precision 0
Start Of Frame 0xc0: width=576, height=768, components=4
Component 67: 1hx1v q=0
Component 77: 1hx1v q=1
Component 89: 1hx1v q=1
Component 75: 1hx1v q=1
Define Huffman Table 0x00
Define Huffman Table 0x10
Start Of Scan: 4 components
Component 67: dc=0 ac=0
Component 77: dc=0 ac=0
Component 89: dc=0 ac=0
Component 75: dc=0 ac=0
Ss=0, Se=63, Ah=0, Al=0
PPM output must be grayscale or RGB
As you can see, the custom Adobe marker is still there. It is unnecessary here but may make Adobe happy.
This was my concern: If some transcoding process loses the marker, you will still have all the essential information for proper image reconstruction in the file. The SOF marker is essential for image reconstruction anyway, so it is a better place for colorspace identification.
Here is the corresponding source code from the function default_decompress_parms() in the file jdapimin.c. It uses the hex values of the component identifiers: decimal 67, 77, 89, 75 in the djpeg output above are 0x43, 0x4D, 0x59, 0x4B.
  case 4:
    cid0 = cinfo->comp_info[0].component_id;
    cid1 = cinfo->comp_info[1].component_id;
    cid2 = cinfo->comp_info[2].component_id;
    cid3 = cinfo->comp_info[3].component_id;

    /* For robust detection of standard colorspaces
     * regardless of the presence of special markers,
     * check component IDs from SOF marker first.
     */
    if (cid0 == 0x01 && cid1 == 0x02 && cid2 == 0x03 && cid3 == 0x04)
      cinfo->jpeg_color_space = JCS_YCCK;
    else if (cid0 == 0x43 && cid1 == 0x4D && cid2 == 0x59 && cid3 == 0x4B)
      cinfo->jpeg_color_space = JCS_CMYK; /* ASCII 'C', 'M', 'Y', 'K' */
    else if (cinfo->saw_Adobe_marker) {
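
If you want to try the component ID check on raw bytes from a hex editor, without linking libjpeg, it can be sketched as below. This is only an illustration under stated assumptions: `classify_four_components` and the `sketch_cs` enum are hypothetical names, not libjpeg API, and the sketch simply returns "unknown" where the library would go on to look at the Adobe marker.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result codes for this sketch, not libjpeg's J_COLOR_SPACE. */
enum sketch_cs { CS_UNKNOWN, CS_YCCK, CS_CMYK };

/* `sof` points at the FF Cx marker bytes of a frame header.
 * Layout: FF Cx, length (2), precision (1), height (2), width (2),
 * component count (1), then 3 bytes per component: ID, sampling, table. */
enum sketch_cs classify_four_components(const uint8_t *sof, size_t len)
{
    if (len < 10 || sof[0] != 0xFF)
        return CS_UNKNOWN;                /* too short or not a marker */

    unsigned nf = sof[9];                 /* number of components */
    if (nf != 4 || len < 10 + 3 * (size_t)nf)
        return CS_UNKNOWN;

    /* Component IDs sit at the start of each 3-byte component entry. */
    uint8_t cid0 = sof[10], cid1 = sof[13], cid2 = sof[16], cid3 = sof[19];

    if (cid0 == 0x01 && cid1 == 0x02 && cid2 == 0x03 && cid3 == 0x04)
        return CS_YCCK;
    if (cid0 == 0x43 && cid1 == 0x4D && cid2 == 0x59 && cid3 == 0x4B)
        return CS_CMYK;                   /* ASCII 'C', 'M', 'Y', 'K' */

    return CS_UNKNOWN;                    /* the library checks further here */
}
```

For the file above, the four IDs 67, 77, 89, 75 match the second test, so the sketch reports CMYK whether or not the Adobe marker is present.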
Regards
Guido
JPEG developer