Text Corruption in PSD image metadata

When I save a file in Photoshop CC 2017 in JPEG format the text looks like this:

Absolutely fine.

If I now make a small edit to the picture and re-save it in PSD format, the file looks like this:

It does not happen every time, but often enough to be a nuisance. Occasionally there is corruption in other fields, too; the Photographer field often falls victim to the problem. The corruption is not always of the same characters. For example  is pretty common.

Any way to prevent it happening?

What does Photoshop display for the same file if you look at its properties there?

Can we have an example file with the problem?

Can you give detailed steps on how to create a file which has the problem? e.g. What is modified between the initial save and the re-save? Is Photoshop closed between creating the two versions of the file? Does Opus have to be looking at the folder when you save the file for it to happen, or anything like that?

Does it only happen when certain characters are used somewhere in the metadata? The characters look like UTF-8 is being interpreted as the wrong encoding.
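To illustrate what "UTF-8 interpreted as the wrong encoding" looks like, here is a minimal Python sketch. The field value is made up, and the choice of Windows-1252 as the wrong encoding is only an assumption about what a misbehaving reader might use.

```python
# Sketch: how UTF-8 text turns into garbage when read with the wrong encoding.
original = "© 2017 Example Photographer"  # hypothetical field value

utf8_bytes = original.encode("utf-8")   # "©" becomes the two bytes C2 A9
misread = utf8_bytes.decode("cp1252")   # each byte is shown as its own character

print(misread)  # the copyright symbol now has a stray "Â" in front of it
```

If the corruption in the problem files always takes this form, an encoding mix-up somewhere in the chain is the likely cause.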

I've tried a few things but haven't seen the problem here so far:

Naomi Watts - Photoshoot For Grazia Magazine (France) May 2017.zip (12.7 MB)

The attached zip file contains a JPG and a PSD of the same image that exhibit the problem.

The first thing I must stress is that this does not happen every time I use Photoshop to make a PSD of an image but it happens enough to be a nuisance in that it has to be manually corrected.

To create this problem I opened the JPG in Photoshop 2017. I then cropped the image slightly to make sure that Photoshop would have to re-save the PSD from scratch, as it were.

There is one character in common with most of the corruption - but not all - and that is the copyright symbol, which is inserted by the VBA program I use to edit metadata.

I have since taken the original JPG and simply saved it out of Photoshop as a PSD without any edits. The corruption still appears. I then closed Opus and re-saved the JPG again. The corruption is still there.

Next I removed the Copyright info using the Opus Metadata panel. I then re-inserted the copyright symbol using the Alt key and 0169 on the numeric keypad. When the JPG was re-saved (no edits) as a PSD, the corruption was still there.
This makes me think that the corruption is probably not being introduced by my VBA program.

You may recall that I had a problem some time ago with Photoshop "eating" the keywords I entered into files. Jon eventually found that problem, thank heavens. Now that I have started re-keywording the files that were affected by the last bug, this new problem has come to light.

I also believe I have seen this error before in a previous Opus version, but it seemed to go away.

Leo
Could you please hold off any further investigation on this one for a wee while.

I have amended my VBA to use Chr(169) to insert the copyright symbol instead of having it as a literal character in the script.

So far I have not been able to reproduce the corruption. Obviously I need to do many more pictures to confirm this. I will report how I get on in a couple of days.

Well, that theory did not last long, but at least I have found a way to reproduce the corruption almost every time. If I re-save a PSD file with a mask (Alpha channel) the corruption happens all the time.

This image shows the very typical corruption I am getting:

Is this only when your script is involved? If the metadata is displayed incorrectly and a script is editing the metadata, that's a very important detail.

What is the full process to create the problem, starting from scratch (i.e. opening Photoshop and creating a document)?

What does Photoshop itself show for one of the problem files? (Is the issue the file and how it was made, or a difference in how the file is displayed in the two programs?)
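One way to tell a damaged file from a display difference is to look at the raw bytes. The sketch below searches for the byte sequence C3 82 C2 A9, which is what "Â©" looks like when it is itself stored as UTF-8. It assumes the corruption is stored in that form, which has not been confirmed for these files, and `find_double_encoded` is just an illustrative helper name.

```python
# UTF-8 bytes of the two characters "Â©", i.e. a double-encoded copyright symbol.
DOUBLE_ENCODED = b"\xc3\x82\xc2\xa9"

def find_double_encoded(data: bytes, context: int = 16):
    """Return (offset, surrounding bytes) for each double-encoded "©" found."""
    hits = []
    start = data.find(DOUBLE_ENCODED)
    while start != -1:
        lo = max(0, start - context)
        hi = start + len(DOUBLE_ENCODED) + context
        hits.append((start, data[lo:hi]))
        start = data.find(DOUBLE_ENCODED, start + 1)
    return hits

# Made-up XMP fragment standing in for the contents of a real file.
sample = b"<dc:rights>\xc3\x82\xc2\xa9 2017 Example</dc:rights>"
for offset, snippet in find_double_encoded(sample):
    print(offset, snippet)
```

To check a real file, read it with `open(path, "rb").read()` and pass the result in. A hit inside a metadata block would suggest the extra character was written into the file, rather than added by whichever program is displaying it.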

This is the workflow:

Open a JPG image with metadata in Photoshop. Mask the image for later use in a composite. Save the masked image as Photoshop PSD file.

When there is corruption you can see it in Opus, Explorer and Photoshop.

However, and this is the problem, producing the corruption at will is impossible. This afternoon, for instance, I produced at least six corrupted images in a session adding metadata to images.

This evening I took a test image, added metadata in Explorer, applied the mask and saved it out via Photoshop. The metadata showed no corruption. I then deleted all the metadata and created new metadata using the Opus Metadata Panel. I saved out a masked PSD in Photoshop and viewed the resultant files in Opus, Explorer and Photoshop. No corruption.

I repeated the last action, but this time using my script. Again, no corruption.

It seems that from time to time extraneous characters are introduced by something into the metadata streams.

I have been able to create the corruption by using the Opus Metadata Panel and NOT my VB script.

Finally, to try to show that the problem lies somewhere in Photoshop's process of converting to PSD files, I ran my script on a folder containing 500 JPG images. Not one showed any metadata corruption.

I think my next step should be to continue testing, but this time use images I have taken with my own DSLR and try to see if I can introduce the corruption into those files. My gut feeling is that I will not be able to do so. We shall see.

So Opus doesn't need to be involved at all for it to happen?

I have been doing a lot more experimenting, and I am no longer sure we are chasing the right problem here.

I decided to look at random images - taken with my own camera, downloaded from the web, FTP'ed etc. - in ExifTools.

In every file I tried, ExifTools showed UTF-8 characters in various metadata fields. Now, this may be perfectly normal, for all I know, but these characters are there - sometimes in their tens.

For each file I looked at in ExifTools, I also looked at the metadata in Opus. There was no corruption shown. But what was interesting were the characters that showed up in the usual field that can have corruption, i.e. the Copyright field. They nearly always seemed to be the same characters and caused no problems.

I then set ExifTools on the original image I sent with my first post, where corruption does show. The characters ExifTools showed up seemed to be different from the rest. I have added a file to show this.

At no point in this work was Photoshop involved.

Naomi Watts - Corruption Exif Tools.zip (12.7 MB)

This doesn't seem to have anything to do with Opus.

This seems to have much to do with Opus. I am entering the metadata with Opus and using Opus to read the metadata.
In fact, it is one of the main reasons I use Opus. It happens to be darned good at entering metadata programmatically. Using Opus, I am entering metadata at least three times faster than I ever have before, and when you caption and keyword lots of images that is quite a saving in time. It seems a shame that this excellence is let down somewhat.

Some research has revealed how Photoshop allegedly treats the copyright symbol:

Default Photoshop - encoding of copyright symbol in metadata
Photoshop encodes the copyright symbol ("©") in all metadata (Exif, IPTC IIM, and IPTC XMP) as the byte sequence C2h A9h. [In Windows Code Page 1252, the usual "extended ASCII" encoding used inside Windows, that would be interpreted as "Â©".]

Here's the story.

IPTC XMP metadata

IPTC XMP metadata is encoded in UTF-8 encoding. In UTF-8, the character "©" is not encoded as the single byte A9h (as it is in Windows Code Page 1252). Rather, it is encoded as the two-byte sequence C2h A9h. [Only ASCII characters get single-byte representations in UTF-8.]

Thus, the encoding used by Photoshop for "©" in IPTC XMP metadata (C2h A9h) is appropriate.

Any XMP interpreting program should render this on screen as "©".
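The byte values quoted above are easy to confirm. This short Python check is only an illustration and is not taken from Photoshop or Opus; it prints the encoded form of "©" in both encodings, plus an ASCII letter for comparison.

```python
symbol = "©"

cp1252_hex = symbol.encode("cp1252").hex()  # single byte in Windows-1252
utf8_hex = symbol.encode("utf-8").hex()     # two bytes in UTF-8
ascii_hex = "A".encode("utf-8").hex()       # ASCII stays one byte in UTF-8

print(cp1252_hex)  # a9
print(utf8_hex)    # c2a9
print(ascii_hex)   # 41

# Reading the two UTF-8 bytes back as UTF-8 gives the symbol again.
round_trip = bytes.fromhex(utf8_hex).decode("utf-8")
print(round_trip == symbol)  # True
```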

IPTC IIM metadata

IPTC IIM metadata ("legacy" IPTC metadata) can use several encodings. The encoding used should be indicated by a data item, CodedCharacterSet.

IPTC IIM metadata generated by Photoshop indicates the encoding as UTF-8. Thus, the encoding used by Photoshop for "©" in IPTC metadata (C2h A9h) is appropriate.

Fully-observant IPTC IIM metadata interpreting programs should render this on screen as "©".
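As a rough sketch of what "observing" CodedCharacterSet means for a reader: the value that marks UTF-8 is the escape sequence ESC % G. The function below is hypothetical and not how any particular program works. In particular, falling back to Windows-1252 when the marker is absent is an assumption, since real readers differ on that point.

```python
from typing import Optional

UTF8_MARKER = b"\x1b%G"  # ESC % G: CodedCharacterSet value that selects UTF-8

def decode_iim_text(raw: bytes, coded_character_set: Optional[bytes]) -> str:
    """Decode an IIM text field, honouring CodedCharacterSet if it says UTF-8."""
    if coded_character_set == UTF8_MARKER:
        return raw.decode("utf-8")
    # Assumption: treat everything else as Windows-1252 (readers vary here).
    return raw.decode("cp1252", errors="replace")

field = b"\xc2\xa9 2017"  # made-up copyright field as Photoshop would write it

print(decode_iim_text(field, UTF8_MARKER))  # marker honoured
print(decode_iim_text(field, None))         # marker missing or ignored
```

The second call shows the same stray "Â" seen in the problem files, which is what a reader would produce if it skipped the CodedCharacterSet check.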

For some reason which only clever programmers like you guys can work out, Opus appears to be "misinterpreting" the symbols and displaying the C2h character as well, when parsing the XMP.

But you also see the same thing when entering the metadata using Photoshop and using Photoshop to read the metadata, correct?

Multiple posts above talk about processes which only involve Photoshop or other tools, and where Photoshop and Explorer both show the same results as Opus. That implies either a problem with the file format itself, or with part of the process (or one of the tools/scripts involved), or the same bug in all three programs.

As this thread goes on, the steps to reproduce the problem seem to keep changing, as do the programs/scripts/tools involved in the steps.

It's very difficult to know what we are actually talking about here.