For a while, I thought I was losing it.
Anyone who sells photos online, via POD or direct, knows the importance of keywords – they’re the bedrock of SEO. And if you use a POD or two in addition to your own site, you don’t want to be hammering in keywords more than once, so you embed them in your JPGs as IPTC data, and the PODs parse them out. Well, the good ones do.
Recently I’ve been going back through all my photos and improving the keywording, which in many cases was bad. For one thing, somewhere along the line I’d used a keywording tool that helpfully alphabetized them, which is a disaster because some sites treat the first keywords as the important ones for searches. Others were stuffed with keywords from back when I did microstock; for PODs you just want keywords that describe the image, not possible uses for it.
Editing all my keywords, on hundreds of photos, seemed like a good COVID-19 lockdown project. I mostly use an old program called PixiShot, although you can also do it right in Windows File Explorer via the Details pane, or via various other applications (most of which are poor tools for the job). I’d do a bunch, then take a break and come back later – and after a while, realize that my earlier changes were gone, and the keywords were back to what they were originally. This is the “thought I was losing it” part.
A lot of confusion, aggravation, and thrashing around in image editing applications ensued, as I tried to figure out what was happening. I’ll skip that part.
Eventually I caught the gremlin in the act. I’d just added keywords to a photo and moved on to another when, out of the corner of my eye, I saw the thumbnail of the first photo ‘blink’. I realized some other process had just modified that file, and PixiShot had detected the change and updated its thumbnail. But what was that process? The answer turned out to be OneDrive, Microsoft’s cloud storage, which I use as backup for my photos.
OneDrive monitors the folders you select and automatically syncs with the MS cloud. When you change a file it’s quickly detected and uploaded. What was happening was that OneDrive uploaded my changed file, and then, a minute or so later, downloaded a copy of that file in which IPTC changes had been reverted to whatever was previously on OneDrive. Once I realized this was happening, I could see the action in real time, in OneDrive’s activity display. Changed file detected…. uploaded…. wait a bit…. same file downloaded with IPTC as it used to be.
Why would this happen? I suspect a combination of two things. OneDrive is using the files’ metadata for its own purposes, maybe something to do with its “auto tagging” of photos. And in doing so, it’s not properly preserving the rest of the metadata. Unwanted feature, meet bug.
Of course, I Googled this extensively, hoping to find an explanation, a workaround, or at least others in the same boat. I didn’t come up with much, but found enough dark stories to confirm that something’s going on. One guy uploaded thousands of old family photos and found out OneDrive had changed all the “Date Taken” fields, destroying the chronology, and he was not happy. Here’s the link:
At that point I gave up on OneDrive as my photo backup. I’ve verified several times that this is happening, using just File Explorer to modify the keywords, then watching OneDrive revert them a minute later. Whatever is going on, it might be a bug, it might be a feature, or it might be some toxic amalgam of the two. It might involve the way applications lock files in use. It might relate to something I’m doing with IPTC, but I have no clue what that might be. I’ve used several well-known metadata editors to look at my IPTC, and there’s nothing wrong with it; PODs and other sites parse it fine. And yes, I’ve used Phil Harvey’s legendary ExifTool, and it says I’m fine.
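For anyone who would rather not rely on spotting a blinking thumbnail, one rough way to catch this kind of silent reversion is to fingerprint every file right after an editing session and check the fingerprints again later. The sketch below is a minimal Python example, not a tested tool. It assumes Python 3.9 or later, it assumes the photos are `.jpg` files under one folder, and the function names (`fingerprint`, `snapshot`, `changed_since`) are made up for illustration. It hashes the whole file, so it flags any byte-level change, not just IPTC changes, and it can't tell you which field was altered.

```python
import hashlib
from pathlib import Path


def fingerprint(path: Path) -> str:
    """Return the SHA-256 hex digest of the file's full contents."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        # Read in 1 MiB chunks so large JPGs don't need to fit in memory.
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def snapshot(folder: Path, pattern: str = "*.jpg") -> dict[str, str]:
    """Map each matching file (path relative to folder) to its fingerprint."""
    return {
        str(p.relative_to(folder)): fingerprint(p)
        for p in sorted(folder.rglob(pattern))
        if p.is_file()
    }


def changed_since(old: dict[str, str], new: dict[str, str]) -> list[str]:
    """List files from the old snapshot that now differ or have gone missing."""
    return sorted(name for name, digest in old.items() if new.get(name) != digest)
```

The idea would be to call `snapshot()` on the photo folder as soon as the keyword edits are saved, keep the result (for example, dumped to a JSON file), then take a second snapshot after the sync client has had time to do its thing and pass both to `changed_since()`. Any file it lists was modified by something other than you in the meantime, which is the cue to open it in a metadata editor and see what happened to the keywords.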
I’ve spent way too much time on this, and I don’t want to live in fear of finding hours of work undone. I can live without OneDrive.
I’d really like to hear from anyone who’s encountered this, or heard about it; or maybe not encountered it, despite using OneDrive for photos with IPTC data.
4 Replies to “OneDrive photo backup: keywords vanish in the cloud…”
Thank you for reporting this. I am embarking on a project to tag all my old photos and I don’t think I could bear it if I lost my hard work. I don’t have the attention span to redo it if they got corrupted. I barely have the attention span to do it in the first place. I am terrified of this happening. I wish someone smarter than me would develop a test suite to analyze whether a platform is “metadata safe.” I would pay decent money for such a tool.
The thing is, even if a cloud storage system tests ‘safe’ today, it could change the next week.
Storage services are competing on “features”, like organizing your photos and making them searchable. So they’re adding the ability to tag photos, and storing that data in the file itself as metadata. If they make any mistakes in preserving existing metadata, we won’t even know until it’s too late.
Jim, thank you for posting about this issue. I work freelance as an image cataloger and also thought I was losing my mind watching all my metadata edits revert instead of update. I use a metadata palette and Adobe Bridge and could literally see the keywords disappearing after I had saved the data. After confirming the issue had nothing to do with the palette, or Bridge (though Bridge is certainly not foolproof), I tried pausing the OneDrive sync while working on cataloging. Perhaps that near-simultaneous sync was the issue. Nope. As soon as the sync was restarted, *poof!* files downloading again and metadata disappearing, as you’ve observed. I tried toggling off OneDrive’s tagging “feature” (so-called). No difference. I tried opening the file history and “restoring” a previous version and watched OneDrive restore (supposedly), upload, and then seconds later download the version without data changes again. It’s utterly insane. Like you found, the only current option, if you don’t want to see all your metadata work undone, is to shut off all or part of OneDrive. OneDrive limits your choice of what files to back up to three big “buckets”: desktop, documents, and pictures; it unfortunately won’t let you get more granular than that. So I’m now trying a work-around by shutting off the pictures bucket, as this seems to be the part of OneDrive which is most borked. Keep us posted if you happen to hear about any further solutions to this glitch / bug.
Heidi, I really appreciate the sanity check – and seeing that nothing has changed since I first hit this issue. I’ve given up on OneDrive – even if I found a workaround, I’d be afraid that they’d do some other crazy stuff in the future.
I don’t even use cloud storage right now. All my saleable photos are on a POD and in a SmugMug gallery, so that’s adequate backup of my final JPGs, and I’ve quit worrying about the original raw files. I do my own backups to a removable flash drive, and I sleep soundly.