Eternity, in Google Limbo

Definition of in limbo
1: in a forgotten or ignored place, state or situation
2: in an uncertain or undecided state or condition

In a series of posts (starting way back here) I’ve related how I’ve tried to get Google to index my photography site, including this blog and an image gallery on SmugMug. It’s been a long slog, with some successes and some reversals. But hope of a total victory has ebbed.

In a previous post I detailed how I finally got Google to read and parse my sitemaps, and hoped those hundreds of URLs for photo pages would eventually get crawled and indexed. But for the most part, this hasn’t happened; those URLs are forever trapped in some sort of Google limbo, neither blessed nor condemned – just ignored. For 6 months.

At this time, I have 258 pages indexed according to Google Search Console, which is an all-time high; but the great majority (227) of those are blog posts and related URLs. The Sitemap section of GSC shows the “gallery” map successfully parsed, and 645 URLs found; but only 21 – seemingly chosen at random – have been indexed. The remaining 624 are “Excluded”, and this is where it gets interesting.

Of those 624, 3 are “Crawled – currently not indexed”. That’s Google saying “no thanks”: they’ve been looked at and found not to contain enough of what Google sees as value.

Another 19 are “Duplicate, submitted URL not selected as canonical”. These pages are just like any others in the gallery; there’s nothing unusual about them. So I’d say this categorization is just an error on Google’s part, and there’s no obvious way to find out why it happens.

The other 602 are still listed as “Discovered – currently not indexed”. That means they’ve not been crawled by Google – and it’s looking like they never will be, unless I do something. But what?

I posted this question on Google’s “Search Console Community” and was told that “…Google wanted to crawl the URL but this was expected to overload the site; therefore Google rescheduled the crawl.” That’s obviously not the case here because SmugMug has plenty of server capacity. I think what’s really going on is that Google just doesn’t like “photo gallery” sites, sees them as not worthy of indexing, and when it detects one it just crawls a few randomly selected pages and leaves the rest in… perpetual limbo.

There may not be a way to win at this game. A critical weakness of Google is that its idea of a site’s “value” is based only on text, not images; it does me no good to have the best photos of Minneapolis on the web, if Google sees nothing but a bit of alt text with the word “Minneapolis” in it. And I can’t pad the description of every photograph out to the length of a Wikipedia entry.

Obviously sitemaps haven’t been the charm I was hoping for. Next, I’m going to try submitting some photo and gallery pages manually and see if that moves the needle.

