I’ve related (starting here) my long struggle to get my photo work indexed by Google – an uphill battle. But recently, I tangled with sitemaps and eventually made a breakthrough.
I lost a big chunk of indexing in April, when Google switched to “mobile first” page evaluation. Eventually the bleeding stopped; some of my blog was indexed, but hardly anything in my gallery subdomain. A few more pages showed up in the following weeks, then progress stalled.
I’d submitted sitemaps for the blog site and the gallery subdomain, but they apparently never worked. Google Search Console showed weird and contradictory results: “success” in reading the sitemap index file, but “unable to fetch” for one of the 4 referenced maps, nothing for the other 3, and a big fat 0 for discovered links. The “last read” date was stuck 6 months in the past. In reality, the sitemap files were accessible and correct, but the only coverage reported was under “indexed, not submitted in sitemap”. It didn’t make any sense.
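One way to confirm "accessible and correct" independently of Search Console is a short script. The following is a minimal sketch, not the method used above: it assumes a standard sitemaps.org index file with uncompressed XML child maps, and the function names, the User-Agent string and the example URL are all made up for illustration.

```python
import urllib.request
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def parse_sitemap(xml_data):
    """Return (kind, locs) for a sitemap document.

    kind is "index" for a <sitemapindex> or "urlset" for a <urlset>;
    locs is the list of <loc> values found in it.
    """
    root = ET.fromstring(xml_data)
    tag = root.tag.rsplit("}", 1)[-1]  # drop the "{namespace}" prefix
    if tag == "sitemapindex":
        kind, path = "index", "sm:sitemap/sm:loc"
    elif tag == "urlset":
        kind, path = "urlset", "sm:url/sm:loc"
    else:
        raise ValueError(f"not a sitemap document: <{tag}>")
    locs = [el.text.strip() for el in root.findall(path, NS) if el.text]
    return kind, locs


def fetch(url, timeout=20):
    """Download a URL and return the raw bytes (gzipped maps are not handled)."""
    # Hypothetical User-Agent; some hosts reject requests that send none.
    req = urllib.request.Request(url, headers={"User-Agent": "sitemap-check/0.1"})
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read()


def check(index_url):
    """Fetch a sitemap index and report on every child map it references."""
    kind, locs = parse_sitemap(fetch(index_url))
    if kind == "urlset":
        print(f"{index_url}: plain sitemap, {len(locs)} URLs")
        return
    for child in locs:
        try:
            _, urls = parse_sitemap(fetch(child))
            print(f"{child}: ok, {len(urls)} entries")
        except Exception as exc:  # report the failure and keep going
            print(f"{child}: FAILED ({exc})")


# Example call (placeholder URL, replace with a real index file):
# check("https://example.com/sitemap_index.xml")
```

If every child map fetches and parses with a sensible entry count, the files themselves are probably fine, and whatever Search Console is complaining about lies elsewhere.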
I tried re-submitting the sitemaps – that bumped up the “submitted” date, but “last read” never changed. Nothing more got indexed.
I posted all this in Google’s “Search Console Community” forum and got some replies, basically offering these answers:
- Don’t worry about what Search Console says; things are probably fine. Just keep waiting.
- You don’t need sitemaps anyway, Google will find your pages without them.
- Google never indexes an entire site no matter what. Quit complaining.
It was actually sort of weird: Gold and Platinum-level “Product Experts” wouldn’t even acknowledge the obvious errors and contradictions in what Search Console was reporting. After a few days of being blown off by these Google toadies, I decided to figure it out myself.
I tried completely removing my sitemaps from GSC and resubmitting them. Immediately, all the bad data reappeared, including the “last read” date 6 months in the past. So obviously, these results were cached. I then cleared the map entries again and waited several days before resubmitting. No luck: the same old junk was restored. Whatever had gone wrong the first time Google tried to read those maps, Google was never going to forget it.
My blog’s sitemap was automatically created by an SEO plugin; I couldn’t affect its content or change its file name. So I installed a different SEO plugin, and submitted the new and different sitemaps. Bingo.
The bad results were gone; within a couple of days, GSC reported success in reading the new map, and showed hundreds of “Discovered” URLs from the sitemaps. “Discovered, not currently indexed” is good: Google knows about those URLs but hasn’t crawled them yet, so they still have a chance. “Crawled, not currently indexed” is the discard pile; Google looked at those pages and didn’t like them.
And even though I’d only replaced the sitemap for the blog site, things improved for the SmugMug gallery subdomain too, with 682 new URLs now in the queue for possible indexing, and a handful already indexed. Apparently Google was now looking at the site fresh, using the new sitemaps.
My indexed count initially dropped again, by a few pages, but soon started moving up – see that bump near the end of the Coverage graph, at the start of this post? I am now – as they say – cautiously optimistic.
UPDATE: Google now seems a bit confused.