Ugh. This is embarrassing.
If you’re reading this, chances are you know about Wix’s SEO Hero contest. If not, you can read up on it here.
I feel… pathetic.
I’ve never claimed to be an SEO expert but I’ve been doing this thing for the better part of a decade. Surely I’ve learned something along the way, right?
I made a giant mistake.
Maybe it’s because I was sprinting to get my SEO Hero entry site up, making it more likely I’d overlook something obvious. Or maybe it’s because I’ve been coddled by developers my whole career, allowing me to remain a noob when it comes to implementation. Heck, maybe it’s the Illuminati. There’s no way of really knowing.
Here’s the lowdown on my screw-up
After quickly doing some research to find the best possible WordPress theme for this project 1, I got hosting set up and installed it. I wasn’t ready for content yet, but I wanted to get the site indexed ASAP with a teaser page just to be like, “Hey Google, what’s up? Maybe later we can get to know each other and see where things go from there.”
Then I got a *great* idea that was sure to save me some time.
Rather than build the site from scratch 2, I decided to import the demo. From there I could make edits to something that already looked like what I was envisioning, saving me time in the long haul (or so I thought).
I chose the page I wanted to use from the demo and threw a noindex metatag on it. Then I went through the rest of the Pages and Posts and deleted them.
I launched the site on December 23rd with three pages, one of which was a Thank You page so it contained a noindex metatag. On the 26th, I did a site search in Google to see what had been indexed.
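For anyone who hasn't seen one, a noindex metatag is a single `<meta name="robots" content="noindex">` line in a page's `<head>`, and it only protects the pages that actually carry it. Below is a minimal sketch, using only Python's standard library, of how you could check whether a page's HTML contains one. The class and function names are made up for illustration, not something from my actual setup.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives from every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # attrs is a list of (name, value) pairs; value can be None.
        attr_map = {name.lower(): (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() == "robots":
            # content looks like "noindex, follow"; split it into directives.
            for directive in attr_map.get("content", "").split(","):
                self.directives.add(directive.strip().lower())


def has_noindex(html):
    """Return True if the HTML carries a robots meta tag with noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives
```

Running something like this over every URL a theme exposes, not just the ones listed under Pages and Posts, is the kind of check that would flag pages with no noindex tag at all.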
SIXTY-ONE MOTHER LOVING PAGES!
My heart sank. It felt like I just swallowed some of Donald Trump’s hair. How in the BLANK could this be?
It turns out “Pages” and “Posts” only contained some of the other URLs in the demo. Guess what other sections included them?
How I’ve tried to fix it
After throwing up in my mouth, I got to work. Here’s a timeline of events to the best of my memory.
- Deleted all of the other pages, one by one.
- Submitted each deleted page in Google Search Console to be Fetched by Desktop Googlebot.
- Submitted URLs for temporary removal in Google Search Console
- Pray. Pray and drink.
- Pages are de-indexed. YAY!
- Pages are back. EFF! 3
- Updated my .htaccess file to 410 all of the deleted pages
- Found more indexed pages in Google that the theme had auto-generated and that weren’t covered by the .htaccess rules 🙁
- Updated .htaccess file to add more 410 rules
- Put all known indexed pages through Fetch AND Render for Desktop AND Mobile Googlebot
- Added all indexed pages to my XML sitemap
- Confirmed that Google Search Console showed certain pages returning a 410 error, even though most were still indexed in Google!
- Realized that somewhere along the way my Yoast SEO plugin overwrote my .htaccess file, making some of my indexed pages 404s again… and yes, Google Search Console showed some were crawled during that time.
- Updated my .htaccess… again.
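A rough sketch of how 410 rules like the ones in this timeline could be generated, rather than the exact rules from this site. It assumes Apache with mod_alias, whose `Redirect gone /path` directive answers with a 410. The function name and the example.com URLs are placeholders.

```python
from urllib.parse import urlparse


def gone_rules(urls):
    """Return one mod_alias 'Redirect gone' line per unique URL path.

    Caveat: Redirect matches by path prefix, so a rule for /demo/
    also returns 410 for everything underneath /demo/.
    """
    paths = set()
    for url in urls:
        path = urlparse(url.strip()).path
        if path and path != "/":  # never 410 the homepage
            paths.add(path)
    return [f"Redirect gone {path}" for path in sorted(paths)]


# Hypothetical usage: paste the printed lines into .htaccess.
# for line in gone_rules(["https://example.com/demo-team/"]):
#     print(line)
```

One thing worth knowing: WordPress regenerates whatever sits between the `# BEGIN WordPress` and `# END WordPress` markers in .htaccess, so hand-written rules are safer outside that block. Whether that explains the overwrite described above is only a guess.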
1/26 – That is today. Well, I don’t know what day you’ll read this, but it’s the 26th that I’m writing this.
Here’s my plan:
- Remove all URLs I want de-indexed from my XML sitemap.
- Resubmit all known indexed URLs through the URL removal tool again. Removed URLs are supposed to stay out for 90 days. It didn’t work last time, but maybe it will this time.
If that doesn’t work:
- Add those URLs back to my XML sitemap along with a last modified component (that’s not currently there). John Mueller recommended doing so previously, so I’ll give it a shot.
- Link to all URLs I want to get de-indexed in a Google+ post. I hate spamming people, but if it gets to this point, I’ll need another way to send Google the message.
2/7/2017 update: The temporary removal lasted a whopping 10 days, despite Google claiming it should last for 90. I’ve updated my sitemap and linked to all of the pages in a Google+ post. We’ll see what happens…
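For the curious, here is a minimal sketch of a sitemap that includes a last modified component, built with Python's standard library. The idea, presumably, is that a fresh `<lastmod>` date nudges Google to recrawl the URL and see the 410. The function name, URL and date below are placeholders, not my real sitemap.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Build sitemap XML from (url, last_modified) pairs.

    last_modified is a W3C date string such as "2017-02-07".
    The XML declaration is left out; add it when writing to a file.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url, last_modified in entries:
        node = ET.SubElement(urlset, "url")
        ET.SubElement(node, "loc").text = url
        ET.SubElement(node, "lastmod").text = last_modified
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical usage with a made-up deleted demo page:
# print(build_sitemap([("https://example.com/demo-team/", "2017-02-07")]))
```

In practice an SEO plugin usually generates the sitemap, so a script like this is mostly useful for a one-off file listing only the URLs you want recrawled.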
Why I think nothing has worked
It’s probably karma. If not, my guess is I have too low of a crawl budget for Google to care enough to change anything. Recently, Gary Illyes from Google shed some more light on what crawl budget means. To paraphrase, it’s a combination of crawl rate limit, which gets lowered if you have a slow site and frequent server errors, and crawl demand, which is based on popularity and content staleness.
My site is slow. My popularity is lacking. And with such a new site, my deleted content probably isn’t considered stale. I have plans to rectify all of these things, but this is the best explanation I have right now (besides karma).
One last thought
By no means am I saying I’d be winning if I hadn’t made this mistake. There are tons of great competitors out there and I’d be a fool to claim I was better than the rest. Even without this mistake, I was late in launching the site, my page load time is dismal and my link game has a lot of catching up to do.
However, in contests like these, the margin for error is extremely slim. I will finish out the contest doing everything I can (within the rules) to win. But if and when I don’t, I’ll be looking back at this mistake and kicking myself.
I should eventually write a blog post about how not to do this kind of research. I chose terribly, but that’s beside the point.
Probably poor word choice here because nothing is “from scratch” with WordPress.