Search engine optimization is an amazing field. I'm grateful that working in "SEO" has given me the opportunity to serve as a leader, a coach, a teacher, a strategist, an economist and an investigator. After nearly 15 years working in the search marketing industry, I still learn something new each day.
Now, for my guilty admission: I've never published an article on SEO. Not a single one. In the spirit of giving back to a community that freely shares so much wisdom and knowledge, here are 15 mistakes, lessons and observations that I've experienced over 15 years of search marketing:
1. Relevancy Can Hinge on a Single Word
A site that I managed ranked in the first position for the search "best phone camera" in the summer of 2013. That changed when Nokia announced the "Lumia 1020," a 41 megapixel smartphone. The internet swooned and described the phone in superlative terms within minutes of the announcement. All of the news articles (and user searches) that mentioned Lumia 1020 in the same breath as "best phone camera" trained Google into thinking that the Lumia 1020 was the best phone camera. My #1 ranking turned into a page four result.
When we updated our page to acknowledge the 1020's existence, we rocketed back to the first position. This was no fluke. I've seen cases where the absence (or existence) of a single word torched a page's rankings or made its fortunes.
2. The 304 (Not Modified) Status Is Useful on Large Sites
304 is one of the most obscure HTTP status codes. It's a way to tell a crawler that nothing has changed since it last visited a webpage. Small sites needn't bother with this status code, but sites with millions of pages can benefit from using it. In short, it's a tool for managing crawl budget. The trick is determining what counts as a modification to the page. For example, if the links in the right rail change daily, is the page modified? Using the 304 status code effectively is both art and science.
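To make the mechanics concrete, here is a minimal sketch in Python (using Flask, with a made-up lookup table and placeholder URLs, not any particular site's implementation) of how a server might answer a crawler's conditional request with a 304 once it has decided what counts as a meaningful change:

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from flask import Flask, request, make_response

app = Flask(__name__)

# Hypothetical lookup table: when the *meaningful* content of each page last
# changed. Deciding what counts as meaningful (e.g. ignoring a right rail that
# rotates daily) is the art-and-science judgment call described above.
LAST_MEANINGFUL_CHANGE = {
    "widgets/blue-widget": datetime(2018, 6, 1, tzinfo=timezone.utc),
}

@app.route("/<path:slug>")
def page(slug):
    last_changed = LAST_MEANINGFUL_CHANGE.get(slug)
    since_header = request.headers.get("If-Modified-Since")
    if last_changed and since_header:
        since = parsedate_to_datetime(since_header)  # HTTP dates carry a GMT offset
        if last_changed <= since:
            return "", 304  # tell the crawler nothing meaningful has changed

    resp = make_response(f"<html><body>Page for {slug}</body></html>")
    if last_changed:
        resp.last_modified = last_changed  # emit Last-Modified for the next visit
    return resp
```

The HTTP part is the easy half; populating that lookup table sensibly is where the real work lives.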
3. Invisible Characters Can Wreak Havoc
I once made an innocuous change to a robots.txt file that prevented an entire domain from being crawled. This isn't a horror story, thankfully. A member of my team discovered the trouble within minutes (always use the robots.txt Tester in Search Console!). Even so, it took us an hour to discover the root of the problem: There was a phantom character in the robots.txt file.
The character was invisible in a web browser and invisible in most text editors. We finally discovered the ghostly character in a true plain-text editor. Beware: An invisible character in a redirect map or Apache configuration file can be just as deadly.
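A quick way to hunt for these ghosts is to inspect the raw bytes rather than trusting an editor. Here is a small Python sketch (the filename is just illustrative) that flags anything in a robots.txt file that isn't printable ASCII:

```python
import unicodedata

# Sketch: report anything in robots.txt that isn't printable ASCII, so BOMs,
# zero-width spaces and other invisible characters have nowhere to hide.
with open("robots.txt", "rb") as f:  # read raw bytes, not "text"
    text = f.read().decode("utf-8", errors="replace")

for line_no, line in enumerate(text.splitlines(), start=1):
    for col, ch in enumerate(line, start=1):
        if not (ch.isascii() and ch.isprintable()):
            name = unicodedata.name(ch, "UNNAMED")
            print(f"line {line_no}, col {col}: U+{ord(ch):04X} ({name})")
```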
4. Even-Numbered 50X Errors Are Insidious
Nothing puts a chill in my bones more than server errors, specifically, 500, 502 and 504 response codes. (A 503 Service Unavailable response is a legit way to handle a page or a site that is temporarily down.)
I've learned the hard way that for every real person or search engine crawler that encounters a 50X error on a site, 5 to 10 visits are lost over time. That's because search engines quickly deindex pages with server issues and users may become jaded by the error screen. (I bet the 50X error screens on most sites aren't as pretty or as useful as the 404 page.)
I once worked for a financial media company that had the misfortune to botch a content delivery network migration the evening that Steve Jobs died. A flood of users hit a plain white screen that said nothing except "502." Organic search traffic fell by 40% in the following weeks and it took months to recover to its prior trend. I've seen similar situations (and aftermaths) play out enough times to keep me up at night.
5. Google Can Make Big Mistakes
Remember Google's authorship program (that displayed author photos in the search result pages)? I sure do. I had encouraged the staff writers of a small business website to create Google Plus accounts so they could take advantage of the program.
In December 2013, Google announced that they would show fewer author photos. Shortly thereafter, the small business website's traffic and rankings went haywire. I spent hours of panicked investigation trying to discover why some of the site's articles were completely dropped from Google's results. Then, I discovered the common thread: authorship.
Google may have intended to stop displaying an author's photo, but a bug caused the entire article to disappear from Google's results. Google fixed the bug within a few days and the site's traffic and rankings returned to normal. <wipesbrow>Whew!</wipesbrow>
6. Beware Having Multiple Timestamps on the Same Page
I once saw traffic to an evergreen article drop by 96% even though search demand was steady and the article was recently updated. When I checked the search results page, Google was showing a timestamp from three years earlier. Where did the old date come from? The first comment on the page.
All of the dates on the article were wrapped in the correct schema tags (including the timestamps on comments) and Google shouldn't have been confused, but it was. One of our engineers solved the problem by changing the timestamp format on comments to appear in minutes/hours/days/weeks/years ago. It's OK to have more than one date on a page, but to play it safe, make sure that the timestamp of an article is the only date that appears in an International Organization for Standardization (ISO) format.
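For illustration, here is a Python sketch of the kind of conversion our engineer made: rendering comment timestamps as relative labels so only the article's own date remains in ISO format. The function name and cutoffs are mine, not the original implementation:

```python
from datetime import datetime, timezone

def relative_label(iso_timestamp):
    """Render a comment's ISO 8601 timestamp (with a UTC offset) as a relative
    label such as '3 hours ago', so the article's own timestamp is the only
    ISO-formatted date left on the page."""
    then = datetime.fromisoformat(iso_timestamp)
    seconds = int((datetime.now(timezone.utc) - then).total_seconds())
    for unit, size in (("year", 31536000), ("week", 604800), ("day", 86400),
                       ("hour", 3600), ("minute", 60)):
        if seconds >= size:
            count = seconds // size
            return f"{count} {unit}{'s' if count != 1 else ''} ago"
    return "just now"

print(relative_label("2018-06-01T09:30:00+00:00"))  # e.g. "12 weeks ago"
```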
7. WebSub Is Remarkably Effective
WebSub (formerly called PubSubHubbub) is a real-time way to let subscribers know that a feed has been updated. It's an old technology that never really caught on, but Google still supports it in 2018. WordPress supports WebSub natively, and that's how I learned how effective it can be.
I've seen Google index a news article within seconds of it being published, even though the site that published it isn't known for breaking news. The URL that Google indexed had an RSS feed tracking parameter on it (Google eventually scrubbed the URL down to its canonical root).
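For the curious, a publisher-side WebSub ping is tiny. The sketch below (Python, with a placeholder feed URL) notifies Google's public hub that a feed has new content; the exact response code may vary by hub:

```python
import requests

# Sketch of a publisher-side WebSub ping: after publishing, tell the hub the
# feed changed so subscribers can fetch it right away instead of polling.
HUB = "https://pubsubhubbub.appspot.com/"   # Google's public WebSub hub
FEED = "https://www.example.com/feed/"      # placeholder feed URL

response = requests.post(HUB, data={"hub.mode": "publish", "hub.url": FEED})
print(response.status_code)  # a 204 typically means the ping was accepted
```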
8. Obscure Meta Tags Can Be Powerful
When Google overhauled Google News in May 2018, the "standout" tag didn't come along for the ride. Too bad. The standout tag was a great way for a lesser-known site to force its way into a popular cluster of headlines. In November 2012, I used the standout tag on a story that predicted President Barack Obama would win the presidential election. The resulting placement in Google News drove 1.2 million visits over 24 hours.
9. Many SEO Browser Plugins Have a Major Flaw
Imagine that a webpage located at https://www.example.com has an empty canonical value of <link rel="canonical" href="" />. Every browser-based SEO plugin that I've tested says that the canonical value of that imaginary page is: https://www.example.com.
Plugins are great, but there is no substitute for checking code.
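One way to check: fetch the raw HTML and report the canonical href exactly as it appears in the source, empty string and all, instead of the resolved value a plugin might show. A rough Python sketch (regex-based, with a placeholder URL):

```python
import re
import requests

url = "https://www.example.com/"  # placeholder page to audit
html = requests.get(url).text

# Find the canonical <link> tag in the raw source, then pull out its href.
tag = re.search(r'<link[^>]*rel=["\']canonical["\'][^>]*>', html, re.IGNORECASE)
if not tag:
    print("No canonical tag found in the source")
else:
    href = re.search(r'href=["\']([^"\']*)["\']', tag.group(0), re.IGNORECASE)
    print("Raw canonical href:", repr(href.group(1)) if href else "no href attribute")
```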
10. Those 'Broken' URLs May Not Exist
Google is good at executing JavaScript, and that includes following relative paths that appear anywhere in the source code. I discovered that Googlebot was treating random directories within ad tags as "links" and reporting those paths as broken pages in Search Console. To keep the error report clean, Googlebot can be blocked from directories like these via the robots.txt file.
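Before relying on robots.txt for this, it's worth confirming the rules behave as expected. A quick Python check using the standard library's robots.txt parser, with hypothetical ad-tag directories:

```python
from urllib.robotparser import RobotFileParser

# Sketch: confirm that the (hypothetical) ad-tag directories are disallowed
# for Googlebot before trusting robots.txt to keep the error report clean.
rules = """\
User-agent: Googlebot
Disallow: /ad-tags/
Disallow: /delivery/
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

print(parser.can_fetch("Googlebot", "/ad-tags/creative-123"))  # False: blocked
print(parser.can_fetch("Googlebot", "/articles/real-story"))   # True: crawlable
```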
11. The Cache in Google News Doesn't Always Match the Main Search Results
I updated an old article with new information and submitted it to Google for re-indexing via the "Request Indexing" tool in Search Console. Google processed the changes within minutes, updated the timestamp in the search results and refreshed the date on the cached page. But, Google News never acknowledged the update in any form.
12. Content Delivery Networks Can Cause Redirect Loops
When reversing a redirect, the destination URL may become unreachable for all users because of a conflicting redirect rule that is cached by a content delivery network (CDN). This can be avoided by flushing the CDN's cache after modifying a redirect map.
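When diagnosing one of these loops, it helps to walk the redirect chain one hop at a time instead of letting the browser or HTTP client follow it silently. A small Python sketch (the hop limit, function name and URL are mine):

```python
from urllib.parse import urljoin
import requests

def trace_redirects(url, max_hops=10):
    """Follow a redirect chain hop by hop and flag loops, which is how a
    conflicting rule cached at the CDN edge tends to surface after a
    redirect is reversed at the origin."""
    seen = {url}
    for _ in range(max_hops):
        response = requests.get(url, allow_redirects=False)
        print(response.status_code, url)
        if response.status_code not in (301, 302, 307, 308):
            return
        url = urljoin(url, response.headers["Location"])
        if url in seen:
            print("Redirect loop detected at", url)
            return
        seen.add(url)
    print(f"Gave up after {max_hops} hops")

trace_redirects("https://www.example.com/old-path")  # placeholder URL
```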
13. Site Improvements Are More Impactful on the Way Up
Search engine penalties are like debt: Compounding works against the site once it’s in a hole and it gets increasingly harder to dig out. I've seen tarnished sites make monumental improvements only to see search traffic rise by 10 to 15% after years of waiting. Conversely, I've seen high-quality sites grow search traffic by 40% within months of making modest improvements.
14. Entire Domains Can Be De-Indexed in Search Console
In the early days of Google Webmaster Tools (the predecessor of Search Console), it was easy to accidentally de-index an entire site. All that was required was to leave the field in the "Remove URL" tool blank and press the button. Within hours, the entire domain would disappear from Google's results. If this command was revoked within a few days of executing it, every page on the site would return to its prior rankings like nothing had happened.
Thankfully, Google made it a bit harder to accidentally de-index a domain in Search Console (now, there is a confirmation prompt). There are good reasons to use this feature, like when a derelict subdomain needs to be quickly nuked from Google's index.
15. The 'Source URL' for Sites in Google News Doesn't Automatically Update After an HTTPS Migration
After moving a site to HTTPS, it’s necessary to contact the Google News team to tell them that the source URL has changed. Here is a link to the contact form.