
Technical Issues That Can Harm Ranks
by: Bob Sakayama
updated 6 January 2015

Where Precision Matters

Some forms of rank loss, and even some penalties, are triggered by technical errors in the implementation of the website. The rules of code syntax are very strict - a one-character error or omission can lead to large failures. Even the most savvy developers have made simple mistakes that turned out to be disastrous, because the consequences of a change are not immediately visible. For example, a change that accidentally prevents the indexing of a valuable url will only be discovered once the ranks collapse, possibly weeks later.
I prepared a list of technical issues that we've found to be at the heart of rank problems experienced by clients.  Note that everything in the list is actually a powerful tool when used correctly, but can blow up the ranks, not to mention the entire site, if applied improperly.
 

robots.txt

Incorrect use of this file can harm the site's ability to hold rank.

This root file instructs bots not to crawl urls.  That's not to CRAWL.  That's different from not to INDEX.  The difference is huge.  Robots.txt will prevent crawling of a url, but if that same url has a link to it from another site, it can still get indexed.  You can't prevent indexing using robots.txt.  To prevent indexing, use a robots meta tag with a noindex instruction.
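The crawl side of this can be checked with Python's standard `urllib.robotparser`. This is only a minimal sketch: the robots.txt content, the example.com urls and the helper name `is_crawl_allowed` are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the path is invented for this example.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_crawl_allowed(robots_txt, user_agent, url):
    """Return True if the given robots.txt text lets user_agent crawl url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# robots.txt only answers the crawl question.
blocked = is_crawl_allowed(
    ROBOTS_TXT, "Googlebot", "http://www.example.com/private/report.html"
)
allowed = is_crawl_allowed(
    ROBOTS_TXT, "Googlebot", "http://www.example.com/index.html"
)

# The index question is answered on the page itself, with a robots meta tag.
# The page has to stay crawlable, otherwise the bot never reads the tag.
NOINDEX_META = '<meta name="robots" content="noindex">'
```

Here `blocked` is `False` and `allowed` is `True`. Nothing in the robots.txt says anything about indexing; that is left to the meta tag.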

My observation is that very few sites get this right - we see many errors involving improper declarations of User-agent, wild cards, sitemaps, syntax, etc.  And because it's not well understood, the file is often ignored after changes have been made that should have been reflected in it. I view a correct, current robots.txt as an indicator of a well-managed site.

Dig Deeper Into robots.txt

This tool can help you get the syntax correct.
Learn more about robots.txt from Google.
Test your robots.txt file in Webmaster Tools

 

.htaccess

Incorrect use of this file can not only harm the site's ability to hold rank, it can also take the website down.

This is a very powerful file used for everything from redirects and rewrites (how you get seo friendly names) to blocking IPs, preventing image hotlinking, password protection, etc.

Common errors involve redirects, https, and regular expressions.

Dig Deeper Into .htaccess

.htaccess - wikipedia.org
Comprehensive Guide To .htaccess - javascriptkit.com
Apache HTTP Server Tutorial - apache.org

 

Domain Name System (DNS)

Incorrectly setting up the domain name servers can completely disable the website, or harm the ability of the site to hold rank.

Pointing the DNS of more than one domain at a single host lets all of those domains be indexed with the same content.  Most devs/hosts have caught on not to do this.

DNS issues are often the root of penalties where subdomains are conflicting with the top-level domain. It's very common to find indexed subdomain clones of sites triggering rank failure.

The www and the non-www versions of the site each require their own DNS record. We strongly advise allowing both, but choosing ONE way to display and index the site to prevent any conflicts/redundancies from being indexed.
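One common way to enforce the "choose ONE" rule is to 301 every request on the other hostname to the preferred one. The Python sketch below shows only the url logic, not a server setup. The function name `canonical_redirect` and the example.com hostnames are assumptions for illustration, and ports are ignored to keep it short.

```python
from urllib.parse import urlsplit, urlunsplit


def canonical_redirect(url, preferred_host="www.example.com"):
    """Return the url to 301 to, or None if no redirect is needed.

    Only the www / non-www pair of preferred_host is handled; any other
    hostname is left alone.
    """
    parts = urlsplit(url)
    host = parts.hostname or ""  # hostname is already lowercased
    bare = preferred_host[4:] if preferred_host.startswith("www.") else preferred_host
    variants = {bare, "www." + bare}

    if host == preferred_host or host not in variants:
        return None

    # Same scheme, path and query, but on the one hostname we want indexed.
    return urlunsplit(
        (parts.scheme, preferred_host, parts.path or "/", parts.query, parts.fragment)
    )
```

For example, `canonical_redirect("http://example.com/page?x=1")` returns `"http://www.example.com/page?x=1"`, while a url already on `www.example.com` returns `None`.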

Our government at work: I've noticed that for years, the NSA website - nsa.gov - did not have both DNS records set up as best practices suggest.  Note that you cannot get to the site by typing "nsa.gov" - you have to use "www.nsa.gov".  For a commerce site, this would be unforgivable, not just for the obvious reason that people might not be able to access the site, but because many natural links are given to a domain without the www, so any rank push from those links would be lost.

 

Dig Deeper Into DNS

Domain Name System - wikipedia.org
The Domain Name System: A Non-Technical Explanation - internic.net

 

Redirects

Incorrectly using redirects can harm the ability of a site to hold rank.

There are a number of ways to achieve a redirect.  The 2 most common are using an instruction in the .htaccess file, or using a header instruction on the url itself.  Redirects can be used to point visitors to a replacement url, to move to another domain, repair a broken link, etc.

Common errors include using a 302 temporary instead of a 301 permanent for any url involving content (302 does not pass PageRank, or PR!). Also common is the chaining of redirects. What most people don't realize is that some PR is lost across every redirect.  This means that chained redirects should be avoided.  If file A changes to file B, then redirect from A -> B.  If B then changes to C, do not chain from A to B to C.  Instead, to conserve PR, use 2 redirects: A -> C and B -> C.
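To make the A, B, C example concrete, here is a minimal Python sketch that collapses a redirect map so every old url points straight at its final destination. The function name `flatten_redirects` and the sample paths are hypothetical, and the flattened map would still have to be turned into 301 rules separately.

```python
from typing import Dict


def flatten_redirects(redirects: Dict[str, str]) -> Dict[str, str]:
    """Rewrite each source -> target pair so target is the end of its chain."""
    flat = {}
    for source, target in redirects.items():
        seen = {source}
        # Follow the chain until we reach a url that is not itself redirected.
        while target in redirects:
            if target in seen:
                raise ValueError("redirect loop involving " + target)
            seen.add(target)
            target = redirects[target]
        flat[source] = target
    return flat


# A was moved to B, and later B was moved to C.
chained = {"/a.html": "/b.html", "/b.html": "/c.html"}
direct = flatten_redirects(chained)
```

With the sample data, `direct` is `{"/a.html": "/c.html", "/b.html": "/c.html"}`: two single-hop redirects instead of a chain. A loop such as A -> B -> A raises an error rather than running forever.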

 

Dig Deeper Into Redirects

URL Redirection - wikipedia.org
Change URLs With 301 Redirects - google.com
Sneaky Redirects - google.com

 

rel = "canonical"

Incorrectly using the rel canonical tag can prevent a url from being indexed.

This tag tells Google which url to index.  If the canonical url is different from the url being viewed, the url being viewed will not be indexed.  Canonicals can point to urls on other domains.

The most common error is in syntax: the homepage url needs a slash at the end (http://www.domain.com/).
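A quick audit of a page's canonical can be sketched in Python with the standard `html.parser`. The class and function names are hypothetical, the sample page is invented, and the trailing-slash check simply follows the advice above rather than any official rule.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class CanonicalFinder(HTMLParser):
    """Collects the href of the first <link rel="canonical"> in a page."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        rels = (attr.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rels and self.canonical is None:
            self.canonical = attr.get("href")


def find_canonical(html):
    """Return the canonical url declared in html, or None if there is none."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


def is_self_canonical(page_url, html):
    """True if the page names itself as the canonical url."""
    return find_canonical(html) == page_url


def missing_root_slash(url):
    """True for a homepage url written without its trailing slash."""
    return urlsplit(url).path == ""


# Invented sample page for illustration.
PAGE = '<html><head><link rel="canonical" href="http://www.domain.com/"></head></html>'
```

Here `is_self_canonical("http://www.domain.com/", PAGE)` is `True`, while the same page viewed at `http://www.domain.com/?ref=1` is not self-canonical, so that variant defers to the canonical url. `missing_root_slash("http://www.domain.com")` flags the syntax error described above.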

 

Dig Deeper Into rel=canonical

Canonical Link Element - wikipedia.org
Use Canonical URLs - google.com
5 Common Mistakes With rel=canonical - googlewebmastercentral.blogspot.com

 

System Automation

Errors in the automation code can harm the ability of a site to hold rank and can take the system down.

Used in every aspect of a modern site, automation is definitely our friend, until a human-introduced error is magnified by it.  Some of the search considerations are discussed in my post on re1y, "Automation and SEO".

Simple tools once successfully used to automate content - used mostly for geo targeting - are much more likely to trigger a penalty now as Panda is able to detect templated content.  Common errors: the creation of large numbers of redundant files, pagination routines, date metrics, inventory, etc.  Anywhere data is involved, automation is needed to manage the data, and that process, if it involves a high ranking website, demands informed management.

 

Multiple Sites

Incorrect implementation of multiple sites can trigger domain level penalties.

Businesses with multiple sites must abide by some rules meant to enforce Google's (overly naive) philosophy that money should not influence the search. For example, you may not use owned sites to push the rank of other owned sites. So if you wish to interconnect owned sites via links, be aware of the ways that ownership is revealed.

The most common error is rampant interlinking of owned sites combined with revealing the ownership of the sites.

To preserve the search compliance of multiple sites with common ownership, keep them completely independent.  Do not share images, scripts, media, etc. with another owned site.

 

Scripts

Incorrect use, or errors in implementation, can harm the ability of a site to hold rank, or trigger penalties.

Scripts are abused in 2 ways - what they do, and how they're used.

A script intended to do something is basically automation - read the section on this above.

An example of a usage error is including scripts meant for one domain on another.  If done with Google products, you are revealing common ownership among domains that should be implemented completely independently of each other, as well as contaminating your data.

 

Local site search

Inappropriate implementation of a site's internal search can harm the ability to hold rank, and trigger penalties.

I'm including this as a separate technical item because of how often we find this problem. Often related to automation, the most common issues relate to inadvertent indexing of local search urls. Many sites use internal search to create pages that populate the site.  A problem occurs when the search result urls get indexed, because there can be a very large number of them. We recommend blocking the urls running the search script with a noindex meta tag.
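Whether a search results page actually carries the tag can be verified with a few lines of Python. This is a sketch only: the class name, the function name `has_noindex` and the sample markup are assumptions, and fetching the pages is left out.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collects the directives from every <meta name="robots"> tag in a page."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "meta" and (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            for directive in content.split(","):
                directive = directive.strip().lower()
                if directive:
                    self.directives.add(directive)


def has_noindex(html):
    """True if the page tells bots not to index it."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    return "noindex" in finder.directives


# Invented markup standing in for a search results page and a normal page.
SEARCH_PAGE = '<head><meta name="robots" content="noindex, follow"></head>'
NORMAL_PAGE = "<head><title>Widgets</title></head>"
```

`has_noindex(SEARCH_PAGE)` is `True` and `has_noindex(NORMAL_PAGE)` is `False`. Running a check like this over a sample of search result urls after each release is one way to catch the tag being dropped by a template change.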