| Web Site Design by Rainbo Design |
Google Canonicalization Problems with WWW in URL
A long time ago, server software makers, ISPs and Web-hosting companies started to configure their servers in a manner that was meant as a convenience for their users, but which can turn into a major problem for search engine rankings. These well-intentioned people decided to allow websites to be accessed whether or not the URL included the very common "www" subdomain prefix. This was often handy for budding webmasters and new Internet users who would omit the three-character prefix when they typed a website address into their browser. But it has come to present a problem from a search engine perspective, chiefly because of how Google handles it.
If you've found this page, the chances are that you know about this already and are simply seeking a solution. But for those who have come here unaware, allow me to explain. By the strictest definitions, the two URLs "http://yoursite.com" and "http://www.yoursite.com" are separate and distinct entities. The first technically points to the domain's root, and the second points to a subdomain named "www". In the earliest days of the World Wide Web, the contents of a website were by convention stored in a directory named with the standard abbreviation "www". Thus, the common practice of making a website's URL begin with that prefix was born. But as the Internet became more popular, and webmasters and IT managers made allowances for the less techno-savvy in the population, various shorthand methods crept into usage. The one we deal with here is making the www prefix optional.
I'm sure it seemed a natural thing to do. When referring to a website by its URL, the "www" part is frequently omitted both in speech and in writing, so it was only logical that users would take the same shortcut when they went online. So, rather than frustrate those users needlessly, servers were configured to allow either version to bring up the same content. Users were happy, IT managers were happy, and webmasters were happy.
But being the product of computer-based logic, search engine algorithms often fail to understand when they should treat these two URLs as one and the same. Google has remained particularly stubborn about this issue, despite overwhelming evidence of the problems it causes. They even have a page in their support section that deals with it. Google now provides a method for webmasters to select a Preferred Domain in Google Webmaster Tools. But that setting applies only to Google, so you should still install the 301 redirect described below.
The problem is two-fold. First, there is the issue of link popularity. Google's vaunted PageRank system depends on links, and it will not always canonicalize (i.e. treat as identical) the URLs in links that omit the www and those that include it. The link popularity is then split between the two versions, which often means lower rankings for the site on most searches than it actually deserves. Second, and frequently as a result of the first, Google won't deep crawl one version of the URL or the other because of either (a) the reduced link popularity/PageRank, or (b) duplicate content penalties. Having the same content available from more than one URL is a violation of the guidelines of all major search engines, and this is the most common result of the Dreaded Missing WWWs Syndrome. Fewer pages logged for a site means that, once again, one version of the URL is not receiving proper link popularity credit for its own internal links.
So the problem compounds itself over time, and can be especially debilitating to sites that weren't all that strong to begin with. Sadly, webmasters are often partially responsible for this problem because, knowing they can "get away with it", they will use the shorthand version when submitting their site to directories or posting links on webpages of their own design. Once this genie is out of the bottle, it's a long battle to overcome, because even if you are able to find every incorrect link on your own site, all it takes is a malformed link on an obscure page that doesn't show up in Google's "link:" command to keep this demon haunting you forever. Fortunately, there is a solution.
The solution is to use server control methods to automatically redirect requests to the proper URL. The server must return a "301 Moved Permanently" result code in order for the search engines to properly assign the link popularity and to update their internal records of the page's true URL.
Websites running on hosts that use the Apache server software usually have it the easiest in this regard because they can control this problem on their own using the .htaccess control file. Just create a simple text file named ".htaccess" with no filename extension, and insert the following command:
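A commonly used form of such a rule is sketched below. It is a minimal example rather than the author's exact listing, and it assumes that Apache's mod_rewrite module is enabled on the host and that the "www" version is the one you want to keep:

```apache
# Assumption: mod_rewrite is available and the www host is the preferred URL.
RewriteEngine On
# Match requests whose Host header is the bare domain (case-insensitive).
RewriteCond %{HTTP_HOST} ^yoursite\.com$ [NC]
# Send the visitor to the www host, keeping the requested path,
# and return a permanent (301) redirect status.
RewriteRule ^(.*)$ http://www.yoursite.com/$1 [R=301,L]
```

The R=301 flag is what makes the redirect permanent; without an explicit status, the R flag defaults to a 302. If the site is served over HTTPS, the scheme in the target URL would need to change accordingly.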
Simply replace "yoursite.com" in the above code with your website's domain name. If your website is hosted on Microsoft's IIS server software, you will need to consult your system administrator for help. Again, be sure the server returns the redirect with result code 301, or you're only papering over the problem and not repairing it. A code 302 result is not acceptable, because it marks the redirect as temporary.
You can check the code your site returns with my Server Header Checker.
If you want your site to rank higher in the search engines, my Search Engine Optimization Services
can give your website what it needs to get your fair share of search engine traffic quickly, without
disturbing your design, and without breaking your budget.
Call Richard L. Trethewey in Minneapolis at 612-408-4057 from 9:00 AM to 5:00 PM Central time to get started on your new website design package or search engine optimization program today!
Search Engine Marketing and Optimization Services by Richard L. Trethewey
Thursday, 07-Aug-2008 16:24:12 MST