In their effort to stop search engine spam, the search engines have implemented new algorithms and filters that remove from their results any page that is an exact replica of another page, or very similar to one.

Since the post about Google’s similarity engine, many clients have asked me how to determine similarity. Search Engine Oracle employs professional software and other tools to analyse a website. Our first analysis is free, but we cannot promise the same generous service for every page you add to your website afterwards. For such situations, there are some free tools available online: a Google search for “similar page checker” will turn up plenty.

Although free, these tools are not as effective as you might think. You always have to enter the URLs you want to check by hand. It’s a time-consuming effort, I’d even say a waste of time: how exactly are you going to determine which URLs to check?

Don’t worry. There is a way. The similarity between pages comes from the content. While the topic can be similar (similar doesn’t mean exactly the same), the content should be unique.
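One rough way to narrow down which URLs deserve a closer look is to compare the text of your own pages pair by pair. The sketch below is only an illustration: it measures how many three-word sequences ("shingles") two texts share, using the Jaccard ratio. The function names, the shingle size and the 0.5 threshold are my own assumptions, and nothing here claims to reproduce how Google or any other search engine measures similarity.

```python
import re
from itertools import combinations


def shingles(text, size=3):
    """Return the set of overlapping word sequences of length `size`."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        # Text shorter than one shingle: treat the whole text as one unit.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def similarity(text_a, text_b, size=3):
    """Jaccard similarity of the two shingle sets, between 0.0 and 1.0."""
    a, b = shingles(text_a, size), shingles(text_b, size)
    if not a or not b:
        return 0.0  # nothing to compare
    return len(a & b) / len(a | b)


def similar_pairs(pages, threshold=0.5, size=3):
    """pages maps URL -> page text; threshold is an arbitrary assumption.

    Returns (url_a, url_b, score) tuples, most similar pair first.
    """
    results = []
    for (url_a, text_a), (url_b, text_b) in combinations(pages.items(), 2):
        score = similarity(text_a, text_b, size)
        if score >= threshold:
            results.append((url_a, url_b, score))
    return sorted(results, key=lambda r: r[2], reverse=True)
```

Feed `similar_pairs` a dictionary of your own URLs and their extracted text, and the pairs it returns are the ones worth checking by hand or with an online tool. Comparing every pair gets slow on large sites, so this is best suited to a few hundred pages at most.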

Copyscape is one of the free tools that can detect plagiarized content. It is a good tool for identifying pages with identical or scraped content. When your web page is the original source, you don’t need to worry: the search engines are able to determine the source of the original content. Not all experts agree that Google can do this, but there is no proof that it cannot. The tricky part might come when you take an article from a free content provider (like Ezinearticles) and publish it on your site. That page might be filtered out of the search engine results or sent to the supplemental index. Yet this will not affect the rest of your site.
