In my previous post, I alluded to the fact that I had discovered another form of duplicate content on one of the sites I monitor. I will discuss it today, but in the interest of disclosure, there will be a part 3 coming shortly, as I discovered a third culprit today. It is amazing how many different things can be overlooked and eventually lead to duplicate content dings from the search engines.
As promised, in my quest to find the source of a search engine traffic decline, I went on the hunt for possible causes. In my last post I talked about the first thing I found that could be contributing. In short, the internal search engine (and Google, for that matter) was indexing two URLs for each page: one with the title in the URL and one without. A quick database update fixed that.
Today, I’ll discuss the second possible culprit I discovered. I did a keyword search in Google to test placement, and to my surprise I found a listing titled “Untitled Document.” I was amazed, because all the pages of my site have been optimized, and none should be without a meta title. Upon further examination, the page that the listing linked to was an internal search engine results page. It turned out that various spam bots were hitting the site, typing in innocuous words such as “face” or “legal,” and then listing the results pages on a “spam” directory page. This resulted in 900+ pages of our site being indexed with the meta title “Untitled Document.” If that isn’t screaming “We are spammers” to Google, I don’t know what is.
OK – once again, now that we’ve identified a problem, on to a solution. A fairly quick fix allowed us to make the meta title of all internal search engine results pages dynamic. Now, when a user types in a query, the meta title is populated automatically to reflect that search. This solved the duplicate content issue, but not the fact that spam bots were using our URLs in their fake directories. More on that, and on “bad neighborhoods,” in a future post, maybe.
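For illustration, here is a minimal sketch of that kind of fix in Python. It is not the code running on the site: the function name build_search_title, the site name "Example Site," and the title format are all made up for the example. The idea is simply that the title is built from the visitor's query, escaped so it is safe inside HTML, with a fallback so the page never ends up with an editor default like “Untitled Document.”

```python
import html


def build_search_title(query, site_name="Example Site", max_len=60):
    """Return a <title> element for an internal search results page.

    Hypothetical helper: the name, site name, and format are assumptions.
    """
    # Collapse runs of whitespace and cap the length so odd bot input
    # cannot produce a sprawling title.
    cleaned = " ".join(query.split())[:max_len].strip()
    if not cleaned:
        # Empty query: still emit a meaningful, non-default title.
        return f"<title>Search | {html.escape(site_name)}</title>"
    # Escape the query so characters like < or & cannot break the markup.
    return (
        f"<title>Search results for {html.escape(cleaned)}"
        f" | {html.escape(site_name)}</title>"
    )


print(build_search_title("legal"))
# <title>Search results for legal | Example Site</title>
```

Each distinct query now produces a distinct title, so the results pages no longer share one generic title.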
Thankfully, the site puts out enough content on a regular basis that the engines index us quite frequently. About a week after the fix went in, a site search for the words “untitled document” shows only 8 results, compared to over 900 a week and a half ago.
The point of this story: any time you have automatically generated pages on your site, check the meta information occasionally to be sure it is populated with what you want. Otherwise, spam bots getting hold of those pages and posting them for the engines to index can cause a world of problems.
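That occasional check can be scripted. The sketch below uses only Python's standard library and assumes you already have the HTML of the pages you want to audit; the URLs and sample pages are invented for the example, and find_bad_titles is a hypothetical helper, not part of any SEO tool. It flags pages whose title is missing or still set to a default.

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collects the text inside the first <title> element of a page."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.done = False
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "title" and not self.done:
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self.in_title:
            self.in_title = False
            self.done = True

    def handle_data(self, data):
        if self.in_title:
            self.parts.append(data)


def extract_title(page_html):
    """Return the page title with whitespace normalized ('' if none)."""
    parser = TitleParser()
    parser.feed(page_html)
    parser.close()
    return " ".join("".join(parser.parts).split())


def find_bad_titles(pages, bad_titles=("", "untitled document")):
    """pages maps URL -> HTML; return URLs with a missing or default title."""
    return [
        url
        for url, page_html in pages.items()
        if extract_title(page_html).lower() in bad_titles
    ]


# Made-up sample pages standing in for fetched HTML.
pages = {
    "/search?q=face": "<html><head><title>Untitled Document</title></head></html>",
    "/about": "<html><head><title>About Us</title></head></html>",
    "/search?q=legal": "<html><head></head><body>No title here</body></html>",
}
print(find_bad_titles(pages))
# ['/search?q=face', '/search?q=legal']
```

Run against a sample of automated pages every so often, a check like this would have caught the “Untitled Document” problem before the spam bots did.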
As I said previously, today I came across another form of duplicate content, this time one that makes every page of the site appear duplicated. More on that in the next post.