Google’s Panda Update – A Useful Article

As someone with a keen interest in the updates Google rolls out every now and then, I was curious to see what would happen to the sites we were optimising when Panda ‘hit’. I am pleased to report that, as with all the previous updates, our sites were not damaged at all ranking-wise, and all continued their upward march.

Still, for those who are interested, the blog below is very informative indeed, despite (as with all of the blogs on this subject) not really knowing 100% what Google is really up to.

On the matter of using copied material in this blog, I am of the opinion (along with, I think, the author of this information) that copying is OK, as long as you give credit to the source. I also believe that Google is, to a degree, quite happy about copied material, as long as something ‘extra’ is added to the pot.

Anyway, to the ‘nitty gritty’, the actual content of the blog:

*****
See what they have to say
*****

Despite much discussion, analysis, and guesswork across hundreds of Websites, it seems that the true nature of Google’s Panda Technology™ continues to elude the algorithm chasers. Maybe it’s time for a reset in terms of what people think Panda is and represents and does. But rather than try to persuade people to accept whatever I might think or conclude, I feel it would be more productive to go down the list of things we can all easily verify about what Panda is or does.

The Panda Technology™ was implemented as a document classifier

A document classifier is a computer program that performs a specific function or set of functions in evaluating documents for a searchable index. A few examples: Classifiers may identify or label or classify documents, tagging them for further processing; classifiers may sort documents into groups; classifiers may score documents against measurement signals; classifiers may reduce or alter document data; classifiers may deduce, tally, or compute data about documents or document sets and associate the results with documents; classifiers may scan documents for specific qualities.

A simple definition for “document classifier” might be a special program that solves a specific problem for “a document space and a set of document classes”. Okay, that’s somewhat jargonistic. But think of the primary function of the document classifier as performing a task that solves a problem. The problem might be, “How do I divide this set of documents into ‘stuff over here’ and ‘stuff over there’?”
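To make the "stuff over here" versus "stuff over there" idea concrete, here is a minimal sketch of a two-class document classifier. It is purely illustrative and is not how Google's classifier works: the `Document` fields, the two signals (`word_count`, `copied_fraction`) and the thresholds are all assumptions invented for this example.

```python
from dataclasses import dataclass


@dataclass
class Document:
    url: str
    word_count: int          # hypothetical signal: length of the main text
    copied_fraction: float   # hypothetical signal: share of text found elsewhere (0.0 to 1.0)


def classify(doc, min_words=300, max_copied=0.5):
    """Assign a document to one of two classes using made-up thresholds."""
    if doc.word_count < min_words or doc.copied_fraction > max_copied:
        return "stuff over there"
    return "stuff over here"


def partition(docs):
    """Divide a document space into the two classes, keeping input order."""
    groups = {"stuff over here": [], "stuff over there": []}
    for doc in docs:
        groups[classify(doc)].append(doc.url)
    return groups
```

A real classifier would presumably learn its decision boundary from many signals rather than use two hand-picked thresholds, but the job is the same: take a document space and a set of classes, and decide which class each document belongs to.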

In the February 2011 Wired Magazine interview, Matt Cutts said: “…we actually came up with a classifier to say, okay, IRS or Wikipedia or New York Times is over on this side, and the low-quality sites are over on this side.”

The Panda Technology™ appears to have helped some scraper sites

I have trouble finding queries where scraper sites are outranking their original sources. Yet I see many people complaining that their sites A) have lost traffic since being affected by Panda and B) are now being outranked by scraper sites.

It is pretty easy for me to find scraper sites. I need only take article titles from sites like SEO Theory, SEOmoz, Search Engine Land, and Marketing Profs and search on them. Marketing Profs was complaining about being outranked by syndication partners earlier this year but when I performed some queries on their article titles just before writing this article, I found their site ranking above all the others.

The same is true for SEO Theory, SEOmoz, and Search Engine Land. I have known for years that these sites were being scraped by Web spammers, and I know this blog gets scraped a LOT.

So what’s the difference between me and someone whose site is being outranked by scraper sites? I’m pretty sure that SEO Theory hasn’t been downgraded by Panda.

The Panda Technology™ is looking at sites

According to the Google Blog post that Amit Singhal and Matt Cutts published on February 24, “This update is designed to reduce rankings for low-quality sites—sites which are low-value add for users, copy content from other websites or sites that are just not very useful. At the same time, it will provide better rankings for high-quality sites—sites with original content and information such as research, in-depth reports, thoughtful analysis and so on.”

The Panda Technology™ uses “new signals”

In a February 25 blog post, Wall Street Journal writer Amir Efrati stated that “Singhal did say that the company added numerous ‘signals,’ or factors it would incorporate into its algorithm for ranking sites. Among those signals are ‘how users interact with’ a site.”

In an April 11 update published on their original Panda announcement, Singhal/Cutts wrote: “We’ve rolled out this algorithmic change globally to all English-language Google users and incorporated new signals as we iterate and improve.”

It’s interesting that the Wired article doesn’t mention user interaction with sites, whereas the WSJ blog post partially summarizes Singhal’s statement. We don’t have a clear indication that Google is actually trying to monitor user engagement with Websites. Nonetheless, this has proven to be one of the areas where certain SEOs have focused much of their attention.

The Panda Technology™ uses multiple signals

On March 9, Vanessa Fox recapped new information about Panda disclosed during the SMX West conference. She quoted “Google’s words” as:

Our recent update is designed to reduce rankings for low-quality sites, so the key thing for webmasters to do is make sure their sites are the highest quality possible. We looked at a variety of signals to detect low quality sites. Bear in mind that people searching on Google typically don’t want to see shallow or poorly written content, content that’s copied from other websites, or information that are just not that useful. In addition, it’s important for webmasters to know that low quality content on part of a site can impact a site’s ranking as a whole. For this reason, if you believe you’ve been impacted by this change you should evaluate all the content on your site and do your best to improve the overall quality of the pages on your domain. Removing low quality pages or moving them to a different domain could help your rankings for the higher quality content.

This appears to be the first time Google associates “low-quality sites” with “shallow or poorly written content, content that’s copied from other websites, or information that are just not that useful”.

Also, Google noted that “low quality content on part of a site can impact a site’s ranking as a whole”.
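One way to picture how part of a site could drag down the whole is a site-level score built from per-page scores. The sketch below is a toy model under stated assumptions, not Google's method: the per-page scores, the simple averaging and the `threshold` value are all hypothetical.

```python
def site_quality(page_scores):
    """Average hypothetical per-page quality scores (0.0 to 1.0) into one site-level score."""
    if not page_scores:
        raise ValueError("site has no pages")
    return sum(page_scores.values()) / len(page_scores)


def without_low_quality(page_scores, threshold=0.3):
    """Return a copy of the site with pages scoring below the (made-up) threshold removed."""
    return {url: score for url, score in page_scores.items() if score >= threshold}


# Hypothetical site: two strong pages and two thin ones.
scores = {"/guide": 0.9, "/review": 0.8, "/thin-1": 0.1, "/thin-2": 0.2}
before = site_quality(scores)
after = site_quality(without_low_quality(scores))
```

In this toy model the two thin pages pull the site average down, and removing them raises it, which is the same direction as Google's advice about removing low quality pages or moving them to a different domain.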

Panda-downgraded sites have reduced crawl

In another post summarizing SMX West Panda information, Vanessa Fox wrote: “Matt also noted that if Google determines a site isn’t as useful to users, they may not crawl it as frequently.”

As Eric Enge’s interview with Matt Cutts revealed:

There is also not a hard limit on [Google's] crawl. The best way to think about it is that the number of pages that we crawl is roughly proportional to your PageRank. So if you have a lot of incoming links on your root page, we’ll definitely crawl that. Then your root page may link to other pages, and those will get PageRank and we’ll crawl those as well. As you get deeper and deeper in your site, however, PageRank tends to decline. Another way to think about it is that the low PageRank pages on your site are competing against a much larger pool of pages with the same or higher PageRank. There are a large number of pages on the web that have very little or close to zero PageRank. The pages that get linked to a lot tend to get discovered and crawled quite quickly. The lower PageRank pages are likely to be crawled not quite as often.

By simple inference, one can conclude that a reduced amount of crawl may indicate a reduction in PageRank. You should review the entire interview, however, as Matt points out some situations where Google may be prevented from crawling a site efficiently by factors beyond its control.
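The relationship Matt describes can be sketched with the textbook power-iteration form of PageRank on a tiny site. This is only an illustration under assumptions: the link graph, the damping factor of 0.85, and the idea of splitting a fixed crawl `budget` in exact proportion to PageRank are all invented here, and Google's real crawl scheduling is not public.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank over a dict of page -> list of outgoing links.

    Every link target must also appear as a key in `links`.
    """
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outlinks in links.items():
            if outlinks:
                share = damping * rank[page] / len(outlinks)
                for target in outlinks:
                    new_rank[target] += share
            else:
                # Dangling page: spread its rank evenly over all pages.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank


def crawl_shares(rank, budget):
    """Split a hypothetical crawl budget in proportion to PageRank."""
    total = sum(rank.values())
    return {page: budget * r / total for page, r in rank.items()}


# Hypothetical site: root -> sections -> deeper pages, all linking back to root.
site = {
    "root": ["section-a", "section-b"],
    "section-a": ["deep-1", "deep-2", "root"],
    "section-b": ["root"],
    "deep-1": ["root"],
    "deep-2": ["root"],
}
ranks = pagerank(site)
shares = crawl_shares(ranks, budget=1000)
```

In this toy graph the root ends up with the most PageRank, the section pages with less, and the deepest pages with the least, so a proportional budget visits deep pages least often. Run in reverse, that is the inference in the paragraph above: a shrinking crawl may hint at shrinking PageRank.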

Google ties the algorithm to their Webmaster Guidelines

In his April 11 blog post announcing the international English-language rollout for Panda, Amit Singhal wrote: “Based on our testing, we’ve found the algorithm is very accurate at detecting site quality. If you believe your site is high-quality and has been impacted by this change, we encourage you to evaluate the different aspects of your site extensively. Google’s quality guidelines provide helpful information about how to improve your site. As sites change, our algorithmic rankings will update to reflect that.”

*****

For the full story, see the blog in full here
