Webmaster Papers





Google Search Algorithm Patent Application Creates Spring Buzz!


Google applied for a patent on its ranking algorithm 15 months ago, on December 31, 2003, and that application was posted at the US Patent Office on March 31st. It got the discussion forums buzzing this weekend. Even though I had substantial work to do and was behind on a project, I couldn't resist the temptation to read the very long 14,000-word, 45-page application and see what it could mean to the volatile world of search.

So I tripped on over to the US Patent & Trademark Office (USPTO) and started reading. The document, United States Patent Application: 0050071741, seems to be Google applying for a patent on their search algorithm. There is no reference to PageRank by name, but it reads like PageRank redefined with a few variations to limit link spamming and reduce stale results, along with multiple innovative elements not previously considered.

They discuss link spamming limitations extensively, which would be a welcome relief as Linking Psychosis is rampant and I'd like to see an end to it. Much of the use of historical data related to pages seems a bit onerous, because it would appear to limit the perceived value of a page unless it becomes wildly popular over time. Bigger is better seems to be an enduring theme of this algorithm as described generically in the text of their application.

An odd addition to the historical ranking discussion is, amazingly, the "Advertising Traffic" for a particular document! They will rank a site based on which advertisers choose to advertise on it. If Amazon wants to advertise on your site, then Google will rank you higher!

That's good, I guess, if you have a site that attracts highly rated advertising, and don't rely on cross promotion of your separate products or those of suppliers to appear in your site advertising. Example: If I have a discussion forum on coffee, don't I want to advertise my coffee products? Why would I serve ads from highly rated advertiser Starbucks to rank higher at Google? What if I sell thousands of products and simply cross promote and upsell my own products sitewide? Odd stuff, ranking based on advertisers.

How does affiliate advertising factor into that advertising element of the algorithm? Do they know you are advertising a book from Amazon through your direct Amazon affiliate program links? Do they recognize tracking links through affiliate management companies differently than the tracking URLs of ad serving monsters like DoubleClick, and confer higher ranking upon the big boys of advertising above affiliate tracking firms?

It also seems to call into question their own AdSense ads and how they factor into this algorithm! Do the AdSense ads along my blog border gain more ranking score because they are from a monster advertising company - Google - or are they downgraded because I'm not a "Premium" advertiser serving over 20 million content page views? Again, it seems that the reward for being large outweighs relevance in this formula. Or does it? How do they value Overture advertising in the formula? Adbrite? Smaller ad networks versus large advertising aggregators?

They extensively discuss historical data related to rankings over time, looking at seasonality, popularity during spikes in traffic due to news coverage of particular topics, and changes in ranking related to those items. The historical data related to ranking over time is interesting, since they refer to link spamming, relevance, and topicality when they say:

"As a further measure to differentiate a document related to a topical phenomenon from a spam document, search engine may consider mentions of the document in news articles, discussion groups, etc. on the theory that spam documents will not be mentioned, for example, in the news. Any or a combination of these techniques may be used to curtail spamming attempts."

They've added another interesting element to the algorithm: determining the value of pages based on "user maintained/generated data" (patent item 113). Read that as the "bookmarks" and "favorites lists" built into your browser. Is this one of the reasons that Google recently hired Ben Goodger, the lead developer of Firefox?

Snooping into my favorites and cookies on my machine seems like a bit more than I want Google doing on MY machine. It strains the limits of privacy as well. We can stop sites from serving us cookies, but can't stop who reads them? Ouch!

Further, they reference users' browser cache files as a method of determining the value of a site. "For example, the "temp" or cache files associated with users could be monitored by search engine to identify whether there is an increase or decrease in a document being added over time. Similarly, cookies associated with a particular document might be monitored by search engine to determine whether there is an upward or downward trend in interest in the document." Apparently they can see this info, but I'd like them to stay out of my cache and cookies too!
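
To make the "upward or downward trend" wording concrete, here is a minimal Python sketch. The application does not say how such a trend would be calculated, so the function name trend_direction, the idea of periodic counts, and the least-squares slope test are all illustrative assumptions, not anything taken from the filing.

```python
def trend_direction(counts: list[int]) -> str:
    """Classify a series of periodic counts as 'upward', 'downward' or 'flat'.

    Hypothetical illustration only: the sign of a least-squares slope
    is one simple way to express a trend over time.
    """
    n = len(counts)
    if n < 2:
        return "flat"
    mean_x = (n - 1) / 2
    mean_y = sum(counts) / n
    # The slope's denominator is always positive, so the sign of this
    # numerator alone decides the direction of the trend.
    numerator = sum((i - mean_x) * (c - mean_y) for i, c in enumerate(counts))
    if numerator > 0:
        return "upward"
    if numerator < 0:
        return "downward"
    return "flat"


# Made-up weekly counts of how often a document shows up in cached files.
print(trend_direction([10, 12, 15, 21]))
print(trend_direction([30, 22, 18, 9]))
```

Whatever is actually measured, the point is the same: a single number per period is enough to tell rising interest from fading interest.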

It appears to apply further penalties to new sites by keeping them poorly ranked for even longer periods. It also adds an apparently new item not seen in algorithms before (or at least not discussed publicly): long-term purchase of domain names, along with historical data related to IP address and hosting company! Here's the snip about the relationship between longevity of domain registration and ranking:

"[0099] Certain signals may be used to distinguish between illegitimate and legitimate domains. For example, domains can be renewed up to a period of 10 years. Valuable (legitimate) domains are often paid for several years in advance, while doorway (illegitimate) domains rarely are used for more than a year. Therefore, the date when a domain expires in the future can be used as a factor in predicting the legitimacy of a domain and, thus, the documents associated therewith."
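
As a rough illustration of how an expiry date could be turned into a number, here is a minimal Python sketch. The application gives no formula, so the function name registration_horizon_score, the linear scaling, and the example dates are assumptions made purely for illustration; only the 10-year ceiling comes from the quoted text.

```python
from datetime import date

# The quoted paragraph notes domains can be renewed up to a period of 10 years.
MAX_REGISTRATION_YEARS = 10


def registration_horizon_score(expires_on: date, today: date) -> float:
    """Hypothetical signal in [0, 1]: how far in the future a domain expires."""
    days_left = (expires_on - today).days
    if days_left <= 0:
        return 0.0
    years_left = days_left / 365.25
    # Cap at 1.0 so a full 10-year registration scores the maximum.
    return min(years_left / MAX_REGISTRATION_YEARS, 1.0)


today = date(2005, 4, 4)
short_term = registration_horizon_score(date(2005, 12, 1), today)
long_term = registration_horizon_score(date(2013, 4, 4), today)
print(f"expires within the year: {short_term:.2f}")
print(f"paid up eight years out: {long_term:.2f}")
```

A real system would presumably weigh this against many other signals; the sketch only shows that the expiry date alone is enough to separate a one-year doorway registration from a long-term one.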

I'll be extending the term of my domain registrations ASAP! What a boon to registrars if that element of ranking becomes as valued as linking has been! Everyone will get 10-year registrations if they want to rank well. The domain name aftermarket will also change dramatically if this element becomes as important to ranking as the application makes it appear. People will buy and sell domains when disposing of them rather than simply letting them expire at the end of the registration period, as most do now.

It appears they will be penalizing domains "associated" with "illegitimate" domains. Hopefully they have a method of determining that it isn't a competitor linking to your domain from their "illegitimate" domain! That suggests they will be able to eliminate "Domain Scrapers" that have been known to scrape search engine results for high-ranking domains and post them on "illegitimate" domains, which in effect drags down the ranking of those previously highly ranked domains. How odd the search world is sometimes!

Altogether, it seems that older content will suffer overall because it hasn't changed, because nobody new is linking to it and because it will lose links over time. What if you are posting a historical document that you can't change or an authored piece that is copyrighted? Does it decrease the value of the information? Hmmmm. I guess links would continue to increase if the information remains valuable, so there is some protection in that. But older site content may be unchanged because it is popular, not because it is stale - that's an odd Catch-22.

The anchor text issue discussed in this patent application suggests that "[0118] Unique Words, Bigrams, Phrases in Anchor Text" are significant in determining rank. If links develop naturally, their text will vary as webmasters link to a document in different ways: some would use the URL itself as the link text, others would use text requested by the webmaster if the link came from a successful link request, and still others might simply use Google's own Blogger "Blog This" link, which simply takes the page title. (I routinely change link text generated by "Blog This" in my blog posts to emphasize the topic discussed and eliminate business/publication names usually added ahead of the topic of the page.)
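
The intuition that natural links vary while manufactured links repeat can be sketched in a few lines of Python. This is not the method in the application, which only lists unique words, bigrams, and phrases as data of interest; the function name anchor_text_diversity, the simple distinct-to-total ratio, and the sample anchors are hypothetical.

```python
def anchor_text_diversity(anchors: list[str]) -> float:
    """Hypothetical measure: share of inbound links with distinct anchor text."""
    normalized = [a.strip().lower() for a in anchors if a.strip()]
    if not normalized:
        return 0.0
    return len(set(normalized)) / len(normalized)


# Made-up examples: links that grew naturally versus one phrase repeated.
natural = [
    "http://example.com/coffee-brewing",
    "Coffee Brewing Guide",
    "great post on brewing",
    "Example Coffee: Brewing Guide",
]
manufactured = ["cheap coffee beans"] * 9 + ["coffee"]

print(anchor_text_diversity(natural))       # 1.0
print(anchor_text_diversity(manufactured))  # 0.2
```

A low ratio on a large number of links would be the kind of pattern that looks requested or purchased rather than earned.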

The US Patent office has a link to images including illustrations and figures that are linked to the filing but they are absurdly large and don't fit in the viewable framed window. This is silliness. Do they mean to hide it by making it unviewable?

I'll attempt to post a smaller version of images on my blog.

The final notable item, it seems to me, is the clickthrough data that Google sees from their own search results. They will rank sites higher that get significant clickthrough rates from the Google SERPs.

"Google may monitor the number of times that a document is selected from a set of search results and/or the amount of time one or more users spend accessing the document. Search engine may then score the document based, at least in part, on this information."
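
Here is a minimal Python sketch of how those two quantities, selections from results and time spent on the document, might be folded into one score. The application does not give a formula and does not say how viewing time would be gathered, so the ResultStats record, the engagement_score function, the five-minute cap, and the multiplication are all assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class ResultStats:
    """Hypothetical per-document tallies; all field names are illustrative."""
    impressions: int             # times the document appeared in results
    clicks: int                  # times it was selected from those results
    total_seconds_viewed: float  # summed viewing time, however it is measured


def engagement_score(stats: ResultStats, dwell_cap_seconds: float = 300.0) -> float:
    """Combine clickthrough rate with capped average viewing time (0 to 1)."""
    if stats.impressions == 0 or stats.clicks == 0:
        return 0.0
    clickthrough_rate = stats.clicks / stats.impressions
    average_dwell = stats.total_seconds_viewed / stats.clicks
    # Cap the dwell factor so one very long visit cannot dominate the score.
    dwell_factor = min(average_dwell / dwell_cap_seconds, 1.0)
    return clickthrough_rate * dwell_factor


# Made-up numbers: 1,000 impressions, 200 clicks, 150 seconds per visit.
stats = ResultStats(impressions=1000, clicks=200, total_seconds_viewed=30000.0)
print(f"{engagement_score(stats):.2f}")  # 0.10
```

Multiplying the two factors means a page that is clicked often but abandoned quickly scores low, which fits the spirit of the quoted passage.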

How will they know how long I spend accessing the document unless they can monitor my actions AFTER I've left the Google SERPs to visit the linked site? Wonder what's at work in that? Do they have some way of tracking our actions after we leave their site? I wonder if this has anything to do with Google's acquisition of the Urchin traffic statistics company last week.

Well, it's back to work for now, but it will be interesting to see where this patent application is discussed in forums and SEO blogs over the coming week.

Mike Banks Valentine is a Search Engine Optimization Specialist and blogs about the search world at: http://RealitySEO.com while operating a small business ecommerce tutorial at: http://website101.com

RELATED ARTICLES


Search Engine Wars - Quality Searches Vs Quantity!
It is no secret that Google and Yahoo are in a continuous battle to win our hearts and get everyone to convert, but is converting someone really a matter of the quantity or the quality?
Getting To Know Google
Having greatly benefited from my relationship with Google in the past several years, I am dedicating this article to the search engine superstar.
Why Optimize Your Site For Search Engines?
Sometimes a search engine optimization company will miss that glaring question posed by potential clients and assume the benefits of search engine optimization are obvious to everyone. While shelling out a couple thousand on an SEO campaign is common sense to some, others may find it hard to part with the cash unless they know it is an investment in their business that is sure to bring a good return.
Search Engine Optimization Tips For 2005 - Part Three
Welcome to part three of our series of articles on search engine optimization. In the third and final part of our series of articles on search engine optimization we cover the topic of links, the types of links and what makes them so important.
Are You Making These Deadly SEO Mistakes?
Black Hat SEO: Web Spamming and Linking to Bad Neighborhoods
Design A Spider Friendly Site
To be successful in the search engines it's important to design your web site with the spiders in mind. Using the latest in web page design is not generally the best way to go. Spiders don't view web pages like humans do, they must read the HTML in the page to see what it's about. Below you will find tips on how to best design your web site with search engines in mind.
SEO Expert Guide - Keyword Analysis (part 3/10)
If you imagine that building an optimized site is like cooking a meal, then keywords are the essential ingredients. Would you attempt to cook a complex new dish without first referring to a recipe? Would you start before you had all the ingredients available and properly prepared?
How I've Maintained 7 Top Ten Google Rankings For Nine Months
Back in November 2004 I discovered a way to get a top 10 ranking in Google. I tested the technique for 3 months before I shared my findings with the world.
How to Get Listed in Yahoo Within 48 Hours, Without Paying $299
First of all you need to get a blog. If you don't have one yet, you can get one for free by going to www.blogger.com
How to Prevent Duplicate Content with Effective Use of the Robots.txt and Robots Meta Tag
Duplicate content is one of the problems that we regularly come across as part of the search engine optimization services we offer. If the search engines determine your site contains similar content, this may result in penalties and even exclusion from the search engines. Fortunately it's a problem that is easily rectified.
Complete Web-Site Optimization For Search Engines (Part 1)
SEO, or search engine optimization, strategy is now widely popular among online business operators. There is nothing strange about that, as it allows you to substantially increase your gross income as a result of growing traffic or visitor flow.
SEO Expert Guide - Paid Site Promotion (Marketing) (part 7/10)
In parts 1 - 6 you learnt how to develop your proposition, identify your key words and optimize and promote (for free) your site and pages. You were also introduced to our mythical Doug (who sells antique doors, door handles, knockers, door bells or pulls and fitting services) in Windsor in the UK.
Search Engine Metrics: Organic Search vs. Paid Placement
Let me preface this report by citing advertisers in 2004 have spent 4 Billion dollars on search engine marketing according to the Search Engine Marketing Professional Organization (SEMPO).
The Great Search Engine War, Where Content is King
When search engines first appeared, they were simple affairs consisting of a relatively basic database containing small amounts of information about websites. The search engine database allowed web-surfers to search for specific words or phrases. The search engine would then provide a list of hyperlinks to websites containing those words or phrases in several Search Engine Results Pages (SERPS). The basic concepts of a search engine are still the same, but much has changed since those innocent days.
Screwed: Is this an inevitability in the SEO World?
By about 2pm every day, each of my team members has spoken to a good handful of clients and potential clients who have been speaking with other SEO firms. This is an absolutely wonderful thing to see, as in the past in our industry, not enough of our consumers were questioning what they were purchasing. It is a sign that accountability will come and the bad guys will be weeded out.
While this is a good sign, it's the cause of my having to answer the same questions over and over. The consumers in the SEO world are being fed out and out lies by some of the people who call themselves experts in the area of Search Engine Optimization. They hear these lies and, while comparing prices, contact us at Abalone Designs. They then proceed to tell me everything that all of these other companies promised them and I am utterly astonished. Here are some of the most asinine claims I hear through the grapevine.
"We can guarantee your rankings"
Don't be fooled! Ask the company what you will be ranking for, immediately! 9 times out of 10, a company that guarantees you rankings is guaranteeing that you will rank for your own company name, which means people on Google or MSN or Yahoo! would have to know your company name before searching. How does this produce new customers and visitors to your site? Chances are, as soon as these search engines index your site, you will rank within the top ten for your company name, if not first, because it is unique. Why bother paying someone for something that is already going to happen, anyway?
Guaranteeing rankings is highly unethical. It is impossible to guarantee rankings unless you have access to Google's database itself, and even then I'm not sure it's possible. Keep in mind, we are working with a 3rd party here. A highly guarded 3rd party that doesn't, under any circumstances, reveal its secrets. No one outside of the companies that run these search engines knows what it is exactly that makes search engines rank sites high, especially since these search engines and the rigorous ranking filters they use to spit out search results change almost monthly. Even a former Google employee doesn't know how to guarantee rankings! If someone is telling you they'll guarantee top rankings, run fast! Those are some shady, shady claims. Google themselves have said:
"No one can guarantee a #1 ranking on Google - Beware of SEO's that claim to guarantee rankings, or that claim a "special relationship" with Google, or that claim to have a "priority submit" to Google. There is no priority submit for Google. In fact, the only way to submit a site to Google directly is by using the page at http://www.google.com/addurl.html. You can do this yourself at no cost whatsoever." - http://www.google.com/intl/en/webmasters/seo.html
An ethical SEO company will not guarantee rankings. They will guarantee that their methods follow search engine guidelines, and they will guarantee customer satisfaction, but at no point in time will any SEO company with a conscience guarantee your rankings.
"Your site needs to be continually resubmitted to get on and stay on the search engines"
When will I see the end of this one? How old is this method now? 5 years? 10 years? We're talking about the days when Webcrawler was the biggest search engine and all computers were beige! This claim is so fully untrue that, had Pinocchio uttered it, his nose would have stretched from Rome to Poughkeepsie. And the good folks at Google will once again back me up on this one:
"Submission is not necessary and does not guarantee inclusion in our index. Given the large number of sites submitting URLs, it's likely your pages will be found in an automatic crawl before they make it into our index through the URL submission form. We DO NOT add all submitted URLs to our index, and cannot predict when or if they will appear." - http://www.google.com/intl/en/webmasters/1.html#A2
"Meta tags are not important anymore"
Sure they aren't. If you don't want a decent ranking on MSN. The new MSN search places a lot of value on the keywords and description meta tags. Without these tags in your site's code, your ranking on MSN will suffer. Just as importantly, if your keywords and description meta tags don't use proper language, your rankings will suffer. The description tag is also what MSN uses as the visible description for a site in the search results. And of course, to prove I'm not the one blowing hot air, here is what MSN themselves say about it:
"Site descriptions are extracted from the content of your page each time MSNBot crawls your site and indexes its pages... ...the best way to affect your site description is to ensure that your web pages effectively deliver the information you want to see in search results." - Click here to see the page this is found on.
"Your Web Site Has Been Sabotaged"
This one is truly unreal. I can't believe it's even been used as an excuse for why an SEO company hasn't achieved decent rankings for you. But alas, more than one SEO company has told potential clients of ours that the reason they are not ranking well, or why their search engine optimization campaign is not effective, is because someone else has been sabotaging the site. Some of the clients who have been told this are small businesses, like bed and breakfasts or pet sitters. We always explain to these potential clients that for someone to even have the initial idea to sabotage a web site, the site in question would have to be a fairly large one, and the target of a lot of hatred. Why? Because sabotaging a web site's rankings takes a massive amount of time and energy. We're talking months, maybe even years of hard, hard work. Why would anyone devote months or years of their life to taking down a pet sitting site? Or a bed and breakfast?
Once again, these are some hefty claims and it is a clear sign that the company who is running your SEO campaign is unwilling to be held accountable for their actions or lack thereof.
Don't Put Up With It
The bottom line is, your search engine optimization company works for you. You are paying them. Hold them accountable as you would any other vendor. Keep reading these articles, read info at the search engines, educate yourself and if something your SEO says smells a little rotten, don't be afraid to call them on it.
Is being screwed an inevitability in the SEO world? Damn near. But thanks to the increasing interest of our consumers in self-education and their increased questioning, our industry will slowly climb out of the gutter and someday down the line, send this article into antiquity. In spite of my pride, I'd be overjoyed to see that day come.
How Search Engines Work
Before anyone can start optimizing a web site, you must understand how search engines work.
How to Improve Your Search Engine Positioning and Increase Traffic Today
Every website has times when traffic is higher than others. However, in the downtimes you need to figure out why your traffic is lower and what you can do about it. The following suggestions have been proven to increase website traffic and will be effective in getting you more customers. However, before implementing these tips into your website promotion plan, make sure you have a clear understanding of how to perform them effectively, because if you are not aware of how to do something right, it could possibly backfire and work against you.
Types of Links
Internal Linking - An introduction
Search Engine Optimization for Beginners
If you are confused about terms like "search engine optimization" or having a "search engine friendly" site, then listen up! I am here to help.
How To Start An Internet Business - Meta Tags and Keyword Density
Okay, you have a domain name, layout and content. Now we get to a step that will go a long way to determining how the site will rank. Yes, we are going to focus on two infamous topics, meta tags and keyword density.