Google Caffeine Roundup

by admin on September 30, 2009

So September has arrived and, as yet, there has been no announcement telling us when the Google Caffeine update will go live. For anyone who has been on vacation for the past month or so, ‘Caffeine’ is the code name for a revised architecture of Google’s web search.

Matt Cutts has confirmed that the Caffeine search infrastructure is built on top of a complete overhaul of the company’s custom-built Google File System (GFS). In fact, one of the things that Caffeine relies upon is next-generation storage. He has described the Caffeine update as the most significant change to Google search since the Big Daddy update of late 2005 and early 2006.

For a number of weeks we’ve been allowed to check out Caffeinated search using the sandbox facility provided by the big G. I’ve been conducting a few test searches every now and then, and my brief assessment is that Caffeine is indexing onsite changes and fresh content far more quickly than the current de-caffeinated version of Google, which is clearly a good thing.

Here are a handful of related posts:

Firstly, there is the Matt Cutts blog (More info on the Caffeine Update), where you can view an interview with Mr Cutts, take a look at his non-existent hair and see that he is not responding to questions asking when Caffeine will go live. There are some interesting observations in the comments, like the one from Tom Forrest, who noticed that some sites with a high number of indexed pages in the current Google have significantly fewer pages indexed by Caffeine.

There is an interesting post over at the Register (Google Caffeine: What it Really Is) that confirms that Caffeine is fundamentally a re-architecting of how Google indexing works. It is supposed to speed up the process of indexing documents and making them available to search. Mr Cutts has apparently confirmed that Caffeine is built on top of the completely overhauled Google File System, referred to as GFS2.

One thing that is particularly notable about Caffeine and the way that it is being introduced is how Google have invited feedback from people they refer to as ‘power users’. As highlighted in this post from davidnaylor.co.uk (Google Caffeine and SEO), inviting feedback from users before going live with an update is not the approach previously taken by the big G. And they have made it easy to submit comments and feedback by providing us with a ‘Dissatisfied? Help us improve’ link at the bottom of the SERPs.

Some are suggesting that the Caffeine update is intended to capture more ‘real time’, relevant search results. But this post from Marketing Pilgrim (Google Caffeine Test Suggests too much Emphasis on Real Time Indexing) has suggested that Caffeine is more about speed, perhaps at the expense of relevancy.

But some detailed testing carried out by Mashable (Google Caffeine: A Detailed Test of the New Google) has suggested that Caffeine presents results faster than current Google and that there is an algo change that makes keyword strings more important.

There are many observations that the Caffeine search results are actually pretty much identical to the current Google results (e.g. Google Caffeine: Newsier by a Nose). But that just confirms what Matt Cutts has said: that Caffeine is essentially about re-architecting or rewriting the way that Google indexes web pages. It’s all about crawling really thoroughly, indexing accurately and serving results really fast. Roll-out is expected to take place gradually across multiple data centres once quality evaluation criteria have been verified.
