Google Caffeine: Google’s New Way of Indexing Web Pages

Google Caffeine is Google’s new way of indexing Web pages. It crawls and indexes the Web more efficiently, so Caffeine-enabled Google can give searchers results that are 50% fresher than those from the previous index. So what makes this work?

How Caffeine Works

The improvement comes from a change in the way the Web is indexed. Previously, the index was updated at regular intervals: initially every four months, then every month from 2000, and until recently every night. The long early intervals arose because the Web was crawled in parts and only then indexed. A complete update could take up to 10 days, according to Matt Cutts of Google.

This system had two drawbacks. First, there was a time gap between crawling new information on the Web and indexing it. Second, because the Web was indexed in layers rather than globally, not all parts of the index were updated together. As a result, searches hit a Web index that was somewhat out of date. Pages added and websites published while someone was searching would only show up in a later index, and so could only be found in later searches.
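The time gap described above can be illustrated with a toy model. This is a minimal sketch, not Google's implementation: the class `BatchIndex`, its methods, and the example URLs are all hypothetical names chosen for illustration. Crawled pages wait in a queue and become searchable only when the scheduled rebuild runs.

```python
from collections import defaultdict


class BatchIndex:
    """Toy inverted index that is rebuilt only at scheduled intervals (hypothetical)."""

    def __init__(self):
        self.pending = {}              # crawled pages waiting for the next rebuild
        self.index = defaultdict(set)  # word -> URLs, as seen by searchers

    def crawl(self, url, text):
        # Crawling only stores the page; it is not searchable yet.
        self.pending[url] = text

    def rebuild(self):
        # Scheduled batch job: index everything crawled since the last run.
        for url, text in self.pending.items():
            for word in text.lower().split():
                self.index[word].add(url)
        self.pending.clear()

    def search(self, word):
        return sorted(self.index.get(word.lower(), set()))


batch = BatchIndex()
batch.crawl("example.com/old", "caffeine launch news")
batch.rebuild()
batch.crawl("example.com/new", "caffeine update")
print(batch.search("caffeine"))  # the new page is missing until the next rebuild
batch.rebuild()
print(batch.search("caffeine"))  # now both pages appear
```

The longer the interval between calls to `rebuild`, the older the picture of the Web that searchers see.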

With Caffeine, the Web is crawled continuously, in small parts but across the whole Web, rather than in specific areas as before. New information is indexed immediately, in a process that runs all the time rather than at regular intervals. Updated pages and new websites therefore enter the Google Web index as they appear. Searchers see the Web as it was only seconds or minutes earlier, which is almost equivalent to searching the live Web. In this way, the Caffeine architecture makes real-time search a reality.
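The continuous approach can be sketched in the same toy style. Again, this is only an illustration under assumed names (`IncrementalIndex`, `crawl`, `search` are hypothetical), not a description of Caffeine's actual architecture. Each page is indexed the moment it is crawled, so there is no separate rebuild step.

```python
from collections import defaultdict


class IncrementalIndex:
    """Toy inverted index updated page by page, as each page is crawled (hypothetical)."""

    def __init__(self):
        self.pages = {}                # url -> set of words currently indexed for it
        self.index = defaultdict(set)  # word -> URLs

    def crawl(self, url, text):
        new_words = set(text.lower().split())
        old_words = self.pages.get(url, set())
        # Drop postings for words the page no longer contains.
        for word in old_words - new_words:
            self.index[word].discard(url)
        # Add postings for words that are new on this page.
        for word in new_words - old_words:
            self.index[word].add(url)
        self.pages[url] = new_words

    def search(self, word):
        return sorted(self.index.get(word.lower(), set()))


live = IncrementalIndex()
live.crawl("example.com/new", "caffeine update")
print(live.search("caffeine"))  # ['example.com/new'], searchable right after the crawl
live.crawl("example.com/new", "espresso update")
print(live.search("caffeine"))  # [], the page changed and the index followed
```

Only the words that changed on a page are touched, which is what lets an index of this kind stay close to the live state of the pages it covers.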