Google introduces new search indexing system - Caffeine


Google announced on Tuesday the completion of a new web indexing system called Caffeine.

Google claims that Caffeine provides 50 percent fresher results for web searches than its previous index. "It's the largest collection of web content we've offered. Whether it's a news story, a blog or a forum post, you can now find links to relevant content much sooner after it is published than was possible ever before," said Carrie Grimes, a software engineer at Google.

Google built the new indexing system because web content is growing in size and becoming more diverse, spanning video, images, news and real-time updates. The old index had several layers (see image above), some of which were refreshed faster than others. To refresh a layer of the old index, Google had to analyze the entire web, which meant a significant delay between when Google found a page and when that page became available in search results.

With Caffeine, Google analyzes the web in small portions and updates the index continuously. Every second, Caffeine processes hundreds of thousands of pages in parallel. Caffeine takes up nearly 100 million gigabytes of storage in one database and adds information at a rate of hundreds of thousands of gigabytes per day. Google claims you'd need 625,000 of the largest iPods to store that much information, which works out to roughly 160 GB per device.
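The difference between the two approaches can be illustrated with a toy inverted index. This is only a minimal sketch and not Google's implementation: the class names `BatchIndex` and `IncrementalIndex`, the naive whitespace tokenizer, and the example URL are all assumptions made for illustration. The batch version makes a crawled page searchable only after the whole corpus is reprocessed, while the incremental version updates the index as each page arrives.

```python
from collections import defaultdict


def tokenize(text):
    """Split text into lowercase words (a deliberately naive tokenizer)."""
    return text.lower().split()


class BatchIndex:
    """Hypothetical stand-in for an index refreshed by reprocessing everything."""

    def __init__(self):
        self.corpus = {}                  # every page crawled so far
        self.postings = defaultdict(set)  # word -> set of URLs, as of last rebuild

    def crawl(self, url, text):
        # The page is stored but stays invisible to search until rebuild().
        self.corpus[url] = text

    def rebuild(self):
        # Reprocess the entire corpus from scratch.
        postings = defaultdict(set)
        for url, text in self.corpus.items():
            for word in tokenize(text):
                postings[word].add(url)
        self.postings = postings

    def search(self, word):
        return sorted(self.postings.get(word.lower(), set()))


class IncrementalIndex:
    """Hypothetical stand-in for an index updated one page at a time."""

    def __init__(self):
        self.postings = defaultdict(set)  # word -> set of URLs
        self.words_by_url = {}            # URL -> words currently indexed for it

    def crawl(self, url, text):
        # Drop postings from any older version of this page.
        for word in self.words_by_url.get(url, set()):
            self.postings[word].discard(url)
        # Index the new version immediately.
        words = set(tokenize(text))
        for word in words:
            self.postings[word].add(url)
        self.words_by_url[url] = words

    def search(self, word):
        return sorted(self.postings.get(word.lower(), set()))


if __name__ == "__main__":
    batch = BatchIndex()
    incremental = IncrementalIndex()
    for index in (batch, incremental):
        index.crawl("example.com/news", "Caffeine index launched")

    print(batch.search("caffeine"))        # empty until the next full rebuild
    print(incremental.search("caffeine"))  # already contains the page
    batch.rebuild()
    print(batch.search("caffeine"))        # now contains the page too
```

In this toy model the cost of `rebuild()` grows with the size of the whole corpus, whereas `IncrementalIndex.crawl()` only touches the words of the page being updated, which is the intuition behind the fresher results described above.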

Users who search on Google should see improvements in the months to come.

Official Google Blog: Our new search index: Caffeine