Google index grows to one trillion pages


File under really big stuff. Google search engineers report that the web as Google sees it has grown from the 26 million pages in its first index in 1998 to a whopping one trillion unique URLs today.

According to two software engineers from the Google web search infrastructure team, Jesse Alpert and Nissan Hajaj, the web is officially big. Very big indeed.

Apparently the search engineers at Google "stopped in awe" recently when they realised just how big the web has become, at least in terms of the number of unique URLs that their link-processing systems have found.

"Our systems that process links on the web to find new content," Alpert and Hajaj report, "hit a milestone: 1,000,000,000,000 unique URLs on the web at once!"

Google finds these pages by starting from an initial set of pages, following every link on them to reach new pages, and then following the links from there. This spidering process is forever uncovering new content. Even the first Google index in 1998 already held some 26 million pages.
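The link-following process described above can be sketched in a few lines of Python. This is only an illustration of the general idea (a breadth-first walk from a seed set), not Google's actual crawler. The `crawl` function, the `get_links` callback and the `TOY_WEB` dictionary are hypothetical names invented for the example, and the "web" here is a tiny in-memory graph rather than real pages.

```python
from collections import deque


def crawl(seed_pages, get_links):
    """Discover every URL reachable from seed_pages by following links.

    get_links is a hypothetical callback returning the outgoing links
    of a page; a real crawler would fetch and parse the page instead.
    """
    seen = set(seed_pages)      # unique URLs found so far
    queue = deque(seed_pages)   # pages whose links still need following
    while queue:
        page = queue.popleft()
        for link in get_links(page):
            if link not in seen:    # only new URLs are queued
                seen.add(link)
                queue.append(link)
    return seen


# Toy link graph standing in for the web (an assumption for illustration).
TOY_WEB = {
    "a.example/": ["a.example/about", "b.example/"],
    "a.example/about": ["a.example/"],
    "b.example/": ["c.example/"],
    "c.example/": [],
    "d.example/": ["a.example/"],  # nothing links here, so it is never found
}

found = crawl(["a.example/"], lambda page: TOY_WEB.get(page, []))
print(sorted(found))
```

Note that `d.example/` is never discovered because no reachable page links to it, which is why a crawler's count of unique URLs depends on where it starts.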

The magic one-billion-page milestone fell just two years later, in 2000.

"Back then," Alpert and Hajaj recount, "we did everything in batches: one workstation could compute the PageRank graph on 26 million pages in a couple of hours, and that set of pages would be used as Google's index for a fixed period of time."
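For readers wondering what a batch PageRank computation looks like, here is a minimal sketch of the textbook power-iteration method. It is not the engineers' code: the `pagerank` function, the four-page `TOY_LINKS` graph, the damping factor of 0.85 (a commonly quoted value) and the fixed 50 iterations are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook PageRank by power iteration over a small link graph.

    links maps each page to the list of pages it links to; every link
    target must also appear as a key. Returns a dict of page -> rank.
    """
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start from a uniform distribution
    for _ in range(iterations):
        # Rank held by pages with no outgoing links is spread evenly.
        dangling = sum(rank[p] for p in pages if not links[p])
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {p: base for p in pages}
        for p in pages:
            for target in links[p]:
                # Each page shares its rank equally among its links.
                new_rank[target] += damping * rank[p] / len(links[p])
        rank = new_rank
    return rank


# Toy graph (an assumption for illustration): three pages link to "c".
TOY_LINKS = {
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["a"],
    "d": ["c"],
}

ranks = pagerank(TOY_LINKS)
for page, score in sorted(ranks.items(), key=lambda kv: -kv[1]):
    print(page, round(score, 3))
```

Every pass touches every link once, so the cost of a run grows with the size of the link graph, which gives a feel for why a fixed batch on one workstation was workable at 26 million pages.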

Things have changed. Google now downloads the web continuously, re-processing the entire web-link graph "several times per day."

This is the computational equivalent of mapping every intersection of every road in the United States, several times every single day. "Except it'd be a map about 50,000 times as big as the US," the Google engineers reckon, "with 50,000 times as many roads and intersections."

So how big is the web really, and how quickly is it growing? Read on for the Google search engineers' verdict...
