Davey Winder
Sunday, 27 July 2008 05:06
File under really big stuff. Google search engineers are reporting that the Google search index has grown from 26 million pages when it first launched back in 1998, to a whopping one trillion unique URLs today...
According to two software engineers from the Google web search
infrastructure team, Jesse Alpert and Nissan Hajaj, the web is
officially big. Very big indeed.
Apparently the search engineers at Google "stopped in awe" recently
when they realised just how big the web has become, at least in terms
of the number of unique pages included within the search engine's
indexing system.
"Our systems that process links on the web to find new content" Alpert
and Hajaj report "hit a milestone: 1,000,000,000,000 unique URLs on the
web at once!"
Google finds these pages by following all the links from an initial set
of pages, arriving at those new pages and then following the links from
there. This spidering process is forever uncovering new content.
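To make the link-following idea concrete, here is a minimal sketch of that kind of discovery as a breadth-first walk. It is only an illustration, not Google's crawler: the web is faked as an in-memory dictionary, and the names `discover_pages`, `toy_web` and the `example` URLs are all made up for this example.

```python
from collections import deque


def discover_pages(link_graph, seed_urls):
    """Return every URL reachable from seed_urls, in discovery order.

    link_graph is a hypothetical stand-in for the web: a dict mapping
    each URL to the list of URLs that page links to.
    """
    seen = set()
    frontier = deque()
    for url in seed_urls:
        if url not in seen:
            seen.add(url)
            frontier.append(url)

    discovered = []
    while frontier:
        url = frontier.popleft()
        discovered.append(url)
        # Follow every link on this page; queue only URLs not seen before.
        for linked_url in link_graph.get(url, []):
            if linked_url not in seen:
                seen.add(linked_url)
                frontier.append(linked_url)
    return discovered


# A five-page toy web (all URLs invented for illustration).
toy_web = {
    "a.example/": ["a.example/news", "b.example/"],
    "a.example/news": ["a.example/"],
    "b.example/": ["c.example/"],
    "c.example/": [],
    "orphan.example/": ["a.example/"],
}

pages = discover_pages(toy_web, ["a.example/"])
print(pages)
```

Note that `orphan.example/` is never found: nothing reachable from the seed links to it, which is why the choice of the initial set of pages matters in this kind of scheme.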
When Google first launched in 1998, it had already found some 26
million pages.
The one-billion-page milestone fell just two years later, in 2000.
"Back then" Alpert and Hajaj recount "we did everything in batches: one
workstation could compute the PageRank graph on 26 million pages in a
couple of hours, and that set of pages would be used as Google's index
for a fixed period of time."
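For readers wondering what a batch computation of this sort looks like, the sketch below runs the textbook power-iteration form of PageRank over a tiny, fixed link graph. It is an assumption-laden illustration rather than Google's code: the damping factor of 0.85, the iteration count, the function name `pagerank` and the four-page graph are all chosen here for the example, and every linked page is assumed to appear as a key in the graph.

```python
def pagerank(link_graph, damping=0.85, iterations=50):
    """Textbook power-iteration PageRank over a fixed snapshot of links.

    link_graph maps each page to the list of pages it links to.
    Assumption: every link target is itself a key in link_graph.
    """
    pages = list(link_graph)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Rank held by pages with no outgoing links is shared evenly.
        dangling = sum(rank[p] for p in pages if not link_graph[p])
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {page: base for page in pages}
        # Each page passes its rank on, split equally across its links.
        for page, targets in link_graph.items():
            for target in targets:
                new_rank[target] += damping * rank[page] / len(targets)
        rank = new_rank
    return rank


# One frozen "batch" of the web: four invented pages and their links.
toy_graph = {
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["a"],
    "d": ["c"],
}

ranks = pagerank(toy_graph)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(page, round(score, 4))
```

The point of the batch approach is visible in the shape of the code: the whole graph is taken as a fixed input, the scores are computed in one run, and nothing changes until the next batch.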
Things have changed, and now Google downloads the web on a continuous
basis, re-processing the entire web-link graph "several times
per day."
This is the computational equivalent of mapping every
intersection of every road in the United States, and doing so several
times every single day. "Except it'd be a map about 50,000 times as big
as the US" the Google engineers reckon "with 50,000 times as many roads
and intersections."
So how big is the web really, and how quickly is it growing? Read on for the Google search engineers' verdict...