Google index grows to one trillion pages
by Davey Winder   
Sunday, 27 July 2008
The current index would be even bigger than that astonishing 1 trillion number if Google did not actively filter out multiple URLs that serve exactly the same page content. "Even after removing those exact duplicates, we saw a trillion unique URLs," Alpert and Hajaj say, adding that "the number of individual web pages out there is growing by several billion pages per day."
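As a rough illustration of what filtering out exact duplicates can mean, the sketch below hashes each page body and keeps only the first URL seen for each distinct hash. This is not Google's method, which the article does not describe; the function name `dedupe_by_content` and the example URLs are hypothetical.

```python
import hashlib


def dedupe_by_content(pages):
    """Keep the first URL seen for each distinct page body.

    `pages` is an iterable of (url, body) pairs. Two URLs count as
    duplicates only if their bodies are byte-for-byte identical.
    """
    first_url_for_digest = {}
    for url, body in pages:
        digest = hashlib.sha256(body.encode("utf-8")).hexdigest()
        # setdefault stores the URL only if this digest is new.
        first_url_for_digest.setdefault(digest, url)
    return list(first_url_for_digest.values())


# Hypothetical crawl results: two URLs serve the same content.
pages = [
    ("http://example.com/a", "hello world"),
    ("http://example.com/a?ref=1", "hello world"),
    ("http://example.com/b", "something else"),
]
unique_urls = dedupe_by_content(pages)
print(unique_urls)
```

An exact hash like this only catches identical copies; pages that are merely very similar would need a different, fuzzier comparison.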

The truth is that nobody knows exactly how big the web is or how many absolutely unique pages it contains. It can only ever be a best guess metric because even Google has to admit it simply does not have the resources or time to look at them all.

"Strictly speaking," Google says, "the number of pages out there is infinite." By way of example it offers the case of web calendars, which often incorporate a link to the next day's activities. If Google followed these, it argues, it would be stuck in an endless crawl loop. "We're not doing that, obviously, since there would be little benefit to you."
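The calendar example can be sketched in a few lines. The toy crawler below walks a made-up site where every day page links to the following day, and stops only because of a depth limit. The depth cap is one simple, assumed way to avoid such a loop, not a description of how Google handles it; `next_day_links` and `crawl` are hypothetical names.

```python
from collections import deque


def next_day_links(day):
    """Hypothetical calendar site: each day page links to the next day."""
    return [day + 1]


def crawl(start, max_depth):
    """Breadth-first crawl that does not follow links beyond max_depth."""
    visited = []
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        page, depth = queue.popleft()
        visited.append(page)
        if depth == max_depth:
            # Without this cutoff the loop below would never run dry.
            continue
        for link in next_day_links(page):
            if link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return visited


visited_days = crawl(start=0, max_depth=5)
print(visited_days)
```

Tracking already-seen pages is not enough here, because every "next day" page is new; some cutoff is needed for the crawl to end.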

In fact, Google has not indexed every one of those trillion pages either, because many of them are reported to be very similar to each other, or to contain auto-generated content that is not of much interest to the general web-searching public.

Google does claim to have the most comprehensive index of any search engine, however, and we have no inclination to argue with them there. But imagine just how much better it could be if it were to index the so-called Deep Web.

Back in the year 2000, when, remember, the Google index hit a billion pages, a University of Michigan study claimed that the Deep Web contained something in the region of 550 billion individual documents.

Do the math on that to take account of the new 1 trillion pages Google index figure, and that's what we call really big...