Technology news and Jobs
Information Technology News
Google index grows to one trillion pages
by Davey Winder
Sunday, 27 July 2008
The current index would be even bigger than that astonishing 1 trillion
figure if Google did not actively filter out multiple URLs with exactly
the same page content. "Even after removing those exact duplicates, we
saw a trillion unique URLs," Alpert and Hajaj say, adding that "the
number of individual web pages out there is growing by several billion
pages per day."
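To make the idea of filtering exact duplicates concrete, here is a minimal sketch in Python. It is not Google's method, which the article does not describe; it simply assumes that two URLs count as duplicates when their page content hashes to the same value. The function name `unique_urls` and the sample URLs are hypothetical.

```python
import hashlib


def unique_urls(pages):
    """Return the first URL seen for each distinct page body.

    `pages` is an iterable of (url, content) pairs. This is an
    illustrative assumption, not a description of Google's pipeline.
    """
    seen_digests = set()
    kept = []
    for url, content in pages:
        # Identical content always produces the same digest.
        digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
        if digest not in seen_digests:
            seen_digests.add(digest)
            kept.append(url)
    return kept


# Hypothetical sample data: the first two URLs serve the same content.
pages = [
    ("http://example.com/a", "hello world"),
    ("http://example.com/a?ref=1", "hello world"),
    ("http://example.com/b", "something else"),
]
print(unique_urls(pages))
```

A sketch like this only catches byte-for-byte duplicates; pages that are merely very similar would need a different technique.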
"Strictly speaking," Google says, "the number of pages out there is infinite." By way of example, it offers the case of web calendars, which often incorporate a link to the 'next day'. If Google followed these links, it argues, it would be stuck in an endless crawl loop. "We're not doing that, obviously, since there would be little benefit to you."

In fact, Google did not index every one of those trillion pages either, because many of them are reported to be very similar to each other, or to contain auto-generated content that is not of much interest to the general web-searching public.

Google does claim to have the most comprehensive index of any search engine, however, and we have no inclination to argue with it there. But imagine just how much better it could be if it were to index the so-called Deep Web. Back in the year 2000, when the Google index hit a billion pages, remember, a University of Michigan study was claiming that the Deep Web contained something in the region of 550 billion individual documents. Do the math on that to take account of the new 1 trillion page Google index figure, and that's what we call really big...
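For readers who want to "do the math", the following is one way to read that back-of-the-envelope calculation. It assumes, purely for illustration, that the ratio of Deep Web documents to indexed pages has stayed the same since 2000; the article does not claim this, and the variable names are hypothetical.

```python
# Figures quoted in the article.
index_2000 = 1_000_000_000          # Google index size in 2000 (1 billion)
deep_web_2000 = 550_000_000_000     # Deep Web estimate in 2000 (550 billion)
index_2008 = 1_000_000_000_000      # Google's 2008 figure (1 trillion)

# Assumption (not from the article): the Deep Web to index ratio is constant.
ratio = deep_web_2000 // index_2000
deep_web_2008_estimate = ratio * index_2008

print(f"Deep Web documents per indexed page in 2000: {ratio}")
print(f"Extrapolated Deep Web size in 2008: {deep_web_2008_estimate:,}")
```

Under that assumption the extrapolation lands at 550 trillion documents, which should be treated as a rough illustration rather than a measurement.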