YOUR IT - Technology for you

Google index grows to one trillion pages

The current index would be even bigger than that astonishing 1 trillion number if Google did not actively filter out multiple URLs with exactly the same page content. "Even after removing those exact duplicates, we saw a trillion unique URLs," Alpert and Hajaj say, adding that "the number of individual web pages out there is growing by several billion pages per day."
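To make the idea of exact-duplicate filtering concrete, here is a minimal Python sketch. It is not Google's method, which the article does not describe; it simply assumes that two URLs count as exact duplicates when their page bodies hash to the same value. The function name `unique_urls` and the sample URLs are hypothetical.

```python
import hashlib


def unique_urls(pages):
    """Keep the first URL seen for each distinct page body.

    `pages` is an iterable of (url, body) pairs. The name and the input
    shape are assumptions made for illustration only.
    """
    seen = set()   # digests of page bodies already encountered
    kept = []      # URLs whose content was new when first seen
    for url, body in pages:
        digest = hashlib.sha256(body.encode("utf-8")).hexdigest()
        if digest in seen:
            continue  # same content as an earlier URL: skip it
        seen.add(digest)
        kept.append(url)
    return kept


sample_pages = [
    ("http://example.com/a", "hello world"),
    ("http://example.com/a?ref=1", "hello world"),   # same body, different URL
    ("http://example.com/b", "something else"),
]
print(unique_urls(sample_pages))
```

Only the first and third URLs survive here, because the second serves exactly the same content as the first. Pages that are merely similar would pass straight through a filter like this one.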

The truth is that nobody knows exactly how big the web is or how many absolutely unique pages it contains. It can only ever be a best-guess metric, because even Google has to admit it simply does not have the resources or time to look at them all.

"Strictly speaking," Google says, "the number of pages out there is infinite." By way of example it offers the case of web calendars, which often incorporate a link to the next day's activities. If Google followed these, it argues, it would be stuck in a never-ending search loop. "We're not doing that, obviously, since there would be little benefit to you."
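The calendar problem is easy to reproduce in a few lines of Python. The sketch below is an illustration only: `calendar_links` is a hypothetical stand-in for fetching a calendar page, and the depth cap is just one simple way to avoid the trap, not a description of what Google's crawler actually does.

```python
from collections import deque
from datetime import date, timedelta

BASE = "http://example.com/calendar/"  # hypothetical site


def calendar_links(url):
    """Pretend to fetch a calendar page: each page links to the next day,
    so the chain of links never ends."""
    day = date.fromisoformat(url.rsplit("/", 1)[1])
    return [BASE + (day + timedelta(days=1)).isoformat()]


def crawl(start_url, get_links, max_depth):
    """Breadth-first crawl that stops following links beyond max_depth."""
    visited = []
    seen = {start_url}
    queue = deque([(start_url, 0)])
    while queue:
        url, depth = queue.popleft()
        visited.append(url)
        if depth == max_depth:
            continue  # do not follow links from pages at the depth limit
        for link in get_links(url):
            if link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return visited


crawled = crawl(BASE + "2008-01-01", calendar_links, max_depth=3)
print(len(crawled))   # the start page plus three "next day" pages
print(crawled[-1])
```

Without the `max_depth` check the queue would never empty, which is the forever loop Google describes.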

In fact, Google does not index every one of those trillion pages either, because many of them are reported to be very similar to each other, or to contain auto-generated content that is not of much interest to the general web-searching public.

Google does, however, claim to have the most comprehensive index of any search engine, and we have no inclination to argue with it there. But imagine just how much better it could be if it were to index the so-called Deep Web.

Back in 2000, when the Google index hit a billion pages, remember, a University of Michigan study claimed that the Deep Web contained something in the region of 550 billion individual documents.

Do the math on that to take account of the new 1 trillion page Google index figure, a thousandfold increase on that billion, and that's what we call really big...
