Cre8asiteforums

Web Site Design, Usability, SEO & Marketing Discussion and Support

yannis

Billions of pages gone from Google?


If you do a search in Google for * * it will usually return the total number of pages in its index, which is normally 25,270,000,000. On some of the Data Centers, like this one, it only shows 20,960,000,000. Could this explain the disappearance of a lot of pages from a number of websites?

 

I will be interested to hear what Google shows in other areas of the world.
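For anyone who wants to script this check rather than eyeball the results page, here is a minimal sketch of how the estimated count for the * * query might be scraped. Everything in it is an assumption: the URL pattern, the User-Agent header, and especially the "of about N results" wording it looks for come from the old-style results page and are unlikely to match what Google serves today.

# Hypothetical sketch: fetch a Google results page for the "* *" query and pull
# out the estimated total result count. The URL, headers, and the
# "of about <b>N</b>" wording are assumptions about the 2006-era results page.
import re
import urllib.parse
import urllib.request

def estimated_results(query: str, host: str = "www.google.com") -> int | None:
    url = f"http://{host}/search?q={urllib.parse.quote(query)}"
    req = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    html = urllib.request.urlopen(req, timeout=10).read().decode("utf-8", "ignore")
    # Assumed markup: the old results page printed "of about <b>25,270,000,000</b>".
    match = re.search(r"of about <b>([\d,]+)</b>", html)
    return int(match.group(1).replace(",", "")) if match else None

if __name__ == "__main__":
    print(estimated_results("* *"))  # prints the estimate, or None if the markup differs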


Supposedly, this is the effect of the Big Daddy update. Or they are moving data between datacenters.

 

Here is what Matt Cutts had to say about the update.

 

In short, Big Daddy seems to be refreshing the index and removing sites that don't have natural linkage.

I don't think there's anything for white hat SEOs to worry about, anyway.

Creating quality content that attracts relevant incoming links and traffic seems to be the only way to stay afloat here.

 

 

Afterthought: just checked myself. I see 25.257 billion pages in the index. So it's just that datacenter, I suspect. Or my datacenter hasn't been updated. Either of the two :)

Edited by A.N.Onym


Hmmm, I must be doing something wrong. I tried searching for both * * and *.* and still didn't get any results.


Those numbers are so approximate that I wouldn't even bet they're within 20-30% of the real figure :). With Google's datacenter setup, these kinds of differences could easily happen - and BigDaddy seems to have a strong effect on many sites. They seem to have fiddled with the parameters a bit and managed to pull some legitimate sites back in; I wouldn't be surprised if they were turning other spam-related parameters back up on some datacenters. Constantly tweaking :)

 

John


Yeah, I have noticed that Google's result counts are somewhat exaggerated (by about 20-30%, too).

 

Btw, it's * * or -site:www.google.com to see the numbers displaying the index size.


The * is used more or less like a wildcard. If you do a search for 'Search * Optimization' it will return results such as 'search engine optimization', 'search engine positioning', etc. Try this when searching only for blogs and it will return a Server Error! (Actually I enjoy seeing a Google Error, so please do not report it!) This immediately suggests to me that Google treats blogs differently from websites and that it uses a different algorithm for both ranking and positioning of blogs!


I personally would be iffy about treating that as evidence of the number of indexed pages. It could just as easily be the number of references, which may include things like 301 redirects (they still need an entry or a reference), 404 errors (they need to track URLs NOT to check anymore) and other things they may have entries for (like banned pages).


Projectphp, thanks for the reply. What I was more interested in, and the reason I started this thread, was the large discrepancy in the number of pages shown in the index by different Data Centers, not the actual number. I think Google propagates its index across the 50 or more Data Centers over a period of months, not days. I could be wrong, but a 25% discrepancy between Data Centers is large. Another explanation is technical problems at certain data centers. The 25 billion or so is actually an estimate of pages in the index, and searching with * * just helped to get this estimate.

The question remains: why are there such large discrepancies?


I don't know what the number means, so why it changes is going to be even harder to guess at, wouldn't you say?


Sure! Google is QUITE THE COMEDIAN! Off to have a nice breakfast and watch the rugby!

Guest joedolson

Using SEO Chat's multiple datacenter tool, these 3 datacenters produce the lower number:

 

64.233.167.99 - 20,960,000,000

64.233.167.147 - 20,960,000,000

64.233.167.104 - 20,960,000,000

 

As for what this means, who knows? It could be as simple as them testing an algorithm that doesn't count all of the references projectphp mentions. The SERPs don't appear to be very different - at least not in the top 10 results - so I wouldn't worry about it too much.
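For the curious, a multi-datacenter comparison like the SEO Chat tool might boil down to something like the sketch below: run the same estimated-count scrape against each datacenter IP and print the numbers side by side. The IPs are the ones quoted above and no longer answer this way, and the parsing relies on the same assumed old-style results-page markup as the earlier sketch.

# Hypothetical sketch: query individual Google datacenter IPs for the same
# index-size query and compare the reported estimates. The IPs are the three
# quoted in this thread (long dead), and the "of about <b>N</b>" regex is an
# assumption about the 2006-era results page.
import re
import urllib.parse
import urllib.request

DATACENTERS = ["64.233.167.99", "64.233.167.147", "64.233.167.104"]
QUERY = "-site:www.google.com"  # one of the index-size queries suggested in this thread

def estimate_from(ip: str, query: str) -> int | None:
    url = f"http://{ip}/search?q={urllib.parse.quote(query)}"
    req = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    html = urllib.request.urlopen(req, timeout=10).read().decode("utf-8", "ignore")
    match = re.search(r"of about <b>([\d,]+)</b>", html)  # assumed markup
    return int(match.group(1).replace(",", "")) if match else None

if __name__ == "__main__":
    for ip in DATACENTERS:
        try:
            print(ip, estimate_from(ip, QUERY))
        except OSError as err:
            print(ip, "unreachable:", err)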
