What, overworked? Or just underpowered and underbandwidthed. It's good to run things like this on a local server, since you can then play with lots of variables, although that does sometimes leave the rest of the system begging for resources. And one day I'll get a really good web connection for the company ....
The main problem with a tool like this is that Google turns on a block when it sees lots of queries from the same server. The tool more or less goes against the Google terms of service, which is not really such a good idea, but it is the only way to get this data. Personally, I feel that I'm not doing automated queries, since I do it in real time with the user triggering them, but that argument is probably moot since Google only sees a flood of queries from the server. I could trick them a bit more (perhaps spread the requests over several IPs or go through proxy servers) and possibly get a few more queries out of them before they "recognize" it again, but then again, if they need or want to block mass queries, I'll just take what I can get until then. Perhaps I should mention that on the pages in question, so that people understand what it means when it returns "?".
What surprised me was that the fluctuations in supplemental URLs across the datacenters were much higher than the fluctuations in indexed URLs. To me that sounds like a sign that they are playing with the supplemental index, perhaps with vastly different settings. Also, the sometimes large difference between the actual count and the "about" count (my "bad data push correction") seems much larger with supplementals than with indexed pages (or links). That would make sense, however, since generating better approximations for the supplemental index is probably not something they would give a high priority.
The "Flux-Factor" is a rough and dirty value I calculate to determine how large the spread in the numbers is. A high "Flux-Factor" could signify that things are in movement and that no stable equilibrium has been reached. It could also signify that Google is using or testing different settings (as I think is the case with the supplementals).
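To give a feel for what such a spread value might look like, here is a minimal sketch. The exact formula behind the "Flux-Factor" isn't spelled out here, so this assumes one plausible definition: the range of the counts across datacenters divided by their mean. The function name `flux_factor`, the sample numbers, and the use of `None` to stand in for a "?" result are all made up for illustration.

```python
from statistics import mean

def flux_factor(counts):
    """Relative spread of counts across datacenters.

    Assumed definition (not necessarily the tool's own formula):
    (max - min) / mean. Entries of None stand for "?" results,
    i.e. queries that returned no number, and are skipped.
    """
    values = [c for c in counts if c is not None]
    if len(values) < 2:
        return None  # not enough data to talk about a spread
    avg = mean(values)
    if avg == 0:
        return 0.0  # all counts are zero, so there is no spread
    return (max(values) - min(values)) / avg

# Hypothetical counts from several datacenters
indexed = [1000, 1010, 990, None, 1000]
supplemental = [200, 50, 400, 150]

print(flux_factor(indexed))       # small value: numbers agree closely
print(flux_factor(supplemental))  # large value: numbers are all over the place
```

Dividing by the mean keeps the value comparable between a site with a few hundred URLs and one with millions; a standard-deviation-based measure would work just as well if outliers from a single datacenter should count for less.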
A fun toy with interesting numbers; you just have to figure out what they mean.