Cre8asiteforums Internet Marketing
and Conversion Web Design


GoogleBot signs up


10 replies to this topic

#1 travis

travis

    Sonic Boom Member

  • 1000 Post Club
  • 1532 posts

Posted 13 February 2006 - 12:36 AM

We had GoogleBot sign up as a customer on our largest e-commerce store today.

Has anyone else noticed GoogleBot attempting to reach into the deeper sections of a website by filling in forms ?

It would appear to be a very clever development.

#2 Nadir

Nadir

    Light Speed Member

  • Members
  • 976 posts

Posted 13 February 2006 - 12:39 AM

Yeah, and you know what? He made a $2000 purchase on my site the other day!!! lol

What do you mean by "sign up"? Do you mean it registered and entered some information?

#3 bragadocchio

bragadocchio

    Honored One Who Served Moderator Alumni

  • Hall Of Fame
  • 15634 posts

Posted 13 February 2006 - 01:47 AM

Interesting stuff, Travis.

Were the types of forms being filled in search forms?

Are they trying to get the kind of deep web content from your site that is described in this paper:

Downloading Hidden Web Content

Regardless, that would be a considerable change of behavior from Googlebot, which normally restricts itself to following HREF links and SRC links.

#4 travis

travis

    Sonic Boom Member

  • 1000 Post Club
  • 1532 posts

Posted 13 February 2006 - 04:01 AM

Cheers Bill,

GoogleBot has entered itself as a new customer on this page:

http://hp.empr.com.a...tSignup.asp?m=1

This is our largest client with over 100,000 HP parts on sale and a whiz bang internal search engine.

The code detects the type of user-agent and, if it's a robot, records the information and limits the robot to the areas of the site where cookies are not required.

The size of the site would be attractive to Google, and we have been fighting the search engines to keep them from flogging our SQL Server to death, so that they only trawl a volume of content we consider acceptable in proportion to our customers' usage.

We don't set cookies for search engine robots, for obvious reasons, so they have a different experience from agents that do have cookies.

' Only run the check when the visitor has no ASP session cookie yet.
if inStr(1, uCase(request.serverVariables("HTTP_COOKIE")), "ASPSESSIONID", vbBinaryCompare) = 0 then
  s = lCase(request.serverVariables("HTTP_USER_AGENT"))
  ' True when the user-agent matches none of the known robot signatures.
  if inStr(s, "ask jeeves")=0 AND inStr(s, "inktomi")=0 AND inStr(s, "netcraft")=0 AND inStr(s, "wisenutbot")=0 AND inStr(s, "webwombat")=0 AND inStr(s, "googlebot")=0 AND inStr(s, "slysearch")=0 AND s <> "mozilla/3.0 (compatible)" then
    ' Get the navigation ready with the cookies and the https.
  else
    ' Don't set cookies because it's a search engine. Just let it through the basic parts of the site.
    ' Record anything and everything about it and store it in a separate table.
  end if
end if

This means that the search engine robots won't be able to access the deeper levels of the database, where the majority of parts are. They just run the pages down to Levels 3 and 4 and can go no further. That's as far as the sitemap will let them go.
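For readers who don't use classic ASP, the same gate can be sketched in Python. This is only an illustration of the logic in the snippet above, not the store's actual code; the function name `classify_visitor` and its three return labels are made up for the example.

```python
# Substrings copied from the ASP snippet; matching is case-insensitive.
ROBOT_MARKERS = (
    "ask jeeves", "inktomi", "netcraft", "wisenutbot",
    "webwombat", "googlebot", "slysearch",
)
# The one user-agent the snippet treats as a robot only on an exact match.
EXACT_ROBOT_AGENT = "mozilla/3.0 (compatible)"


def classify_visitor(cookie_header: str, user_agent: str) -> str:
    """Return 'session', 'robot' or 'new_visitor' for one request (hypothetical labels)."""
    # A visitor that already carries an ASP session cookie skips the check.
    if "ASPSESSIONID" in cookie_header.upper():
        return "session"
    agent = user_agent.lower()
    if agent == EXACT_ROBOT_AGENT or any(marker in agent for marker in ROBOT_MARKERS):
        # No cookies: basic pages only, and the visit gets recorded.
        return "robot"
    # Ordinary browser: set cookies and build the full navigation.
    return "new_visitor"
```

Note that this kind of check trusts whatever the client sends in its User-Agent header, so it says nothing about who is really behind the request.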

But in the process, we do make an extensive recording of their activities, and where they go. And this led us to identify the user-agent who filled in the form.
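A rough sketch of that kind of recording, in Python with SQLite standing in for the SQL Server table. The table name `robot_hits`, its columns, and the `/signup` path in the usage note are all assumptions for illustration; the thread does not show the real schema.

```python
import sqlite3
from datetime import datetime, timezone


def open_log(path: str = ":memory:") -> sqlite3.Connection:
    """Open the log database and create the (hypothetical) robot_hits table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS robot_hits ("
        "hit_time TEXT, ip TEXT, user_agent TEXT, url TEXT)"
    )
    return conn


def record_hit(conn: sqlite3.Connection, ip: str, user_agent: str, url: str) -> None:
    """Store one robot request with a UTC timestamp."""
    conn.execute(
        "INSERT INTO robot_hits VALUES (?, ?, ?, ?)",
        (datetime.now(timezone.utc).isoformat(), ip, user_agent, url),
    )
    conn.commit()


def hits_for(conn: sqlite3.Connection, url_fragment: str):
    """Return (ip, user_agent, url) rows whose URL contains url_fragment."""
    return conn.execute(
        "SELECT ip, user_agent, url FROM robot_hits WHERE url LIKE ?",
        (f"%{url_fragment}%",),
    ).fetchall()
```

With a log like this, calling `hits_for(conn, "/signup")` after the fact lists every robot-flagged agent that touched the signup page.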

As far as the form is concerned, it looks like JavaScript was disabled in the process, because not all of the required fields were filled in; just a selection of them, with really generic information, not something a human would enter.
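If the required fields are normally enforced by JavaScript in the browser, a submission that arrives without them is a hint that the client never ran that script. A minimal server-side sketch in Python follows; the field names are hypothetical, since the real form's fields are not listed in this thread.

```python
from typing import Dict, List, Sequence

# Hypothetical required fields; the real signup form is not shown here.
REQUIRED_FIELDS = ("first_name", "last_name", "email", "phone")


def missing_required(form: Dict[str, str],
                     required: Sequence[str] = REQUIRED_FIELDS) -> List[str]:
    """Return the required field names that are absent or blank in the submission."""
    return [name for name in required if not form.get(name, "").strip()]


def bypassed_client_validation(form: Dict[str, str]) -> bool:
    """True when the submission got past the browser-side checks without all required fields."""
    return bool(missing_required(form))
```

A missing field only shows that client-side validation did not run; it cannot tell a robot from a person browsing with scripting turned off.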

It could be a spoof, so we will wait and see if that account is accessed, or if we get any more forms filled in by these robots.

It's a bit like watching someone's bank records to determine where they are.

It all happens after the fact.

Edited by travis, 13 February 2006 - 06:47 PM.


#5 eKstreme

eKstreme

    Hall of Fame

  • 1000 Post Club
  • 3399 posts

Posted 13 February 2006 - 04:48 AM

Check that the IP address of the person who signed up is really Google-owned. Some people love to cruise around the web looking like Googlebot. Some people even blog about their finds.

#6 travis

travis

    Sonic Boom Member

  • 1000 Post Club
  • 1532 posts

Posted 13 February 2006 - 05:50 AM

What is the IP Address range of Google robots ?

#7 eKstreme

eKstreme

    Hall of Fame

  • 1000 Post Club
  • 3399 posts

Posted 13 February 2006 - 06:14 AM

What is the IP Address range of Google robots ?

Don't know, but if you type the address into a WHOIS lookup, you can read who it is registered to. If it is really Google, the output for the IP address would look like this:

OrgName: Google Inc.
OrgID: GOGL
Address: 1600 Amphitheatre Parkway
City: Mountain View
StateProv: CA
PostalCode: 94043
Country: US


Please tell us the result!
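One way to automate a check like this is a reverse DNS lookup followed by a forward lookup of the returned hostname, which is a different test from reading the registration record shown above. The Python sketch below assumes that genuine Google crawler hostnames end in `googlebot.com` or `google.com`; the function names are made up for the example, and the resolvers are passed in as parameters so the logic can be tried without network access.

```python
import socket
from typing import Callable, List

# Assumed hostname suffixes for genuine Google crawlers.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def reverse_lookup(ip: str) -> str:
    """PTR lookup: IP address -> hostname. Raises OSError on failure."""
    return socket.gethostbyaddr(ip)[0]


def forward_lookup(host: str) -> List[str]:
    """A-record lookup: hostname -> list of IPv4 addresses."""
    return socket.gethostbyname_ex(host)[2]


def is_google_crawler(ip: str,
                      reverse: Callable[[str], str] = reverse_lookup,
                      forward: Callable[[str], List[str]] = forward_lookup) -> bool:
    """Accept the IP only if its hostname is Google's AND resolves back to the same IP."""
    try:
        host = reverse(ip).rstrip(".").lower()
        if not host.endswith(GOOGLE_SUFFIXES):
            return False
        # The forward step stops someone faking the PTR record for their own IP.
        return ip in forward(host)
    except OSError:
        # No PTR record, or the hostname does not resolve: treat as unverified.
        return False
```

A user-agent string claiming to be Googlebot proves nothing on its own; a check along these lines ties the claim to the address the request actually came from.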

#8 JohnMu

JohnMu

    Honored One Who Served Moderator Alumni

  • Hall Of Fame
  • 3518 posts

Posted 13 February 2006 - 06:53 AM

Just a short note -- there is no single IP range for Google's crawlers; I've seen them jump in from all over. I also assume they might try geotargeting, sending crawlers that have the lowest cost in terms of routing (closer geographically or nearer to the next big pipe).

I agree with eKstreme though, there are many (some, few) visitors who surf as the Googlebot just to see what happens :-)

John

Edited by softplus, 13 February 2006 - 06:53 AM.


#9 travis

travis

    Sonic Boom Member

  • 1000 Post Club
  • 1532 posts

Posted 13 February 2006 - 11:55 AM

OK,

I did a reverse lookup.

It's Google, boys.

OrgName: Google Inc.
OrgID: GOGL
Address: 1600 Amphitheatre Parkway
City: Mountain View
StateProv: CA
PostalCode: 94043
Country: US


Google has signed up for our e-commerce store.

#10 eKstreme

eKstreme

    Hall of Fame

  • 1000 Post Club
  • 3399 posts

Posted 13 February 2006 - 12:44 PM

Very interesting. Can you give us more details about the form?

1. Was it get or post?
2. Did it have to fill in text fields?
3. There wasn't a captcha, was there?
etc etc.

One thing though: can we tell the difference between a Google employee spoofing him/herself as Googlebot and Googlebot proper?

And what exactly was the user agent? I bet it's the new Mozilla/5.0 one...

Edited by eKstreme, 13 February 2006 - 12:45 PM.


#11 travis

travis

    Sonic Boom Member

  • 1000 Post Club
  • 1532 posts

Posted 13 February 2006 - 06:47 PM

Ekstreme,

This is the form,

http://hp.empr.com.a...tSignup.asp?m=1

The difference between an employee and a robot is sorted out using that code. But more importantly, the generic nature of the data submission was the big giveaway.

It also appears to have had JavaScript disabled, as not all of the required fields were filled in.

We have never seen anything like it.

If anyone sees that in their e-commerce signup pages, let me know.

If the Googlebot actually logs in, it will be a first for any of our websites. The implications are quite substantial.

Will Google actually log in?

If it does log in as a customer and finds different content, which in this case it would, what will it report in the SERPs for that page? Or would it report a different page?

Are there sites where people offer a free login signup, but don't want that content trawled or displayed by Google?

Edited by travis, 14 February 2006 - 04:54 AM.



