

Web Site Design, Usability, SEO & Marketing Discussion and Support


Google Code Search


Google has just made Code Search public -- pretty slick. Finally something for the programmers :) (but real programmers never have to look things up anyway, ha ha).


However, there are a few things which might influence web search as well:


  • Google searches inside of "zip" files (and tar.gz, jar, etc).
    That means that if you have content which should not be indexed, it is not safe to just place it within zip files. On the other hand, it will index content from your zip files. I wonder how it handles password-protected zip files, or broken zip files (like those used to attack antivirus solutions that unzip files to check them)?
  • Google extracts information from your code, like which license it uses and which language it's written in.
    Hmmm, where does it get that information from? Probably pattern matching for known license texts.
  • Google might be indexing your JavaScript code after all.
    No more hiding stuff in JavaScript on the assumption that Google doesn't index it. I wonder how this is applied to pages that just use external JavaScript files (compared to those that explicitly link to them for a download or include them in a zip file).
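
To make the first two points concrete, here is a minimal Python sketch of how a crawler could look inside an archive and guess license and language by pattern matching. It is only an illustration of the guess above, not how Google actually does it: the fingerprint phrases, the extension table, and the names `guess_license`, `guess_language`, `scan_zip` and `build_demo_zip` are all assumptions made up for this example.

```python
import io
import re
import zipfile

# Hypothetical license fingerprints: a few well-known phrases, not a complete list.
LICENSE_PATTERNS = {
    "GPL": re.compile(r"GNU General Public License", re.IGNORECASE),
    "MIT": re.compile(r"Permission is hereby granted, free of charge", re.IGNORECASE),
    "Apache": re.compile(r"Apache License", re.IGNORECASE),
}

# Hypothetical extension-to-language table.
LANGUAGES = {".js": "JavaScript", ".py": "Python", ".java": "Java", ".c": "C"}


def guess_license(text):
    """Return the first license whose fingerprint appears in text, else None."""
    for name, pattern in LICENSE_PATTERNS.items():
        if pattern.search(text):
            return name
    return None


def guess_language(filename):
    """Guess the language from the file extension, else None."""
    for ext, lang in LANGUAGES.items():
        if filename.lower().endswith(ext):
            return lang
    return None


def scan_zip(data):
    """Yield (filename, language, license) for each file in a zip given as bytes."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            # Decode leniently; real archives may hold binary or oddly encoded files.
            text = archive.read(info).decode("utf-8", errors="replace")
            yield info.filename, guess_language(info.filename), guess_license(text)


def build_demo_zip():
    """Build a small in-memory zip so the sketch runs without touching disk."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr(
            "lib/menu.js",
            "/* Permission is hereby granted, free of charge, ... */\nfunction menu() {}\n",
        )
        archive.writestr("README", "no license text here\n")
    return buffer.getvalue()


if __name__ == "__main__":
    for row in scan_zip(build_demo_zip()):
        print(row)
```

The point is simply that once a zip is downloaded, everything inside it is as readable as any other file on the site. Python's `zipfile` refuses to read an encrypted member without a password, so a crawler built this way would have to skip those entries.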


One fairly problematic issue I see with a tool like this (and I'm sure they've thought of it as well) is that you can now easily search for known issues with open source (or for that matter: any indexed) code. Say there is a known exploit when a script uses certain functions in predictable ways: you can now search for that (using regular expressions), find out where it's used, and exploit the scripts. Sure this was possible before, but you would have needed to download all those scripts and done the search manually on your own system. Now you can search all indexable scripts within a few seconds.
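
The same kind of regular-expression search can be turned around and run against your own code before someone else does. The sketch below is a hedged illustration only: the pattern (a PHP `include`/`require` fed directly from request input), the sample files, and the function name `find_risky_lines` are assumptions chosen for the example, and a real audit would need far more than one regex.

```python
import re

# Hypothetical pattern: PHP include/require fed directly from request input.
RISKY_INCLUDE = re.compile(
    r"\b(?:include|require)(?:_once)?\s*\(?\s*\$_(?:GET|POST|REQUEST)\b"
)


def find_risky_lines(sources):
    """Return (filename, line_number, line) for every line matching the pattern.

    sources maps a file name to its source text.
    """
    hits = []
    for name, text in sorted(sources.items()):
        for number, line in enumerate(text.splitlines(), start=1):
            if RISKY_INCLUDE.search(line):
                hits.append((name, number, line.strip()))
    return hits


# Made-up sample files standing in for the scripts on your own site.
SAMPLE = {
    "page.php": "<?php\ninclude($_GET['page']);\n",
    "safe.php": "<?php\ninclude('header.php');\n",
}

if __name__ == "__main__":
    for hit in find_risky_lines(SAMPLE):
        print(hit)
```

Running something like this over your own published scripts shows roughly what a stranger with a code search engine would see.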


How do you rank for code search? Since your code is usually only linked from very few places within your own site (and hardly ever from the outside directly), I expect the influence of your own site's general value ("PR" if you will) is a strong factor. Within the code it's hard to determine important sections (no headers, no bold, etc.), but perhaps they use term frequency? How do they determine if a piece of code is relevant for your search term or not?


How do you make sure that your "current" code is indexed and perhaps the older versions are removed? How do you keep Google from indexing your "bad examples"?


Fun stuff. Finally something for the geeks among us :huh:. Hey - look, someone used my code snippets with my original comments in them :D! No more easter eggs .. :(




Oops, now it's also all too easy to stumble upon confidential code which was accidentally left online ... I wonder what it takes to get your snippets out of Google Code Search ....




PS what do you do when you see "FINDERS ARE ASKED TO DESTROY THIS DOCUMENT" :huh:?


PPS Would it make sense to post queries that turn up vulnerable code here, keep them to myself, or try to find someone at Google who can block them (and how)?

Edited by softplus


I like the way they have the prominent message


Search public source code.


Which is basically saying that if it is on the www and not blocked by robots.txt, then you obviously intend the information to be 'public' for everybody.
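
If the crawler does honor robots.txt (an assumption; nothing here confirms how Code Search behaves), you can at least check which of your URLs a well-behaved bot is allowed to fetch. A minimal sketch using Python's standard `urllib.robotparser`; the rules, the example.com URLs, and the helper name `is_crawlable` are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that keeps crawlers out of /downloads/ and /old/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /downloads/
Disallow: /old/
"""


def is_crawlable(robots_txt, url, user_agent="*"):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "http://example.com/downloads/code.zip",
        "http://example.com/index.html",
    ):
        print(url, is_crawlable(ROBOTS_TXT, url))
```

Keep in mind that robots.txt is only a request to crawlers: it does nothing to protect the files themselves, so anything truly confidential should not be on a public server at all.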


Oh, great another thing to worry about.


Even as a programmer, I'm not sure how much I would utilise that.


However, like FP_Guy said:


"Oh, great another thing to worry about."

