Cre8asiteforums Internet Marketing
and Conversion Web Design


Google Says Bad Html Code Is Ok

11 replies to this topic

#1 JohnMu


    Honored One Who Served Moderator Alumni

  • Hall Of Fame
  • 3519 posts

Posted 22 December 2006 - 01:16 PM

An interesting comment by Adam Lasnik in the Google Groups thread Is W3C validation really essential for Google to list my site?

Our Googlebot is amazingly persistent and resourceful and is given antacids each day before he crawls.

Seriously... I don't want to discourage anyone from validating their site; however, unless it's REALLY broken, we're likely going to be able to spider it pretty decently.

It's more important -- from a Google-friendly site perspective -- that your site adheres to our guidelines and is broadly accessible (serverwise, browserwise, platformwise, etc.)

Being more specific: I'm betting that in the vast majority of cases in which folks have indexing or ranking concerns, the core issue is NOT that their site doesn't perfectly validate.

So "really broken" is bad, "accessible" is good, and those who don't know where to look and start to get a page to validate in order to get it indexed are looking in the wrong place :-). But how much is "really broken" and how accessible does it have to be? Are we starting the discussion all over again? :) :)


#2 Wit


    Sonic Boom Member

  • 1000 Post Club
  • 1599 posts

Posted 22 December 2006 - 01:23 PM

No. If Google can index one of your site's pages alright, it can probably index the next as well. Especially if you're using a template.

I always take Google's remarks with a grain of salt, but in this case I'll gladly take it for gospel (because of what I've been saying earlier, on a couple of occasions LOL).

I always like to tell people not to fret and to look at the "evidence". That said, nowadays I always TRY to produce valid xhtml code.

#3 Pittbug


    Ready To Fly Member

  • Members
  • 46 posts

Posted 22 December 2006 - 01:30 PM

The "site must validate" argument (for SEO purposes) has really been blown way out of proportion. Yes, the W3C tool can show you errors, but Googlebot is not going to care whether you use align=center or align=middle. It just needs to be able to find your spiderable text. That means not putting it in the wrong place, like between a <tr> and a <td>, for example.
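To illustrate the point about text between a <tr> and a <td>: a forgiving parser will usually still surface misplaced text, but its structural context changes. This is a hedged sketch using Python's standard-library HTMLParser, not a claim about how Googlebot actually parses pages.

```python
# Sketch: a lenient parser still finds text placed illegally between
# <tr> and <td>, but attributes it to the wrong enclosing element.
from html.parser import HTMLParser

class TextCollector(HTMLParser):
    """Collect text nodes along with the tag that encloses them."""
    def __init__(self):
        super().__init__()
        self.stack = []   # currently open tags
        self.found = []   # (text, enclosing tag) pairs

    def handle_starttag(self, tag, attrs):
        self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag in self.stack:
            # lenient recovery: pop back to the matching open tag
            while self.stack and self.stack.pop() != tag:
                pass

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.found.append((text, self.stack[-1] if self.stack else None))

valid   = "<table><tr><td>keyword here</td></tr></table>"
invalid = "<table><tr>keyword here<td></td></tr></table>"

for markup in (valid, invalid):
    p = TextCollector()
    p.feed(markup)
    print(p.found)
# valid   -> [('keyword here', 'td')]
# invalid -> [('keyword here', 'tr')]
```

The text survives either way, which matches the "Googlebot just needs to find your spiderable text" point; but in the invalid case it ends up attached to the wrong element, which is the kind of ambiguity you hand over to the spider.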

#4 JohnMu


    Honored One Who Served Moderator Alumni

  • Hall Of Fame
  • 3519 posts

Posted 22 December 2006 - 01:38 PM

I think the main problem with the whole validation stuff is that for a newbie, this is almost the *only* thing they can check with certainty for their site. It gives you a number from 0 to close to infinity (been there, done that). If you reduce it towards 0 then you're on the right path.

Everything else is "soft" in terms of SEO. You can't even count your inbound links (at least not on Google :)). There are no other absolute numbers (other than traffic, more or less, and sales, of course) which a newbie can track. To someone getting started, validation is just about the only thing you can check without already knowing and understanding much more of the whole process that it takes to get search engines to crawl, index and rank your site.
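The appeal of "a number you can drive toward zero" can be sketched with a toy checker. This is only a stand-in for the W3C validator, which checks far more; it counts one easy-to-measure class of errors, non-void tags that are never closed.

```python
# Toy stand-in for a validator: count unclosed non-void tags, giving a
# beginner a single number to reduce toward zero. The real W3C
# validator checks much more than this.
from html.parser import HTMLParser

VOID = {"br", "img", "hr", "meta", "link", "input"}  # no close tag needed

class UnclosedTagCounter(HTMLParser):
    def __init__(self):
        super().__init__()
        self.open_tags = []

    def handle_starttag(self, tag, attrs):
        if tag not in VOID:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag in self.open_tags:
            self.open_tags.remove(tag)

def count_errors(markup):
    p = UnclosedTagCounter()
    p.feed(markup)
    return len(p.open_tags)   # tags opened but never closed

print(count_errors("<html><body><p>fine</p></body></html>"))      # 0
print(count_errors("<html><body><p>oops<div></body></html>"))     # 2
```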


#5 BillSlawski


    Honored One Who Served Moderator Alumni

  • Hall Of Fame
  • 15667 posts

Posted 22 December 2006 - 02:25 PM

I look at validation in a couple of ways.

The first is to help me catch any stupid errors that I may have made. It's like having a free proofreader. :)

The second is to help mitigate risk, in case something is really wrong, and it might cause problems with something getting indexed.

And you know, spidering a site "pretty decently" is fine and good, but I don't think that there's anything wrong with wanting to try to do the best you can on a site. There's nothing wrong with aiming towards accessible pages, or pages that look good in different browsers.

I spent a couple of months in 2005 training someone in SEO who had an incredible amount of education and experience in standards based html coding and had been the lead accessibility person on some major federal US government web sites.

What made that a lot of fun was explaining where semantically meaningful uses of standards could make it easier for search engines to understand the contents of pages, and their connections to other pages on the same site.

I recently researched and wrote about how search engines are looking at segmenting pages for handhelds, and it was pretty obvious that having better control over your code and the way it is presented would mean that it would be better presented on a smaller screen through some proxy service from Google, Nokia, or another potential point of access.

That breaking down of pages, and segmenting them, may have implications for SEO in a number of ways, because it may give the search engines opportunities to understand pages better. For example, it might be preferable to pull snippets from the main content of a page instead of a footer or sidebar, if there is no meta description or it doesn't contain the keywords used in the query that the page showed up in search results for. The value of links could be calculated differently depending on whether they were in headings, footers, sidebars, or main content areas. Duplicate content filtering might be more meaningful if it could focus upon the main content section of a page. And keyword or phrase indexing might also be more meaningful if those words came from a main content area rather than, say, a footer.

Sure, even text files will get indexed. But, there's a chance that the text in an <h2> in a sidebar isn't as important to a page as the text in an <h2> in the main content of a page. Is this something a search engine is presently looking at? I don't know.
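The sidebar-versus-main-content idea can be sketched as a toy weighting scheme. This is purely speculative, as the post itself says; no search engine confirms these weights, and the `id` values and numbers here are hypothetical.

```python
# Speculative sketch: weight heading text by the container it sits in.
# The section ids and weight values are invented for illustration only.
from html.parser import HTMLParser

WEIGHTS = {"main": 3.0, "sidebar": 0.5}  # hypothetical weights

class HeadingWeigher(HTMLParser):
    def __init__(self):
        super().__init__()
        self.sections = []      # stack of div ids we are inside
        self.in_heading = False
        self.scored = []        # (heading text, weight)

    def handle_starttag(self, tag, attrs):
        if tag == "div":
            self.sections.append(dict(attrs).get("id", ""))
        elif tag in ("h1", "h2", "h3"):
            self.in_heading = True

    def handle_endtag(self, tag):
        if tag == "div" and self.sections:
            self.sections.pop()
        elif tag in ("h1", "h2", "h3"):
            self.in_heading = False

    def handle_data(self, data):
        if self.in_heading and data.strip():
            section = self.sections[-1] if self.sections else ""
            self.scored.append((data.strip(), WEIGHTS.get(section, 1.0)))

page = ('<div id="main"><h2>Blue Widgets</h2></div>'
        '<div id="sidebar"><h2>Archives</h2></div>')
w = HeadingWeigher()
w.feed(page)
print(w.scored)  # [('Blue Widgets', 3.0), ('Archives', 0.5)]
```

The same <h2> tag gets a different score depending on where it lives, which is exactly the "is a sidebar heading worth as much?" question the post raises.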

Here's a blog post I wrote a while back about how Google might handle navigation when trying to take large pages and put them on small screens:

Google Identifies Navigation Bars for Small Screens, Snippets, and Indexing

Here's a snippet:

The primary focus of this patent is on identifying navigation bars on a page that can safely be re-written or changed in some manner for display on a smaller screen. An integral part of the process involves actually identifying navigation bars. It's probably important that the patent mentions (briefly) that this identification can be helpful in indexing a page and deciding upon which text to use to provide snippets to searchers, which goes beyond the reauthoring process.

Considering the ways in which search engines may want to manipulate the content of a site, and possibly even rewrite parts of it, I want as much control over the code as possible.

So yes, search engine spiders are forgiving of bad code. But, how much control do you want to turn over to them in their ranking and presentation of your site to others?

#6 JohnMu


    Honored One Who Served Moderator Alumni

  • Hall Of Fame
  • 3519 posts

Posted 22 December 2006 - 02:30 PM

And you know, spidering a site "pretty decently" is fine and good, but I don't think that there's anything wrong with wanting to try to do the best you can on a site.

I think that is very important... after all, we're in it to "optimize" the site. We don't want a "good" site, we want a "great" or "the best" site :).

However, for many newbies, it's a matter of getting in or not, and for them I doubt that this is the most important optimization that they are missing :).

Are there any "absolute beginners guide to SEO" out there? Something even a step before Rand's fine guide?


#7 BillSlawski


    Honored One Who Served Moderator Alumni

  • Hall Of Fame
  • 15667 posts

Posted 22 December 2006 - 02:54 PM

I've heard good things about these books:

Gradiva Couzin and Jennifer Grappone's Search Engine Optimization: An Hour a Day

Bill Hunt and Mike Moran's Search Engine Marketing, Inc.: Driving Search Traffic to Your Company's Web Site

Aaron's SEO Book and Dan's Search Engine Marketing Kit also introduce the concepts of SEO at an early level of knowledge, and include advanced topics as well.

#8 whitemark


    Time Traveler Member

  • 1000 Post Club
  • 1071 posts

Posted 22 December 2006 - 03:45 PM

All Google really needs to index a webpage is links and content (text and images). How hard can you mess those two up, even if you code badly? Sorting and ranking is altogether a different issue ...

Edited by whitemark, 22 December 2006 - 03:47 PM.

#9 JohnMu


    Honored One Who Served Moderator Alumni

  • Hall Of Fame
  • 3519 posts

Posted 22 December 2006 - 05:17 PM

How hard can you mess these two up even if you code badly?

You'd be surprised :D. The first step is to acknowledge that those two items are important. Really, for the average uninformed webmaster, it is a big step to even get that far.

Ask in the Google webmasters groups "why isn't my site indexed" and you'll get a share of "it takes 3-5 weeks" and "the sandbox this and that: 6-9 months" and if you come with "links and content" you get a pile of "my site never had links and was indexed" as well as "well yahoo grabbed it anyway, why can't Google".

I don't want to sound condescending, and I love getting them up to speed and to a point where they know what they should be working on, but that's what it's like "out there" with private webmasters or small-business owners. They are not only uninformed, it's worse: they're misinformed. And sometimes it's even worse than that: they learn and use black-hat tricks from the late 90's.

I think that a part of the problem is that everyone wants to get into Google, but you never can get any real numbers out of Google. The link:-query always shows nothing (or some fantasy number), the site:-query is getting equally poetic, everything else is "some of the pages" blah blah blah... Cold hard numbers would be great to show newbies how their site compares to others. "Look: this guy has 10 good links and he's got 100 pages indexed; you only have 2 mediocre links, that's why only your index page is indexed." Instead we have to say "check Yahoo for links, Google will probably know about them even if it says it doesn't, but they're important anyway" -- who in their right mind actually believes that without knowing all the details? I wouldn't. I didn't until I ran my first batch of test-sites.

Validation is great in that regard: "here's a number, work on optimizing it" -- only you'll then get those who say "well, Google's pages don't validate, why should I bother?" At least it's good to see them think about it enough to actually compare :)


#10 rmccarley


    Light Speed Member

  • Members
  • 642 posts

Posted 22 December 2006 - 05:51 PM

W3C validation is not an SEO issue. And that is probably why Google is so elusive about the whole subject... it would be overstepping their bounds. Make sure the site is crawlable, that navigation is in plain old HTML and not JavaScript or Flash, and that there are no serious errors. Check your site in a text browser to be sure (hmmm... that sounds familiar). Everything beyond that is an issue for web designers, not SEOs.

#11 trinorthlighting


    New To Community

  • Members
  • 1 posts

Posted 23 January 2007 - 10:56 AM

We make sure all of our sites are W3C compliant. It takes care of a lot of issues. Reasons why we do it:

1. W3C-compliant sites are 100% crawlable by Googlebot.
2. W3C-compliant sites will be viewable in all browsers: IE, Firefox, etc.
3. Validating pages when we create them catches simple errors that can kill your SERPs, such as unclosed title tags.
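The unclosed-title-tag example is worth spelling out: browsers and spiders treat everything after <title> as raw text until they see </title>, so a missing close tag can swallow the rest of the head into the title. This is a simplified string-based sketch of that rule, not a real HTML parser.

```python
# Sketch of the RCDATA rule for <title>: without a closing </title>,
# everything that follows is treated as title text, swallowing the
# rest of the head (meta description included).
def extract_title(markup):
    start = markup.find("<title>")
    if start == -1:
        return None
    start += len("<title>")
    end = markup.find("</title>", start)
    # no close tag: the remainder of the document counts as title text
    return markup[start:end] if end != -1 else markup[start:]

good = "<head><title>Widgets Inc</title><meta name='description' content='...'></head>"
bad  = "<head><title>Widgets Inc<meta name='description' content='...'></head>"

print(extract_title(good))  # Widgets Inc
print(extract_title(bad))   # the <meta> tag and the rest of the head leak in
```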

#12 Guest_joedolson_*

  • Guests

Posted 23 January 2007 - 11:49 AM

I wrote an article recently on the subject of validation - Beyond Validation. I was writing from the perspective of accessibility, but SEO has some very similar issues. Basically, what I said is that validation is nothing but a minor tool - it doesn't tell you anything about whether you've used semantic code, whether you've created barriers to use via JavaScript, whether you've chosen impossible-to-read color combinations, or whether you've nested 300 tables with all inline styles inside each other to create your design.

Validation is extremely important from a standards commitment perspective - but it's very important to remember that validation really only means that you've written HTML or XHTML which matches the syntax rules laid down by the W3C - and that's really not saying that much. The W3C rules aren't all that strict.
