
Leading Community for Usability, Search Engine Marketing,
Social Networking, Site Planning & Web Site Development, Since 1998


When Does A 404 Error Page Fall Out Of The SE Listings?


9 replies to this topic

#1 RisaBB

RisaBB

    Eyes Like Hawk Moderator

  • Moderators
  • 1436 posts

Posted 30 April 2007 - 11:49 AM

Hello,

One of my clients is an attorney and all of his partners have their own biography page on his website. One of the partners just left the firm and I removed her file from the server.

Now when her name shows up in the SE's, it's linked to a 404 error page I created, "Page cannot be found..."

My client doesn't want her name showing up at all in the SE's with a link to his website, even if it's to an error page.

I told my client that I can't control what the SE's put in their listings, or can I? Is there some form I can fill out to ask them to remove this page from their index?

At what point does a 404 error page fall out of the listings?

Thanks.

Risa

#2 joedolson

joedolson

    Eyes Like Hawk Moderator

  • Technical Administrators
  • 2870 posts
  • Twitter:http://twitter.com/joedolson
  • Facebook:http://facebook.com/joedolson

Posted 30 April 2007 - 12:19 PM

Using Google's webmaster console, you can now request removal of a specific URL, which would be pretty effective. You can also use robots.txt to block the page from being crawled:

User-agent: *
Disallow: /this-page/bio.htm

There's no instantaneous method, but these will certainly expedite the process.

It's certainly faster than waiting for the search engines to decide the 404 really means it.
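
Before uploading a rule like the one above, it can help to confirm that it matches the URL you mean to block. Here is a minimal sketch using Python's standard urllib.robotparser. The path is the placeholder from this post, and is_blocked is a helper name made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# The rules quoted in the post; the path is a placeholder, not a real URL.
ROBOTS_TXT = """\
User-agent: *
Disallow: /this-page/bio.htm
"""


def is_blocked(robots_txt: str, path: str, agent: str = "*") -> bool:
    """Return True if the robots.txt text disallows `path` for `agent`."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(agent, path)


if __name__ == "__main__":
    print(is_blocked(ROBOTS_TXT, "/this-page/bio.htm"))
    print(is_blocked(ROBOTS_TXT, "/this-page/other.htm"))
```

This only tells you how a standards-following parser reads the file. It says nothing about how quickly any particular engine acts on it.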

#3 RisaBB

RisaBB

    Eyes Like Hawk Moderator

  • Moderators
  • 1436 posts

Posted 30 April 2007 - 01:20 PM

Thanks, Joe. Do MSN and Yahoo have this option? It is in MSN that this attorney is showing up #1 in the listings.

Risa

#4 joedolson

joedolson

    Eyes Like Hawk Moderator

  • Technical Administrators
  • 2870 posts
  • Twitter:http://twitter.com/joedolson
  • Facebook:http://facebook.com/joedolson

Posted 30 April 2007 - 02:13 PM

Neither MSN nor Yahoo! has anything like Google's webmaster console. Yahoo!'s SiteExplorer is related, but doesn't offer much in the way of tools; it just gives you a little extra information.

However, they both are respectful of robots.txt, so that would be the best way to go for them.

#5 JohnMu

JohnMu

    Honored One Who Served Moderator Alumni

  • Hall Of Fame
  • 3518 posts

Posted 30 April 2007 - 02:19 PM

Google will cache a page that is returning 404 for quite some time (sometimes years) - it makes sense to manually remove the page there. The other search engines are much faster.

There are two other possible strategies:

- You could keep the old URL online, with appropriately adjusted content (or even an empty page). This would make sense if the site (and that page) were being crawled regularly by the top engines. The new content will replace the old content in the search results. Once that has happened for the major engines, you could remove the URL and have it return a 404.

- You could have the old URL 301 redirect to a related or new URL (a new partner or the general partner page). A 301 redirect is generally processed much faster than a missing URL (404). The 301 redirect could also be used to move the visitors who accidentally still land on the old URL (say from external links) to a replacement URL.

For both of these strategies you would have to make sure that the robots.txt does not block crawling of the URL.
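
To make the 301 strategy concrete, here is a rough sketch of a tiny server that permanently redirects a removed bio page to a replacement page. Both paths are hypothetical, and on a real site you would normally set this up in the web server configuration rather than in Python. The point is only to show the status code and the Location header that the crawler sees.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical paths: the removed bio page and the page that replaces it.
REDIRECTS = {"/partners/former-partner.htm": "/partners/index.htm"}


def resolve(path: str):
    """Return (status, location) for a request path."""
    target = REDIRECTS.get(path)
    if target is not None:
        return 301, target  # moved permanently
    return 404, None  # everything else in this sketch is simply missing


class RedirectHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, location = resolve(self.path)
        self.send_response(status)
        if location:
            self.send_header("Location", location)
        self.send_header("Content-Length", "0")
        self.end_headers()

    # Answer HEAD requests the same way (no body is sent in either case).
    do_HEAD = do_GET


if __name__ == "__main__":
    # Port 8000 is an arbitrary choice for local testing.
    HTTPServer(("127.0.0.1", 8000), RedirectHandler).serve_forever()
```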

Remember that the robots.txt does not remove the content, it just prevents the search engines from re-crawling the content. If you only use the robots.txt, it is possible that the URL remains indexed, with or without content.

The reason search engines keep 404 pages in the index for so long is that the 404 error code is technically a temporary error. The proper code for a page that is "gone" is 410. However, the search engines do not differentiate between these codes and will treat a 410 the same as a 404 and try to keep the page in the index for as long as it has sufficient value (on Google: through inbound links).

John

PS Yahoo lets you remove URLs through the SiteExplorer as well, but you will have to verify ownership of the site first, which takes about a day to get processed.

#6 joedolson

joedolson

    Eyes Like Hawk Moderator

  • Technical Administrators
  • 2870 posts
  • Twitter:http://twitter.com/joedolson
  • Facebook:http://facebook.com/joedolson

Posted 30 April 2007 - 02:59 PM

JohnMu wrote: "PS Yahoo lets you remove URLs through the SiteExplorer as well, but you will have to verify ownership of the site first, which takes about a day to get processed."

Hmmm...didn't know that. Couldn't find it, at any rate!

#7 phaithful

phaithful

    Light Speed Member

  • Members
  • 800 posts

Posted 30 April 2007 - 07:35 PM

You can read more about it on the Yahoo! Search Blog under Delete URLs.

#8 bobbb

bobbb

    Time Traveler Member

  • 1000 Post Club
  • 1449 posts

Posted 30 April 2007 - 09:26 PM

If archive.org has it, it may take even more years to get rid of it. I found something there that I had lost and it dated to 2000.

#9 projectphp

projectphp

    Honored One Who Served Moderator Alumni

  • Hall Of Fame
  • 3934 posts
  • Twitter:motherwell
  • Facebook:http://www.facebook.com/mmotherwell

Posted 30 April 2007 - 10:05 PM

RisaBB wrote: "At what point does a 404 error page fall out of the listings?"

Is it a 404 response though?

Step 1: get webbug.
Step 2: Put the URL in the box, check the HTTP 1.1 radio button.
Step 3: Click the GET button.

Is it returning a 404, or is it returning the awful 302 many sites return? If a 302, which BTW isn't a redirect but a Found response, then the SE assumes the page still exists.

The best bet is to do as Joe suggests and robots.txt block the page (maybe throw on a robots noindex metatag as well).
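
If you don't have a header-checking tool handy, the same check can be sketched in Python with the standard http.client module, which does not follow redirects, so you see the first status code the server sends. The URL below is a made-up placeholder, and the wording in describe is just a summary of this thread, not anything official from the search engines.

```python
import http.client
from urllib.parse import urlsplit


def fetch_status(url: str, timeout: float = 10.0):
    """Send one GET request and return (status, Location header).

    http.client never follows redirects, so a 301 or 302 shows up as-is.
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("GET", path)
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()


def describe(status: int) -> str:
    """Rough reading of a status code, following the discussion above."""
    if status in (404, 410):
        return "missing: the server says the page is not there"
    if status == 301:
        return "permanent redirect: the old URL points to a new one"
    if status == 302:
        return "found: the old URL may be treated as still existing"
    if 200 <= status < 300:
        return "ok: the page is still being served"
    return "other: look up this status code"


if __name__ == "__main__":
    # Placeholder URL; replace it with the removed page's address.
    status, location = fetch_status("http://www.example.com/this-page/bio.htm")
    print(status, location, describe(status))
```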

#10 RisaBB

RisaBB

    Eyes Like Hawk Moderator

  • Moderators
  • 1436 posts

Posted 01 May 2007 - 10:37 AM

Thanks, everybody!

Risa
