When Does A 404 Error Page Fall Out Of The SE Listings?
#1
Posted 30 April 2007 - 11:49 AM
One of my clients is an attorney and all of his partners have their own biography page on his website. One of the partners just left the firm and I removed her file from the server.
Now when her name shows up in the SE's, it's linked to a 404 error page I created, "Page cannot be found..."
My client doesn't want her name showing up at all in the SE's with a link to his website, even if it's to an error page.
I told my client that I can't control what the SE's put in their listings, or can I? Is there some form I can fill out to ask them to remove this page from their index?
At what point does a 404 error page fall out of the listings?
Thanks.
Risa
#2
Posted 30 April 2007 - 12:19 PM
User-agent: *
Disallow: /this-page/bio.htm
There's no instantaneous method, but these will certainly expedite the process.
It's certainly faster than waiting for the search engines to decide the 404 really means it.
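Before uploading the robots.txt, it may be worth confirming that the rule really matches the page you want blocked. A minimal sketch using Python's standard urllib.robotparser; the www.example.com host and the Googlebot agent name are placeholders I've assumed, not details from the actual site:

```python
from urllib.robotparser import RobotFileParser

# The rule from this post; the path is the placeholder used above.
RULES = """\
User-agent: *
Disallow: /this-page/bio.htm
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())


def is_blocked(url, agent="Googlebot"):
    """Return True if the parsed rules tell `agent` not to crawl `url`."""
    return not parser.can_fetch(agent, url)


# www.example.com is a hypothetical host used only for illustration.
print(is_blocked("http://www.example.com/this-page/bio.htm"))
print(is_blocked("http://www.example.com/index.htm"))
```

This only checks how the rule is parsed locally. It says nothing about when a given engine will next fetch the robots.txt.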
#3
Posted 30 April 2007 - 01:20 PM
Risa
#4
Posted 30 April 2007 - 02:13 PM
However, they both are respectful of robots.txt, so that would be the best way to go for them.
#5
Posted 30 April 2007 - 02:19 PM
There are two other possible strategies:
- You could keep the old URL online, with appropriately adjusted content (or even an empty page). This would make sense if the site (and that page) were being crawled regularly by the top engines. The new content will replace the old content in the search results. Once that has happened for the major engines, you could remove the URL and have it 404.
- You could have the old URL 301 redirect to a related or new URL (a new partner or the general partner page). A 301 redirect is generally processed much faster than a missing URL (404). The 301 redirect could also be used to move the visitors who accidentally still land on the old URL (say from external links) to a replacement URL.
For both of these strategies you would have to make sure that the robots.txt does not block crawling of the URL.
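As a rough illustration of the second strategy, here is a small WSGI sketch in Python that answers the old URL with a 301 and everything else it doesn't know with a 404. The paths in REDIRECTS are made up for the example, and on a real site this would more likely be a rule in the web server configuration than application code:

```python
from wsgiref.util import setup_testing_defaults

# Hypothetical mapping: old biography URL -> replacement page.
REDIRECTS = {"/this-page/bio.htm": "/partners.htm"}


def app(environ, start_response):
    """Answer known old URLs with a 301, anything else with a 404."""
    path = environ.get("PATH_INFO", "/")
    target = REDIRECTS.get(path)
    if target is not None:
        start_response("301 Moved Permanently",
                       [("Location", target), ("Content-Length", "0")])
        return [b""]
    body = b"Page cannot be found"
    start_response("404 Not Found",
                   [("Content-Type", "text/plain"),
                    ("Content-Length", str(len(body)))])
    return [body]


def call(path):
    """Invoke the app in-process and return (status line, Location header)."""
    environ = {}
    setup_testing_defaults(environ)
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers):
        captured["status"] = status
        captured["headers"] = dict(headers)

    b"".join(app(environ, start_response))
    return captured["status"], captured["headers"].get("Location")


print(call("/this-page/bio.htm"))
print(call("/no-such-page.htm"))
```

The call helper is only there so the behaviour can be checked without starting a server.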
Remember that the robots.txt does not remove the content, it just prevents the search engines from re-crawling the content. If you only use the robots.txt, it is possible that the URL remains indexed, with or without content.
The reason search engines keep 404 pages in the index for so long is that a 404 does not say whether the page is missing temporarily or permanently, so it may just be a temporary error. The proper code for a page that is permanently "gone" is 410. However, the search engines do not differentiate between these codes and will treat a 410 the same as a 404 and try to keep the page in the index for as long as it has sufficient value (on Google: through inbound links).
John
PS Yahoo lets you remove URLs through the SiteExplorer as well, but you will have to verify ownership of the site first, which takes about a day to get processed.
#6
Posted 30 April 2007 - 02:59 PM
PS Yahoo lets you remove URLs through the SiteExplorer as well, but you will have to verify ownership of the site first, which takes about a day to get processed.
Hmmm...didn't know that. Couldn't find it, at any rate!
#7
Posted 30 April 2007 - 07:35 PM
#8
Posted 30 April 2007 - 09:26 PM
#9
Posted 30 April 2007 - 10:05 PM
At what point does a 404 error page fall out of the listings?
Is it a 404 response though?
Step 1: get webbug.
Step 2: Put the URL in the box, check the HTTP 1.1 radio button.
Step 3: click the GET button.
Is it returning a 404, or is it returning the awful 302 many sites return? If it's a 302, which BTW isn't a redirect but a Found response, then the SE assumes the page still exists.
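If you don't have webbug handy, the same check can be sketched in a few lines of Python. This uses the standard http.client module, which does not follow redirects, so you see the raw status the server sends. The function names fetch_status and describe are my own, and any URL you pass in is up to you:

```python
import http.client
from http import HTTPStatus
from urllib.parse import urlsplit


def fetch_status(url, timeout=10.0):
    """GET the URL without following redirects; return (status, Location)."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("GET", path)
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()


def describe(status):
    """Turn a numeric status into text such as '404 Not Found'."""
    try:
        phrase = HTTPStatus(status).phrase
    except ValueError:
        phrase = "Unknown"
    return f"{status} {phrase}"
```

Calling fetch_status on the old biography URL and passing the first value to describe should tell you whether the server is really answering 404 or something else, such as a 302 with a Location header.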
The best bet is to do as Joe suggests and robots.txt block the page (maybe throw on a robots noindex metatag as well).
#10
Posted 01 May 2007 - 10:37 AM
Risa