Cre8asiteforums

Web Site Design, Usability, SEO & Marketing Discussion and Support

jonbey

Blocking Bad Bots With Htaccess

(couldn't decide where this post belongs, so it is here...)

 

I saw a suspicious bot in the logs and remembered what was mentioned here the other day about blocking people (bots) from accessing the site. So I Googled it and came across this solution for .htaccess:

 

RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} ^scraper1 [OR]
RewriteCond %{HTTP_USER_AGENT} ^scraper2
RewriteRule ^.* - [F,L]

 

 

I have uploaded it, checked the site, "fetched as googlebot" in webmaster tools, and all appears OK.

 

So, what does it do?

Is it saying to redirect anything.anything to -

meaning, nowhere? like http://-

or is it doing something else?

 

Any suggestions for what to put in? I have a list, but it may not be perfect or up to date. Shall I post what I am (hopefully) blocking here?


Are you trying to block one scraper, a whole group of scrapers you have identified yourself, or a group of scrapers that someone else has posted somewhere?

 

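The RewriteRule is supposed to serve an error code to those bots. The "-" means "no substitution", so nothing is redirected anywhere; the F flag makes the server answer with 403 Forbidden.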
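To spell out what each line does, here is the same rule set with comments added. This is only an annotated sketch of the snippet quoted above: scraper1 and scraper2 are the placeholder names from that snippet, not real bot names, and you would substitute user-agent strings you have actually seen in your logs.

```apache
# Turn on mod_rewrite for this directory
RewriteEngine On

# Condition 1: the User-Agent header starts with "scraper1" ...
RewriteCond %{HTTP_USER_AGENT} ^scraper1 [OR]
# ... OR condition 2: it starts with "scraper2" (both are placeholder names)
RewriteCond %{HTTP_USER_AGENT} ^scraper2

# For any requested URL, do not rewrite anything ("-").
# F = answer with 403 Forbidden, L = stop processing further rules.
RewriteRule ^.* - [F,L]
```

The ^ anchors the match to the start of the User-Agent string, so a bot that puts its name later in the header would slip through. Dropping the ^ and adding the NC flag (case-insensitive match) is a common way to loosen this, at the cost of a higher chance of blocking something you did not intend to.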


I can't tell how old that article is. It's generally better to set up your own bot trap and just block the bots that actually visit your site. There was an old thread at Webmasterworld that explained how to do that. It's a bit complicated and not for the faint-of-heart.
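This is not the WebmasterWorld bot trap itself, just a rough sketch of the last step: once your own logs show which visitors are a problem, you block those specifically. The IP address (from the reserved documentation range) and the name ExampleScraper are made-up placeholders, not real offenders.

```apache
RewriteEngine On

# Block one specific address seen misbehaving in the logs (placeholder IP)
RewriteCond %{REMOTE_ADDR} ^192\.0\.2\.15$ [OR]
# Block a user agent seen in the logs, matched anywhere in the header,
# ignoring case (placeholder name)
RewriteCond %{HTTP_USER_AGENT} ExampleScraper [NC]

# No substitution; answer with 403 Forbidden
RewriteRule ^ - [F]
```

Keeping the list limited to what actually shows up in your own logs keeps it short and makes it less likely that you block a crawler that sends you real traffic.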


Is a complicated method really a lot better than what I have done? What if I just find, say, a recently written "10 worst scraper bots" article (I assume such an article exists) and use that?

 

Is there any harm in blocking the ones I have blocked?


Unless you're serving 1,000,000 fetches a day there is probably no harm in blocking bots from an old list, although one man's bad bot could be another man's niche search engine that drives real traffic.

 

The people who used to discuss all this stuff openly for the SEO community usually advised anyone willing to do the work to customize these tools to their own sites' needs. Some people were blocking bots from major search engines simply because they didn't want to deal with those search engines. Copying such a list blindly would have been a bad thing for other people, though.

 

I just think you should know which bots are really your problem and that you're dealing with them. If that article was written 5 years ago, that bot list is probably way out of date and won't do you much good if any at all.

