
Cre8asiteforums Internet Marketing
and Conversion Web Design



Blocking Bad Bots With Htaccess


5 replies to this topic

#1 jonbey

jonbey

    Eyes Like Hawk Moderator

  • Moderators
  • 4382 posts

Posted 01 December 2010 - 09:31 PM

(couldn't decide where this post belongs, so it is here...)

I saw a suspicious bot in the logs and remembered what was mentioned here the other day about blocking people (bots) from accessing the site. So I Googled it and came across this solution for htaccess:

RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} ^scraper1 [OR]
RewriteCond %{HTTP_USER_AGENT} ^scraper2
RewriteRule ^.* - [F,L]



I have uploaded it, checked the site, "fetched as googlebot" in webmaster tools, and all appears OK.

So, what does it do?
Is it saying to redirect anything.anything to -
meaning nowhere, like http://-
or is it doing something else?

Any suggestions for what to put in? I have a list, but it may not be perfect or up to date. Shall I post what I am (hopefully) blocking here?

#2 Michael_Martinez

Michael_Martinez

    Time Traveler Member

  • 1000 Post Club
  • 1354 posts

Posted 01 December 2010 - 09:51 PM

Are you trying to block one scraper, a whole group of scrapers you have identified yourself, or a group of scrapers that someone else has posted somewhere?

The RewriteRule is supposed to serve an error code (403 Forbidden, from the F flag) to those bots rather than redirect them anywhere.
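
For reference, here is the snippet from the first post again with comments added. This is only an annotated sketch: the comments reflect my understanding of mod_rewrite, and scraper1 and scraper2 are the placeholders from the original snippet, not real bot names.

```apache
# Switch on mod_rewrite processing for this directory
RewriteEngine On
# Condition 1: the User-Agent header starts with "scraper1" (^ anchors the start)
# [OR] means "this condition OR the next one", instead of the default AND
RewriteCond %{HTTP_USER_AGENT} ^scraper1 [OR]
# Condition 2: the User-Agent header starts with "scraper2"
RewriteCond %{HTTP_USER_AGENT} ^scraper2
# For any requested URL, "-" means "do not rewrite the URL at all";
# F answers with 403 Forbidden, L stops processing further rules
RewriteRule ^.* - [F,L]
```

So the "-" is not a destination like http://-. The requested URL is left alone and the server simply answers 403 when one of the conditions matches.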

#3 jonbey

jonbey

    Eyes Like Hawk Moderator

  • Moderators
  • 4382 posts

Posted 01 December 2010 - 10:23 PM

It was a list I found that I am using.

The bot I spotted, which I think is not good, is IScraperBot/0.1

The list I used was this one: http://www.htaccess-...s-and-bad-bots/

#4 Michael_Martinez

Michael_Martinez

    Time Traveler Member

  • 1000 Post Club
  • 1354 posts

Posted 02 December 2010 - 02:49 PM

I can't tell how old that article is. It's generally better to set up your own bot trap and just block the bots that actually visit your site. There was an old thread at Webmasterworld that explained how to do that. It's a bit complicated and not for the faint of heart.
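
As a simpler starting point than a full bot trap, a rule for just the one bot named earlier in this thread might look like the sketch below. It assumes the bot really sends a User-Agent containing "IScraperBot" (as reported from the logs above). The unanchored pattern and the [NC] flag are my own choices, not something taken from the linked list.

```apache
RewriteEngine On
# Match "IScraperBot" anywhere in the User-Agent header, ignoring case ([NC])
RewriteCond %{HTTP_USER_AGENT} IScraperBot [NC]
# Leave the URL unchanged ("-") and answer 403 Forbidden ([F])
RewriteRule ^ - [F]
```

User agents are easy to fake, so a rule like this only stops bots that identify themselves honestly.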

#5 jonbey

jonbey

    Eyes Like Hawk Moderator

  • Moderators
  • 4382 posts

Posted 02 December 2010 - 03:44 PM

Is a complicated method really a lot better than what I have done? What if I just find, say, a recently written "10 worst scraper bots" article (I assume such an article exists) and use that?

Is there any harm in blocking the ones I have blocked?

#6 Michael_Martinez

Michael_Martinez

    Time Traveler Member

  • 1000 Post Club
  • 1354 posts

Posted 02 December 2010 - 06:01 PM

Unless you're serving 1,000,000 fetches a day there is probably no harm in blocking bots from an old list, although one man's bad bot could be another man's niche search engine that drives real traffic.

The people who used to discuss all this stuff openly for the SEO community usually advised anyone willing to do the work to customize these tools to their own sites' needs. Some people were blocking bots from major search engines simply because they didn't want to deal with those search engines. Copying such a list blindly would have been a bad thing for other people, though.

I just think you should know which bots are really your problem and make sure you're dealing with them. If that article was written 5 years ago, that bot list is probably way out of date and won't do you much good, if any.


