
Hi again

 

I have posted about bad bots in the search engine forum, but really I think it all belongs here in the security forum. I would like to maintain this thread in one place and share some of the stuff I run into.

 

The latest bad actor I found comes from utel.net.ua, a Ukrainian source that also seems to serve its neighbor Poland.

 

Running this Linux command on the Apache access log

 

tail -f /var/log/httpd/access_log

 

showed continuing, repeated hits across several domains.

 

The command

 

grep -F utel.net.ua /var/log/httpd/access_log | awk '{print $2}' | sort -u > utel.net.ua

 

produced a list of results

 

213.186.119.131.utel.net.ua
213.186.119.132.utel.net.ua
213.186.119.133.utel.net.ua
213.186.119.134.utel.net.ua
213.186.119.135.utel.net.ua
213.186.119.136.utel.net.ua
213.186.119.137.utel.net.ua
213.186.119.138.utel.net.ua
213.186.119.139.utel.net.ua
213.186.119.140.utel.net.ua
213.186.119.141.utel.net.ua
213.186.119.142.utel.net.ua
213.186.119.143.utel.net.ua
213.186.119.144.utel.net.ua
213.186.120.196.utel.net.ua
213.186.122.2.utel.net.ua
213.186.122.3.utel.net.ua
213.186.127.10.utel.net.ua
213.186.127.12.utel.net.ua
213.186.127.13.utel.net.ua
213.186.127.14.utel.net.ua
213.186.127.28.utel.net.ua
213.186.127.2.utel.net.ua
213.186.127.3.utel.net.ua
213.186.127.4.utel.net.ua
213.186.127.5.utel.net.ua
213.186.127.6.utel.net.ua
213.186.127.7.utel.net.ua
213.186.127.8.utel.net.ua
213.186.127.9.utel.net.ua
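
For anyone who would rather see how hard each host is hitting, the pipeline above can be sketched in Python with a per-host count added. This is only a rough sketch: it assumes, like the awk command, that the client hostname is the second whitespace-separated field of each log line (that depends on the LogFormat in use), it matches the name in that field only rather than anywhere in the line, and the sample lines are made up for illustration.

```python
from collections import Counter

def hits_per_host(lines, needle="utel.net.ua", field=1):
    """Count log lines per client host, for hosts containing `needle`.

    `field` is the zero-based index of the host column; 1 corresponds to
    awk's $2 and is an assumption about the log format.
    """
    counts = Counter()
    for line in lines:
        parts = line.split()
        if len(parts) > field and needle in parts[field]:
            counts[parts[field]] += 1
    return counts

# Made-up sample lines: virtual host first, then the client host.
sample = [
    'example.com 213.186.119.131.utel.net.ua - - [01/Jan/2013:00:00:01 +0000] "GET / HTTP/1.1" 200 512',
    'example.com 213.186.119.131.utel.net.ua - - [01/Jan/2013:00:00:02 +0000] "GET /a HTTP/1.1" 200 512',
    'example.org 213.186.127.9.utel.net.ua - - [01/Jan/2013:00:00:03 +0000] "GET / HTTP/1.1" 200 512',
    'example.org 192.0.2.10 - - [01/Jan/2013:00:00:04 +0000] "GET / HTTP/1.1" 200 512',
]

counts = hits_per_host(sample)
for host, n in counts.most_common():
    print(n, host)
```

To run it on a real log, pass an open file object instead of the sample list.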

 

First I checked that these hostnames resolve to the IP addresses they contain, using nslookup

 

nslookup 213.186.119.131.utel.net.ua
Server: 8.8.8.8
Address: 8.8.8.8#53

Non-authoritative answer:
Name: 213.186.119.131.utel.net.ua
Address: 213.186.119.131

 

Then went to http://ip2cidr.com/, entered the first and last IP numbers, and this produced the list

 

213.186.119.131/32
213.186.119.132/30
213.186.119.136/29
213.186.119.144/28
213.186.119.160/27
213.186.119.192/26
213.186.120.0/22
213.186.124.0/23
213.186.126.0/24
213.186.127.0/29
213.186.127.8/31

 

Not 100% useful, OK? Maybe I'm just not smart enough.....
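
For what it's worth, that list is the smallest set of CIDR blocks that covers exactly the range from the first address to the last one and nothing outside it, which is why it comes out so fragmented. A small Python sketch using the standard `ipaddress` module produces the same kind of list; the two endpoints are the first and last entries of the hostname list above.

```python
import ipaddress

first = ipaddress.IPv4Address("213.186.119.131")
last = ipaddress.IPv4Address("213.186.127.9")

# Smallest set of CIDR blocks covering exactly first..last, nothing more.
blocks = list(ipaddress.summarize_address_range(first, last))
for block in blocks:
    print(block)

# Total number of addresses covered by all the blocks together.
covered = sum(block.num_addresses for block in blocks)
print("addresses covered:", covered)
```

Each block would need its own firewall rule, which is the practical reason for rounding up to a single larger block instead.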

 

Then on to http://magic-cookie.co.uk/iplist.html, where I entered the first IP on the original list and played with the number after the slash (the prefix length), which sets how large the netblock is. A few experiments came up with 213.186.119.131/19, which is the block 213.186.96.0/19: the 8192 IP numbers blocked stretch from 213.186.96.0 to 213.186.127.255.
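
The same arithmetic can be checked offline. This is a minimal sketch with Python's standard `ipaddress` module, not a replacement for the web tool; `strict=False` is needed because the address given is a host inside the block rather than the block's first address.

```python
import ipaddress

# strict=False lets us pass a host address; the host bits are masked off.
net = ipaddress.ip_network("213.186.119.131/19", strict=False)

print(net)                    # 213.186.96.0/19
print(net.network_address)    # 213.186.96.0
print(net.broadcast_address)  # 213.186.127.255
print(net.num_addresses)      # 8192
```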

 

Then checked with http://www.maxmind.com and ran both the first and last IP numbers. They both belong to utel, so that makes it pretty certain that everything in-between is theirs also.

 

Then came the command (as root of course)

 

/sbin/iptables -p tcp -I INPUT -j DROP -s 213.186.119.131/19 && /etc/init.d/iptables save && /etc/init.d/sshd restart

 

(some of these paths may vary between Linux flavors; this is CentOS)

 

Now I sit watching results for tail -f /var/log/httpd/access_log | grep utel

 

Nothing. Zero. Zip. Nada.

 

Of course this method runs the risk of blocking traffic that you might want -- for example, possibly some users of utel wireless smart phones might not be able to access my sites -- but it seems to me that bots from Ukraine are not doing me a lot of good.

 

Hope this info is useful

 

Cheers, Mike


To complete what bobbb started:

There are five Regional Internet Registries (RIRs). Their whois search services:

1. American Registry for Internet Numbers (ARIN): Canada, USA, Antarctica, parts of Caribbean.

http://whois.arin.net/ui/

 

2. Réseaux IP Européens Network Coordination Centre (RIPE NCC): Europe, Russia, Middle East, Central Asia.

https://apps.db.ripe...arch/query.html

 

3. Asia Pacific Network Information Centre (APNIC): Asia (except Russia, Central Asia), Australia, New Zealand, Japan, Philippines, Indonesia, etc.

http://wq.apnic.net/apnic-bin/whois.pl

 

4. Internet Address Registry for Latin America and the Caribbean (LACNIC): Mexico, Central and South America, parts of Caribbean.

http://lacnic.net/cgi-bin/lacnic/whois

 

5. Internet Numbers Registry for Africa (AfriNIC): Africa, Madagascar.

http://www.afrinic.n...ces/whois-query

 

Each offers a number of public and member services and applications beyond whois, worth a look.


Another bad actor:

 

clients.your-server.de

 

/sbin/iptables -p tcp -I INPUT -j DROP -s 176.9.0.118/18
/sbin/iptables -p tcp -I INPUT -j DROP -s 188.40.39.212/17
/sbin/iptables -p tcp -I INPUT -j DROP -s 213.133.123.53
/sbin/iptables -p tcp -I INPUT -j DROP -s 213.239.193.170
/sbin/iptables -p tcp -I INPUT -j DROP -s 46.4.100.231/17
/sbin/iptables -p tcp -I INPUT -j DROP -s 5.9.22.170/17
/sbin/iptables -p tcp -I INPUT -j DROP -s 78.46.145.100/17
/sbin/iptables -p tcp -I INPUT -j DROP -s 88.198.234.84/17

 

That's a lotta IP addresses to block, all belonging to Hetzner Online AG
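
To put a number on "a lotta", here is a rough Python sketch that expands each source argument from the commands above into its actual network and adds up the sizes. It uses only the standard `ipaddress` module; an address with no prefix is treated as a single /32.

```python
import ipaddress

# Source arguments from the iptables commands above, copied as written.
sources = [
    "176.9.0.118/18",
    "188.40.39.212/17",
    "213.133.123.53",
    "213.239.193.170",
    "46.4.100.231/17",
    "5.9.22.170/17",
    "78.46.145.100/17",
    "88.198.234.84/17",
]

# strict=False masks off the host bits, as iptables effectively does.
networks = [ipaddress.ip_network(s, strict=False) for s in sources]
total = sum(n.num_addresses for n in networks)

for n in networks:
    print(f"{str(n):20} {n.num_addresses:>6}")
print("total addresses:", total)
```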

 

http://www.hetzner.d.../rechenzentrum/

 

Tell me if I'm shooting myself in the foot.....

 

BTW it dropped my server cpu load from 12.64 to 1.21...

 

Cheers

Mike


There are also a lot of bad players on the amazonaws.com IP ranges. I guess it depends on a definition of bad player.


Yep, I try not to take a moral stance, I run spiders also from time to time (although I keep them single-threaded, one query at a time).....but when I see my server bogging down, it's just self-defense


Hello all

 

Just found another bad boy, choopa.net

 

 

/sbin/iptables -p tcp -I INPUT -j DROP -s 173.199.114.115/20

 

/etc/init.d/iptables save && /etc/init.d/sshd restart


But how do you know that the whole range is bad?

choopa is just a web hosting company like GoDaddy or HostGator, using the range 173.199.64.0 through 173.199.127.255.

You have just blocked 173.199.112.0 to 173.199.127.255, which is 4096 addresses (4094 usable hosts).

There are obviously no users surfing out of that range, just possible bots.
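
A quick way to double-check those numbers, sketched with Python's standard `ipaddress` module. Reading the hosting range quoted above as 173.199.64.0/18 is my assumption; `subnet_of` needs Python 3.7 or later.

```python
import ipaddress

# The rule as entered; strict=False masks off the host bits.
blocked = ipaddress.ip_network("173.199.114.115/20", strict=False)

# Assumption: the hosting range quoted above, written as a single /18.
hosting = ipaddress.ip_network("173.199.64.0/18")

print(blocked)                                         # 173.199.112.0/20
print(blocked.num_addresses)                           # 4096, every address in the block
print(blocked.num_addresses - 2)                       # 4094, without network and broadcast
print(blocked.subnet_of(hosting))                      # True
print(blocked.num_addresses / hosting.num_addresses)   # 0.25
```

So the /20 rule covers a quarter of that hosting range, not all of it.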


Hey Bob, that's a very good point, one I have puzzled over again and again. As you point out, hosting companies generally are not ISPs providing access for visitors or viewers.

 

Therefore, are they doing me any good?

 

Many of them do not do much enforcement... AWS was a joke: their site said to fill out a form, and the form didn't work! Also, many, like AWS, rotate their IPs in a random, cloud-like way, so if you block an IP today, the bot comes back with another IP tomorrow. I did check that the first and last IP numbers belong to choopa.

 

For me the bottom line in this risk/reward analysis is that I don't have the time to be screwing around catching unfriendly bots. Scraping content doesn't really bother me that much, but server load is a serious problem.

 

Cheers

Mike


Just an FYI

 

hostnoc.net just made my hit list.

 

Since I don't have access to iptables, I need to do it in .htaccess, so I have to weigh the cost of loading .htaccess against is-this-guy-enough-of-a-pain-in-the-butt. My evaluation is the reverse of yours. I know I can't win, but here is my chance to give a scraper the 403 finger. They have more IPs than the ones below, but those did not meet the criteria.

 

184.22.0.0/16

184.82.0.0/16


Hey Bob

 

I took a look, and hostnoc.net shows up repeatedly in my access_log, but it is not being rude. You have a point though: they are not doing me any good, and the content may well be competing with mine...

 

Cheers

Mike


Hello,

 

I've read someplace that you can put in poisoned files to catch scrapers more or less automatically. The idea was that bad bots will ignore robots.txt while good bots won't. So you create a page, put it off limits to robots, make it invisible to regular visitors, and then, as I recall, there was a way to automatically ban any IP that visited that page. Not sure if something like that would work in this case or not, but I thought I'd mention it.
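
Just to illustrate the idea, here is a rough Python sketch that scans access log lines for requests to the trap page and lists the clients that took the bait. It is not the script I read about. The trap path `/trap/` is hypothetical (it would be disallowed in robots.txt), and the regular expression assumes Apache's "combined" log format with the client address in the first field; the sample lines are made up.

```python
import re

# Assumes Apache "combined" log format with the client address first.
LOG_RE = re.compile(
    r'^(?P<client>\S+) \S+ \S+ \[[^\]]+\] "(?P<method>\S+) (?P<path>\S+)'
)

TRAP_PATH = "/trap/"  # hypothetical path, disallowed in robots.txt

def trapped_clients(lines, trap_path=TRAP_PATH):
    """Return the set of client addresses that requested the trap path."""
    clients = set()
    for line in lines:
        m = LOG_RE.match(line)
        if m and m.group("path").startswith(trap_path):
            clients.add(m.group("client"))
    return clients

# Made-up sample lines for illustration.
sample = [
    '192.0.2.10 - - [01/Jan/2013:00:00:01 +0000] "GET /index.html HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
    '198.51.100.7 - - [01/Jan/2013:00:00:02 +0000] "GET /trap/page.html HTTP/1.1" 200 128 "-" "Java/1.6.0"',
    '198.51.100.7 - - [01/Jan/2013:00:00:03 +0000] "GET /about.html HTTP/1.1" 200 256 "-" "Java/1.6.0"',
]

offenders = trapped_clients(sample)
for client in sorted(offenders):
    print(client)
```

The resulting addresses could then be fed to whatever blocking method is available, after a manual look to make sure no wanted crawler is in there.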

 

Walter


As you say, bad bots don't read robots.txt, so they may never notice. We call them bots, but really they are just scrapers with a robot-type program that follows links from your main page and then drills down. Sometimes you can see them trying to access sitemap.xml. They are dummies; sitemaps don't have to be called that.

 

I tried the idea of a secret file some time back but no one ever came.


Morning,

 

I found that article. It's actually quite old and was a forum post. They used a Perl script. The script bans the IP and sends an email notification to the admin. It's probably outdated, and maybe there are better solutions now, but if anyone wants to take a look at it, you can find it at:

 

http://www.webmasterworld.com/forum13/1823.htm

 

Seemed like a clever idea to me. Although, there is a thread running here on Cre8t a site that says Google bots don't always follow robots.txt either, so maybe the idea is flawed from the start.

 

Something I use is Project Honey Pot and it can be found here:

 

https://www.projecthoneypot.org/index.php

 

They collect and share "bad" ips.

 

I'll also say that I've geo-blocked certain parts of the globe. It's drastically cut down on the security exceptions for my site. I know it's not an option for everyone, but in my case it made sense.

 

Walter


Maybe I'll install a "secret" file again and see who trips the wire. I have never barred Google so I can't know.

 

Had a look at the Project Honey Pot link. The top user agent is Java. They have been on my hit list for years, along with libwww-perl and Python. The only reason to use those is to scrape. No surprise on the top harvester country.
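
For illustration only, a check like that can be as simple as looking at how the User-Agent string starts. This Python sketch uses the three library names mentioned above; the sample agent strings are made up, and a prefix match like this is easy for a scraper to dodge by sending a browser-like agent.

```python
# Prefixes taken from the post above; matching is case-insensitive.
SUSPECT_PREFIXES = ("java", "libwww-perl", "python")

def is_suspect_agent(user_agent):
    """True if the User-Agent string starts with one of the listed library names."""
    return user_agent.strip().lower().startswith(SUSPECT_PREFIXES)

# Made-up sample agent strings for illustration.
agents = [
    "Java/1.6.0_26",
    "libwww-perl/5.805",
    "Python-urllib/2.7",
    "Mozilla/5.0 (Windows NT 6.1; rv:21.0) Gecko/20100101 Firefox/21.0",
]

for ua in agents:
    print(is_suspect_agent(ua), ua)
```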


Just nailed another one, xlhost.com,

 

/sbin/iptables -p tcp -I INPUT -j DROP -s 209.190.0.0/17

 

I'm sure this won't get everybody on that hosting system, but it stopped whoever was hammering my server up to 56% cpu consumption

 

Cheers


I've picked up another one. Oh, they are good. Hard to pick up.

 

All seems to be coming from someone called 5280enterprises.com/proxy51.com. Never uses the same user agent on requests spread across 7 IP ranges using 5 different suppliers (DataShack, wholesaleinternet.net, Eonix.net, lionlink.net, EGIHosting). The IPs resolve to all kinds of domain names.

 

It was the agent and referer that gave it away. I just happened to look at them and said "What are all these agents about?" and "How come my opening page is a referer to so many pages."
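
That kind of check can be roughed out in Python: group the log by client address and flag any address that sends many different user agents. This is only a sketch. It assumes Apache's "combined" log format with the client address in the first field, the threshold of 3 is arbitrary, and the sample lines are made up.

```python
import re
from collections import defaultdict

# Assumes "combined" format: client ... "request" status bytes "referer" "user-agent"
LOG_RE = re.compile(
    r'^(?P<client>\S+) \S+ \S+ \[[^\]]+\] "[^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

def agents_per_client(lines):
    """Map each client address to the set of User-Agent strings it sent."""
    seen = defaultdict(set)
    for line in lines:
        m = LOG_RE.match(line)
        if m:
            seen[m.group("client")].add(m.group("agent"))
    return seen

def rotating_clients(lines, threshold=3):
    """Clients that used at least `threshold` distinct agents (threshold is arbitrary)."""
    return {c: a for c, a in agents_per_client(lines).items() if len(a) >= threshold}

# Made-up sample lines for illustration.
sample = [
    '203.0.113.5 - - [01/Jun/2013:10:00:01 +0000] "GET /a HTTP/1.1" 200 100 "http://example.com/" "Mozilla/4.0 (compatible; MSIE 7.0)"',
    '203.0.113.5 - - [01/Jun/2013:10:00:02 +0000] "GET /b HTTP/1.1" 200 100 "http://example.com/" "Opera/9.80"',
    '203.0.113.5 - - [01/Jun/2013:10:00:03 +0000] "GET /c HTTP/1.1" 200 100 "http://example.com/" "Mozilla/5.0 (X11; Linux)"',
    '192.0.2.10 - - [01/Jun/2013:10:00:04 +0000] "GET /a HTTP/1.1" 200 100 "-" "Mozilla/5.0 (Windows NT 6.1)"',
]

flagged = rotating_clients(sample)
for client, agents in flagged.items():
    print(client, len(agents))
```

Grouping by netblock instead of by single address would be closer to what happened here, since the requests were spread across several IP ranges.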


Hi All, just updating the amazonaws IP list

 

from https://forums.aws.amazon.com/ann.jspa?annID=1701
May 24, 2013

US East (Northern Virginia):

72.44.32.0/19 (72.44.32.0 - 72.44.63.255)
67.202.0.0/18 (67.202.0.0 - 67.202.63.255)
75.101.128.0/17 (75.101.128.0 - 75.101.255.255)
174.129.0.0/16 (174.129.0.0 - 174.129.255.255)
204.236.192.0/18 (204.236.192.0 - 204.236.255.255)
184.73.0.0/16 (184.73.0.0 – 184.73.255.255)
184.72.128.0/17 (184.72.128.0 - 184.72.255.255)
184.72.64.0/18 (184.72.64.0 - 184.72.127.255)
50.16.0.0/15 (50.16.0.0 - 50.17.255.255)
50.19.0.0/16 (50.19.0.0 - 50.19.255.255)
107.20.0.0/14 (107.20.0.0 - 107.23.255.255)
23.20.0.0/14 (23.20.0.0 – 23.23.255.255)
54.242.0.0/15 (54.242.0.0 – 54.243.255.255)
54.234.0.0/15 (54.234.0.0 – 54.235.255.255)
54.236.0.0/15 (54.236.0.0 – 54.237.255.255)
54.224.0.0/15 (54.224.0.0 - 54.225.255.255)
54.226.0.0/15 (54.226.0.0 - 54.227.255.255)
54.208.0.0/15 (54.208.0.0 - 54.209.255.255)
54.210.0.0/15 (54.210.0.0 - 54.211.255.255)
54.221.0.0/16 (54.221.0.0 - 54.221.255.255) NEW

US West (Oregon):

50.112.0.0/16 (50.112.0.0 - 50.112.255.255)
54.245.0.0/16 (54.245.0.0 – 54.245.255.255)
54.244.0.0/16 (54.244.0.0 - 54.244.255.255)
54.214.0.0/16 (54.214.0.0 - 54.214.255.255)
54.212.0.0/15 (54.212.0.0 - 54.213.255.255) NEW
54.218.0.0/16 (54.218.0.0 - 54.218.255.255) NEW

US West (Northern California):

204.236.128.0/18 (204.236.128.0 - 204.236.191.255)
184.72.0.0/18 (184.72.0.0 – 184.72.63.255)
50.18.0.0/16 (50.18.0.0 - 50.18.255.255)
184.169.128.0/17 (184.169.128.0 - 184.169.255.255)
54.241.0.0/16 (54.241.0.0 – 54.241.255.255)
54.215.0.0/16 (54.215.0.0 – 54.215.255.255)
54.219.0.0/16 (54.219.0.0 - 54.219.255.255) NEW

EU (Ireland):

79.125.0.0/17 (79.125.0.0 - 79.125.127.255)
46.51.128.0/18 (46.51.128.0 - 46.51.191.255)
46.51.192.0/20 (46.51.192.0 - 46.51.207.255)
46.137.0.0/17 (46.137.0.0 - 46.137.127.255)
46.137.128.0/18 (46.137.128.0 - 46.137.191.255)
176.34.128.0/17 (176.34.128.0 - 176.34.255.255)
176.34.64.0/18 (176.34.64.0 – 176.34.127.255)
54.247.0.0/16 (54.247.0.0 – 54.247.255.255)
54.246.0.0/16 (54.246.0.0 – 54.246.255.255)
54.228.0.0/16 (54.228.0.0 - 54.228.255.255)
54.216.0.0/15 (54.216.0.0 - 54.217.255.255)
54.229.0.0/16 (54.229.0.0 - 54.229.255.255)
54.220.0.0/16 (54.220.0.0 - 54.220.255.255) NEW

Asia Pacific (Singapore):

175.41.128.0/18 (175.41.128.0 - 175.41.191.255)
122.248.192.0/18 (122.248.192.0 - 122.248.255.255)
46.137.192.0/18 (46.137.192.0 - 46.137.255.255)
46.51.216.0/21 (46.51.216.0 - 46.51.223.255)
54.251.0.0/16 (54.251.0.0 – 54.251.255.255)
54.254.0.0/16 (54.254.0.0 – 54.254.255.255)
54.255.0.0/16 (54.255.0.0 – 54.255.255.255)

Asia Pacific (Sydney):

54.252.0.0/16 (54.252.0.0 – 54.252.255.255)
54.253.0.0/16 (54.253.0.0 – 54.253.255.255)

Asia Pacific (Tokyo):

175.41.192.0/18 (175.41.192.0 - 175.41.255.255)
46.51.224.0/19 (46.51.224.0 - 46.51.255.255)
176.32.64.0/19 (176.32.64.0 - 176.32.95.255)
103.4.8.0/21 (103.4.8.0 - 103.4.15.255)
176.34.0.0/18 (176.34.0.0 - 176.34.63.255)
54.248.0.0/15 (54.248.0.0 - 54.249.255.255)
54.250.0.0/16 (54.250.0.0 - 54.250.255.255)
54.238.0.0/16 (54.238.0.0 - 54.238.255.255) NEW

South America (Sao Paulo):

177.71.128.0/17 (177.71.128.0 - 177.71.255.255)
54.232.0.0/16 (54.232.0.0 – 54.232.255.255)
54.233.0.0/18 (54.233.0.0 – 54.233.63.255)

GovCloud:

96.127.0.0/18 (96.127.0.0 - 96.127.63.255)
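
If you want to check whether an address from your log falls inside one of these ranges, a minimal Python sketch with the standard `ipaddress` module could look like this. Only a handful of the ranges above are included to keep it short, and the two test addresses are made up.

```python
import ipaddress

# A few of the ranges listed above; extend with the rest as needed.
AWS_RANGES = [ipaddress.ip_network(c) for c in (
    "72.44.32.0/19",
    "174.129.0.0/16",
    "50.16.0.0/15",
    "54.245.0.0/16",
    "79.125.0.0/17",
)]

def in_listed_ranges(address, ranges=AWS_RANGES):
    """Return the first listed network containing `address`, or None."""
    ip = ipaddress.ip_address(address)
    for net in ranges:
        if ip in net:
            return net
    return None

print(in_listed_ranges("50.17.200.1"))  # 50.16.0.0/15
print(in_listed_ranges("192.0.2.10"))   # None
```

Keep in mind the list is dated May 24, 2013, so the ranges will have changed since.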

Cheers

Mike


I pinned this for you.
