How to block bad bots on a LiteSpeed server?

masood_y

Well-Known Member
#1
I have a list of bots that I need to block across the whole LiteSpeed server, not just a single website.
How can I do it?

SetEnvIfNoCase User-Agent ^$ bad_bots
SetEnvIfNoCase User-Agent "AhrefsBot" bad_bots
SetEnvIfNoCase User-Agent "AITCSRobot" bad_bots
SetEnvIfNoCase User-Agent "Alexibot" bad_bots
SetEnvIfNoCase User-Agent "Arachnophilia" bad_bots
SetEnvIfNoCase User-Agent "archive\.org\_bot" bad_bots
SetEnvIfNoCase User-Agent "ASpider" bad_bots
SetEnvIfNoCase User-Agent "BackDoorBot" bad_bots
SetEnvIfNoCase User-Agent "Baiduspider" bad_bots
SetEnvIfNoCase User-Agent "BSpider" bad_bots
SetEnvIfNoCase User-Agent "CFNetwork" bad_bots
SetEnvIfNoCase User-Agent "CyberPatrol" bad_bots
SetEnvIfNoCase User-Agent "DeuSu" bad_bots
SetEnvIfNoCase User-Agent "DotBot" bad_bots
SetEnvIfNoCase User-Agent "EmailCollector" bad_bots
SetEnvIfNoCase User-Agent "Exabot" bad_bots
SetEnvIfNoCase User-Agent "FeedlyBot" bad_bots
SetEnvIfNoCase User-Agent "Genieo" bad_bots
SetEnvIfNoCase User-Agent "Gluten\ Free\ Crawler" bad_bots
SetEnvIfNoCase User-Agent "GrapeshotCrawler" bad_bots
SetEnvIfNoCase User-Agent "MaxPointCrawler" bad_bots
SetEnvIfNoCase User-Agent "meanpathbot" bad_bots
SetEnvIfNoCase User-Agent "MJ12bot" bad_bots
SetEnvIfNoCase User-Agent "PagesInventory" bad_bots
SetEnvIfNoCase User-Agent "PHP" bad_bots
SetEnvIfNoCase User-Agent "Plukkie" bad_bots
SetEnvIfNoCase User-Agent "Qwantify" bad_bots
SetEnvIfNoCase User-Agent "SemrushBot" bad_bots
SetEnvIfNoCase User-Agent "SentiBot" bad_bots
SetEnvIfNoCase User-Agent "SEOkicks\-Robot" bad_bots
SetEnvIfNoCase User-Agent "SeznamBot" bad_bots
SetEnvIfNoCase User-Agent "spbot" bad_bots
SetEnvIfNoCase User-Agent "WeSEE\_Bot" bad_bots
SetEnvIfNoCase User-Agent "Wget" bad_bots
SetEnvIfNoCase User-Agent "worldwebheritage\.org" bad_bots
SetEnvIfNoCase User-Agent "Xenu\ Link\ Sleuth" bad_bots
SetEnvIfNoCase User-Agent "Yahoo!\ Slurp" bad_bots
SetEnvIfNoCase User-Agent "Zeus" bad_bots

<Limit GET POST HEAD>
Order Allow,Deny
Allow from all
Deny from env=bad_bots
</Limit>
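
For server-wide use, the same directives can in principle live in the main Apache-style configuration instead of a per-site .htaccess. A minimal sketch, assuming LiteSpeed is reading an Apache-format httpd.conf (or an include file) and that the sites live under /home, which is only a placeholder path. It has not been verified against a specific LiteSpeed version.

```apache
# Sketch only; assumes an Apache-format config that LiteSpeed reads.
# Several agents folded into one pattern (a subset of the list above).
SetEnvIfNoCase User-Agent "(AhrefsBot|MJ12bot|SemrushBot)" bad_bots

# "/home" is a placeholder for wherever the document roots live.
<Directory "/home">
    <Limit GET POST HEAD>
        Order Allow,Deny
        Allow from all
        Deny from env=bad_bots
    </Limit>
</Directory>
```

Folding agents into one alternation keeps the list shorter, but one directive per agent, as in the original list, behaves the same way.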
 

edigest

Active Member
#3
I have a quick question related to this -- I hope you don't mind me piggybacking on this thread rather than opening a new one.

I do something similar to what Masood is doing in htaccess, except that I add some of the bots to

/var/cpanel/templates/apache2_4/vhost.local

It seems that there is a limit to the number of references I can add to the vhost config. Is this due to Max Request URL Length or Max Request Header Size? If so, what are the potential downsides to increasing the header size?
 

edigest

Active Member
#5
I have up to 17 statements like this in the <VirtualHost> container of the cPanel vhost template (/var/cpanel/templates/apache2_4/vhost.local):

RewriteCond %{HTTP_USER_AGENT} "Parser::Template" [NC,OR]

I put UAs in there that don't seem to get blocked effectively by mod_sec, including MJ12 and others. I follow the condition statements with

RewriteRule .* - [E=blockbot:1]

This works very well but, by trial and error, I've found websites begin generating random 50x errors if there are more than about 17 statements. I was just wondering if there was a limit to the size of the vhost configuration in memory or some similar issue.
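
One way to keep the number of statements down is to fold several user agents into a single alternation, so the count of RewriteCond lines no longer grows with the list. A minimal sketch, assuming the same mod_rewrite syntax as the rules above; MJ12bot and AhrefsBot are example entries borrowed from the list in the first post. Whether this has any bearing on the 50x errors is not established in this thread.

```apache
# Sketch only: one condition replaces several [NC,OR] lines.
# Parser::Template is from the post above; MJ12bot and AhrefsBot are
# example entries borrowed from the list in post #1.
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} "(Parser::Template|MJ12bot|AhrefsBot)" [NC]
# Same rule as above: it only sets the variable, so blocking on
# blockbot still has to happen elsewhere in the configuration.
RewriteRule .* - [E=blockbot:1]
```

If the conditions stay on separate lines, the last RewriteCond before the RewriteRule should carry [NC] without OR, since OR chains a condition to the one that follows it.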
 

mistwang

LiteSpeed Staff
#6
The 50x errors may not be related to the rewrite rules. If it is a 503, please follow the PHP 503 troubleshooting guide in our wiki.
 