Differences

This shows you the differences between two versions of the page.

litespeed_wiki:cache:lscwp:configuration:enabling_the_crawler [2019/08/05 15:07]
Jackson Zhang
litespeed_wiki:cache:lscwp:configuration:enabling_the_crawler [2020/11/14 15:16] (current)
Lisa Clarke Redirect to new Documentation Site
Line 1: Line 1:
-====== Enabling and Limiting the Crawler ====== +~~REDIRECT>​https://​docs.litespeedtech.com/lscache/lscwp/admin/~~
- +
-These instructions apply to the WordPress LSCache crawler and other CMS LSCache crawlers where available. +
- +
-Because the crawler can consume considerable resources, the on/off switch is in the hands of server administrators. In a control panel environment, such as cPanel, the crawler is disabled by default and can only be enabled by an admin through Apache configuration. In an LSWS native environment, the crawler is enabled by default and can be disabled at the server level or virtual host level, starting with the LSWS 5.3.5 release. +
- +
-**NOTE: Turning on the crawler is not recommended for shared hosting setups unless the server has enough capacity to handle it!** +
- +
-===== On a shared hosting/control panel environment ===== +
-==== Enabling the Crawler on a shared hosting/control panel environment ==== +
-As of LSWS v5.1.16*, there are a few different approaches you can take to crawling on your server: +
-  * You can disable it for the entire server +
-  * You can enable it for the entire server +
-  * You can selectively enable it for particular clients, while leaving it disabled for everyone else +
-  +
-To enable the crawler in either of the latter two scenarios, you need to add this “Crawler Snippet” to the appropriate configuration or include file: +
- +
-<code> +
-<IfModule LiteSpeed> +
-  CacheEngine on crawler +
-</IfModule> +
-</​code>​ +
- +
-The exact location of the relevant configuration or include file varies, depending on the control panel you use (or if you use no control panel at all), and which of the above options you are looking to enact. See below for instructions relevant to your setup. +
- +
-After you've added the Crawler Snippet in the appropriate location, you should gracefully restart the server. +
- +
-*If you are on v5.1.16 and having difficulty getting this to work, please force reinstall to the latest build. +
- +
-==== Limiting the Crawler ==== +
-Currently, the following variables are available for use with [[litespeed_wiki:​cache:​lscwp:​configuration:​crawler|the Crawler function]]:​ +
-  * ''​CRAWLER_USLEEP''​ puts a minimum allowed value on the **Delay** field. +
-  * ''​CRAWLER_LOAD_LIMIT''​ sets a default for the **Server Load Limit** field. +
-  * ''​CRAWLER_LOAD_LIMIT_ENFORCE''​ sets a maximum allowed value on the **Server Load Limit** field. +
- +
-To use these variables, add them one-per-line to the appropriate configuration file. For example: +
-<​code>​ +
-<​IfModule LiteSpeed>​ +
-CacheEngine on crawler +
-SetEnv CRAWLER_USLEEP 1000 +
-SetEnv CRAWLER_LOAD_LIMIT 5.2 +
-</​IfModule>​ +
-</​code>​ +
- +
-==== cPanel/WHM ==== +
- +
-=== Server level === +
- +
-Change your working directory to: +
-''/​usr/​local/​apache/​conf/​includes/''​ for EA3 or  +
-''/​etc/​apache2/​conf.d/​includes/''​ for EA4. +
-  +
-Add the Crawler Snippet and optional server variables to the ''​pre_main_global.conf''​ file. +
- +
-=== Global virtual host level === +
- +
-Change your working directory to: +
-''/usr/local/apache/conf/userdata/'' for EA3 or +
-''/etc/apache2/conf.d/userdata/'' for EA4. +
-  +
-If these directories do not exist, create them.  +
-  +
-Add the Crawler Snippet and optional server variables to the ''​lscache_vhosts.conf''​ file. +
-  +
-Apply these changes to all Virtual Hosts by running the following command: +
- +
-  /​scripts/​ensure_vhost_includes --all-users +
-  +
-//Note: You only need to run this command once and it will activate for all users, including new users created by WHM later. There is no need to edit the cPanel skeleton file.// +
- +
-=== Individual virtual host level === +
- +
-Change your working directory to: +
-  - For EA3: ''/​usr/​local/​apache/​conf/​userdata/​std/​2_4/<​user>/<​domain>/''​ +
-  - For EA4: ''/​etc/​apache2/​conf.d/​userdata/​std/​2_4/<​user>/<​domain>/''​ +
-If your site supports HTTPS (SSL), also change your working directory to: +
-  - For EA3: ''​/usr/local/​apache/​conf/​userdata/​ssl/​2_4/<​user>/<​domain>/''​ +
-  - For EA4: ''/​etc/​apache2/​conf.d/​userdata/​ssl/​2_4/<​user>/<​domain>/''​ +
-* The ''2_4'' in the example paths above may differ depending on your Apache version, e.g. ''2'' or ''2_2''. +
-  +
-If these directories do not exist, create them.  +
-  +
-Add the Crawler Snippet and optional server variables to the ''​lscache_vhosts.conf''​ file. This will enable the crawler for this Virtual Host only. +
-  +
-Apply these changes by running the following command: +
-  +
-  /​scripts/​ensure_vhost_includes --user=$user +
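The file-creation step above can be sketched as a small shell helper. This is a minimal illustration, not a LiteSpeed or cPanel tool: the function name `write_crawler_snippet` is hypothetical, and it assumes the target directory is passed in as an argument (for example, one of the EA3/EA4 paths listed above).

```shell
#!/bin/sh
# Hypothetical helper (not shipped with LiteSpeed or cPanel).
# Creates the directory given as $1 if it is missing, then appends the
# Crawler Snippet to lscache_vhosts.conf inside it, unless the file
# already contains the "CacheEngine on crawler" directive.
write_crawler_snippet() {
    conf_dir="$1"
    conf_file="$conf_dir/lscache_vhosts.conf"

    mkdir -p "$conf_dir" || return 1

    # Skip the append when the directive is already present, so that
    # running the helper twice does not duplicate the block.
    grep -qF 'CacheEngine on crawler' "$conf_file" 2>/dev/null ||
        cat >> "$conf_file" <<'EOF'
<IfModule LiteSpeed>
CacheEngine on crawler
</IfModule>
EOF
}
```

After calling the helper with the appropriate directory, the `ensure_vhost_includes` command shown above would still need to be run for the affected user, followed by a graceful restart.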
- +
-==== Plesk ==== +
- +
-=== Server level === +
- +
-Change your working directory to: +
-  * ''/etc/httpd/conf.d/'' for CentOS +
-  * ''/etc/apache2/conf.d/'' for Debian +
-  * ''/etc/apache2/conf-enabled'' for Ubuntu +
-  +
-Add the Crawler Snippet and optional server variables to ''​lscache.conf''​. If it doesn’t exist, create it. +
- +
-=== Global virtual host level === +
- +
-Change your working directory to ''/usr/local/psa/admin/conf/templates/custom/domain''. Create it if it doesn’t exist. +
-Copy ''/usr/local/psa/admin/conf/templates/default/domain/domainVirtualHost.php'' to this location. +
-  +
-Edit the file and add the Crawler Snippet and optional server variables after the ''​mod_suexec.c''​ block. +
-  +
-Reconfigure all virtual hosts (this will regenerate new configuration files for all vhosts): +
-  +
-  /​usr/​local/​psa/​admin/​bin/​httpdmng --reconfigure-all +
- +
-=== Individual virtual host level === +
- +
-Change your working directory to ''/var/www/vhosts/system/<domain_name>/conf/''. +
-Create a file called ''vhost.conf'' if it does not already exist (or ''vhost_ssl.conf'' for HTTPS sites). +
-Add the Crawler Snippet and optional server variables to this file. +
-  +
-Reconfigure this Virtual Host (this will regenerate new configuration files for this vhost): +
-  +
-  /​usr/​local/​psa/​admin/​bin/​httpdmng --reconfigure-domain <​domain_name>​ +
- +
-==== DirectAdmin ==== +
- +
-=== Server level === +
- +
-Add the Crawler Snippet and optional server variables to the ''/​etc/​httpd/​conf/​extra/​httpd-includes.conf''​ file. +
- +
-=== Global virtual host level === +
- +
-Create a ''/​usr/​local/​directadmin/​data/​templates/​custom/​cust_httpd.CUSTOM.2.pre''​ file and add the Crawler Snippet and optional server variables to it. +
-  +
-Apply these changes to all Virtual Hosts by running the following commands: +
- +
-<​code>​  +
-cd /​usr/​local/​directadmin/​custombuild +
-./build rewrite_confs +
-</​code>​ +
- +
-==== ''​CacheEngine -crawler''​ ==== +
-Starting with LSWS 5.3.5, you can ensure that the crawler is disabled for an Apache virtual host, in any of the setups above, by adding ''CacheEngine -crawler'' to the Apache virtual host configuration: +
-   +
-  <​IfModule LiteSpeed>​ +
-  CacheEngine -crawler +
-  </​IfModule>​ +
- +
-Note: ''CacheEngine -crawler'' is supported in LSWS v5.3.5 and later. +
- +
- +
-===== In a LiteSpeed Native Environment ===== +
-The cache crawler is enabled by default in an LSWS native environment. +
- +
-To disable it at the server level, you need LSWS 5.4 or later, which adds a **Cache Features** setting to control this. +
- +
-In the LSWS WebAdmin interface, navigate to **LSWS Admin > Configuration > Server > Cache**. In **Cache Features**, check ''​On'',​ uncheck ''​Crawler'',​ check ''​ESI'',​ and uncheck ''​Not Set''​. +
- +
-If ''​Not Set''​ is checked, the other three values will be ignored and the default values will be used. (By default, all three are checked.) +
- +
-{{:​litespeed_wiki:​cache:​lscwp:​configuration:​disable-crawler-lsws-native-1.png?​600|}} +
- +
-To disable the cache crawler at the LSWS native virtual host level, go to **LSWS Admin > Configuration > Virtual Host > VH Name > Cache**, and set **Cache Features** in the same manner as above. If ''Not Set'' is checked, the other three values will be ignored and the server-level configuration will be inherited. +
- +
-Please note: Do not set **Enable LiteMage** to ''​On'',​ as this setting will also enable the crawler, even if ''​Crawler''​ is unchecked.  +
- +
-{{:​litespeed_wiki:​cache:​lscwp:​configuration:​disable-crawler-lsws-native-vh-1.png?​600|}} +
- +
-To add any of the optional server variables, navigate to **Server > External App** and add the variable(s) to the **Environment** setting, one per line. For example: +
-<​code>​ +
-CRAWLER_USLEEP=1000 +
-CRAWLER_LOAD_LIMIT=5.2 +
-</​code>​ +
- +
-{{:​litespeed_wiki:​cache:​lscwp:​configuration:​lscwp-admin-crawler.png?​600|}} +
- +
-===== Testing ===== +
-The LiteSpeed Web Server cache engine sets the ''X-LSCACHE'' environment variable. You can check the environment variables on a phpinfo page to see whether the crawler is on. If ''crawler'' does not appear in the value, the crawler has been disabled successfully. LSWS can only disable the LiteSpeed cache plugin crawler or other LiteSpeed crawlers, since these check the ''X-LSCACHE'' environment variable. LSWS cannot stop third-party crawlers from working, since they do not check ''X-LSCACHE''. +
- +
-  $_SERVER['​X-LSCACHE'​] on,​esi +
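The check described above can also be scripted. The sketch below is a hypothetical helper, not part of LSWS or the plugin. It assumes the ''X-LSCACHE'' value is a comma-separated feature list, and that an enabled crawler appears as a ''crawler'' token (for example ''on,crawler,esi''); only the disabled form ''on,esi'' is shown on this page, so the enabled form is an assumption.

```shell
#!/bin/sh
# Hypothetical helper: returns exit status 0 when the comma-separated
# X-LSCACHE value passed as $1 contains the token "crawler", 1 otherwise.
# Assumption: an enabled crawler is listed as its own "crawler" token.
crawler_enabled() {
    # Wrap the value in commas so the first and last tokens match too.
    case ",$1," in
        *,crawler,*) return 0 ;;
        *) return 1 ;;
    esac
}
```

For instance, `crawler_enabled "on,esi"` would return a non-zero status, matching the disabled state shown above.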
- +
-{{:​litespeed_wiki:​cache:​lscwp:​configuration:​disable-crawler-lsws-native-phpinfo-1.png?​600|}} ​  +
- +
-In the LiteSpeed Cache for WordPress plugin, under **Settings > Crawler**, **Crawler Cron** should be set to ''Disable'', and the following warning should be displayed: +
- +
-  Warning: The crawler feature is not enabled on the LiteSpeed server. Please consult your server admin. +
- +
-{{:​litespeed_wiki:​cache:​lscwp:​configuration:​disable-crawler-lsws-native-wp-plugin-cron-status-1.png?​800|}} +
  • Last modified: 2019/08/05 15:07
  • by Jackson Zhang