Differences

This shows you the differences between two versions of the page.

litespeed_wiki:cache:lscps:crawler [2018/07/20 23:00]
Eric Leu [More Options]
litespeed_wiki:cache:lscps:crawler [2019/10/23 17:41]
Eric Leu [More Options]
Line 3: Line 3:
 The crawler travels through your site, refreshing pages that have expired in the cache. This makes it less likely that your visitors will encounter un-cached pages.
  
-===== Before you Begin =====
+===== Before You Begin =====
   - Install and enable [[https://www.litespeedtech.com/support/wiki/doku.php/litespeed_wiki:cache:lscps | LiteSpeed Cache for Prestashop]]
-  - Crawler Engine \\ The crawler must be enabled at the server level, or it will not work and popup the warning message "Server crawler engine not enabled. Please check....".If you are using shared hosting server, please contact your hosting provider, or see [[litespeed_wiki:cache:lscwp:configuration:enabling_the_crawler|our instructions]].
-  - SiteMap \\ Prepare your site's sitemap, e.g. ''http://prestashop-123/456_sitemap.xml''
+  - Crawler Engine: The crawler must be enabled at the server level, or you will see the warning message ''Server crawler engine not enabled. Please check....''. If you are using a shared hosting server, please contact your hosting provider, or see [[litespeed_wiki:cache:lscwp:configuration:enabling_the_crawler|our instructions]].
+  - SiteMap: Prepare your site's sitemap, e.g. ''<nowiki>http://prestashop-123/456_sitemap.xml</nowiki>''
  
-===== How to use crawl script=====
-[[DownLoad from here]]
-==== Crawl desktop view====
-  * ''sh cachecrawler.sh SITE-MAP-URL     ''
-==== Crawl desktop&mobile view====
-  * ''sh cachecrawler.sh SITE-MAP-URL -m  ''
+===== How to Use the Crawler Script=====
+[[https://www.litespeedtech.com/packages/prestashop/cachecrawler.sh | Download from here]]
  
-By default, Prestashop cache plugin Mobile view is DISABLED.
-To enable mobile view is simple. Access to PrestaShop web admin -> LiteSpeed Cache -> Configuration -> Separate Mobile View set to ''Yes'' \\
+Change the permissions so that the file is executable: ''chmod +x cachecrawler.sh''
+
 +Crawl when desktop & mobile share the same theme: ''​bash cachecrawler.sh SITE-MAP-URL''​ 
 + 
 +Crawl when desktop & mobile have different themes: ''​bash cachecrawler.sh SITE-MAP-URL -m ''​ 
 + 
+By default, Mobile View is DISABLED in the PrestaShop cache plugin. To enable mobile view, navigate to **PrestaShop Admin -> LiteSpeed Cache -> Configuration** and set **Separate Mobile View** to ''Yes''.
 {{:​litespeed_wiki:​cache:​lscps:​prestashop-8.png?​800|}} {{:​litespeed_wiki:​cache:​lscps:​prestashop-8.png?​800|}}
  
 ==== More Options==== ==== More Options====
-  * ''sh cachecrawler.sh -h'', For Helping purpose \\
-  * ''sh cachecrawler.sh SITE-MAP-URL -i NUM'', will change default interval request from 0.1s to custom NUM value
-===== How to generate sitemap=====
-Google sitemap module is quite popular for Prestashop SiteMap generate and much faster than online generation
+  * ''-h, --help'': Show the help message and exit.
+  * ''-m, --with-mobile'': Crawl the mobile view in addition to the default view.
+  * ''-c, --with-cookie'': Crawl with the site's cookies.
+  * ''-b, --black-list'': Add a page to the blacklist if it returns an HTTP status error and is not cached. The next run will bypass that page.
+  * ''-g, --general-ua'': Use a general user-agent instead of lscache_runner for the desktop view.
+  * ''-i, --interval'': Change the request interval. ''-i 0.2'' changes it from the default 0.1 second to 0.2 seconds.
+  * ''-v, --verbose'': Write the complete response headers to ''/tmp/crawler.log''.
+  * ''-d, --debug-url'': Test one URL directly, as in ''bash cachecrawler.sh -v -d http://example.com/test.html''.
+  * ''-qs, --crawl-qs'': Crawl the sitemap, including URLs with query strings.
+  * ''-r, --report'': Display the total count of crawl results.
  
-==== Google sitemap====
-For v1.6, Google sitemap Module was installed by default.
+Example commands:
+  * To get help: ''bash cachecrawler.sh -h''
+  * To change the request interval from the default 0.1s to a custom NUM value: ''bash cachecrawler.sh SITE-MAP-URL -i NUM''
+  * To crawl with cookies set: ''bash cachecrawler.sh -c SITE-MAP-URL''
+  * To store the log in ''/tmp/crawler.log'': ''bash cachecrawler.sh -v SITE-MAP-URL''
+  * To debug one URL and print the output on screen: ''bash cachecrawler.sh -d SITE-URL''
+  * To display the total count of crawl results: ''bash cachecrawler.sh -r SITE-MAP-URL''
  
-For v1.7+, Google sitemap Module need to install from source first.
+NOTE: Using multiple parameters at the same time is allowed.
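
Because options can be combined, one call can crawl both views, slow the request rate, and print a report. The wrapper below is only a sketch: ''run_crawl'' and ''DRY_RUN'' are hypothetical names that are not part of cachecrawler.sh, and the script is assumed to sit in the current directory. It uses only the ''-m'', ''-i'', and ''-r'' options listed above.

```shell
# Hypothetical wrapper (not part of cachecrawler.sh): combine -m, -i and -r.
# With DRY_RUN=1 the command line is printed instead of executed.
run_crawl() {
  sitemap_url=$1
  # Assumption: cachecrawler.sh is in the current directory.
  set -- bash cachecrawler.sh "$sitemap_url" -m -i 0.2 -r
  if [ "${DRY_RUN:-0}" = 1 ]; then
    echo "$*"
  else
    "$@"
  fi
}
```

Running ''DRY_RUN=1 run_crawl SITE-MAP-URL'' first shows the exact command before any requests are sent.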
-Download [[https://github.com/PrestaShop/gsitemap/archive/master.zip | gsitmap]]; then change file name to gsitemap.zip.
+===== How to Generate a Sitemap=====
+The Google Sitemap module is quite popular for generating a sitemap in PrestaShop, and it is much faster than online generation.
  
-Click **Configure** button, then click xxx.sitemap.xml(This is your SITE-MAP-URL). \\
-{{:litespeed_wiki:cache:lscps:prestashop-9.png?600|}}
+==== Google Sitemap Module====
+For v1.6, the Google Sitemap Module is installed by default.
  
-==== SiteMap Online Generator ====
-One of the popular sitemap Generator is [[https://www.xml-sitemaps.com/ | XML-Sitemaps.com]]
+For v1.7+, the Google Sitemap Module needs to be installed from source first.
+Download [[https://github.com/PrestaShop/gsitemap/archive/master.zip | gsitemap]]; then change the file name to ''gsitemap.zip''.
-After crawl finished. Click **DOWNLOAD YOUR XML SITEMAP FILE** and put it where crawler script accessible. \\
-{{:litespeed_wiki:cache:lscps:prestashop-6.png?600|}}
-
-===== How to verify=====
-By using [[https://developers.google.com/web/tools/chrome-devtools/ | browser developer tool]], you should see a ''X-LiteSpeed-Cache: hit'' at first view for both desktop and Mobile
-  * Desktop view \\ {{:​litespeed_wiki:​cache:​lscps:​prestashop-4.png?​800|}} +
-  * Mobile view \\ {{:​litespeed_wiki:​cache:​lscps:​prestashop-7.png?​800|}}+
  
+Click the **Configure** button. You will see a URL such as ''xxx/1_index_sitemap.xml''. This is your main SITE-MAP-URL.
 +{{:​litespeed_wiki:​cache:​lscps:​ps-10.png?​600|}}
  
 +==== SiteMap Online Generator ====
+One of the popular sitemap generators is [[https://www.xml-sitemaps.com/ | XML-Sitemaps.com]].
+After the crawl is finished, click **DOWNLOAD YOUR XML SITEMAP FILE** and put the file where the crawler script can access it.
  
 +{{:​litespeed_wiki:​cache:​lscps:​prestashop-6.png?​600|}}
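
Before crawling, it can help to confirm that the sitemap actually lists URLs. The helper below is a minimal sketch, assuming ''grep -o'' is available (GNU and BSD grep both provide it). ''sitemap_locs'' is a hypothetical name. It uses plain text matching rather than an XML parser, so URLs wrapped in CDATA are not handled.

```shell
# Hypothetical helper: print every <loc> URL found in sitemap XML read from stdin.
# Plain text matching, not a real XML parser.
sitemap_locs() {
  grep -o '<loc>[^<]*</loc>' | sed -e 's/<loc>//' -e 's#</loc>##'
}
```

For example, ''curl -s SITE-MAP-URL | sitemap_locs | wc -l'' counts the entries. If the file is a sitemap index, the entries are child sitemaps rather than pages.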
  
 +===== Crawl Interval =====
+How often should the crawl be re-initiated? This depends on how long it takes to crawl your site and on the value you set for Public Cache TTL. \\
+The default TTL is one day (24 hours), so consider running the script from a cron job every 12 hours. \\
+E.g., this entry runs twice a day, at 3:30 and 15:30: ''30 3,15 * * * path_to_script/cachecrawler.sh SITE-MAP-URL -m -i 0.2''
  
+Note: You can also use an [[https://crontab.guru/|online crontab tool]] to help verify the time settings.
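
To judge whether a crawl fits between two cron runs, a rough lower bound is the number of URLs multiplied by the request interval. The sketch below assumes the ''-i'' value is the pause between consecutive requests and ignores server response time, so real crawls will take longer. ''estimate_crawl_seconds'' is a hypothetical helper, not part of the crawler script.

```shell
# Hypothetical helper: lower-bound crawl time in whole seconds.
# $1 = number of URLs to crawl, $2 = request interval in seconds (the -i value).
estimate_crawl_seconds() {
  awk -v n="$1" -v i="$2" 'BEGIN { printf "%.0f\n", n * i }'
}
```

For instance, ''estimate_crawl_seconds 1200 0.2'' prints ''240'', so 1200 URLs at a 0.2 second interval need at least four minutes per view.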
  
 +===== How to Verify =====
+Using [[https://developers.google.com/web/tools/chrome-devtools/ | the browser developer tools]], you should see ''X-LiteSpeed-Cache: hit'' on the first view for both desktop and mobile.
 +  * Desktop view \\ {{:​litespeed_wiki:​cache:​lscps:​prestashop-4.png?​800|}}
 +  * Mobile view \\ {{:​litespeed_wiki:​cache:​lscps:​prestashop-7.png?​800|}}
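
The same check can be done from a terminal. The sketch below assumes ''curl'' is installed. ''cache_status'' is a hypothetical helper that reads raw response headers on standard input and prints the value of ''X-LiteSpeed-Cache'', or ''none'' when the header is absent.

```shell
# Hypothetical helper: extract the X-LiteSpeed-Cache value from response headers on stdin.
# Header names are compared case-insensitively; trailing carriage returns are removed.
cache_status() {
  awk -F': *' '
    tolower($1) == "x-litespeed-cache" { gsub(/\r/, "", $2); print $2; found = 1; exit }
    END { if (!found) print "none" }
  '
}
```

A possible call is ''curl -s -o /dev/null -D - http://example.com/ | cache_status''. To check the mobile view, a mobile user-agent string of your choice can be passed with curl's ''-A'' option.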
  
  • Last modified: 2020/08/11 19:17
  • by Lisa Clarke