Differences

This shows you the differences between two versions of the page.

litespeed_wiki:cache:litemage2:crawler [2020/06/16 17:05]
Joshua Reynolds [Run crawler after any product update on Magento 2]
litespeed_wiki:cache:litemage2:crawler [2020/07/08 19:35]
Lisa Clarke Redirect to new Documentation Site
Line 1: Line 1:
-====== LiteSpeed Cache for Magento2: Crawler ====== +~~REDIRECT>​https://​docs.litespeedtech.com/​lscache/litemage/settings/~~
- +
-The crawler travels through your site, refreshing pages that have expired in the cache. This makes it less likely that your visitors will encounter uncached pages. +
- +
-===== Before You Begin ===== +
-  - Install and enable [[https://​www.litespeedtech.com/​support/​wiki/​doku.php/​litespeed_wiki:​cache:​litemage2:​installation | LiteMage Cache for Magento2]] +
-  - Crawler Engine: The crawler must be enabled at the server level, or you will see the warning message ''​Server crawler engine not enabled. Please check....''​. If you are using a shared hosting server, please contact your hosting provider, or see [[litespeed_wiki:​cache:​lscwp:​configuration:​enabling_the_crawler|our instructions]]. +
-  - SiteMap: Prepare your site's sitemap, e.g. ''<​nowiki>​http://​magento2.com/​sitemap.xml</​nowiki>''​ +
- +
-===== How to Use the Crawler Script ===== +
-  - [[https://www.litespeedtech.com/packages/litemage2.0/M2-crawler.sh | Download the script]] +
-  - Change the permissions so that the file is executable: ''chmod +x M2-crawler.sh'' +
-  - Run the script: ''bash M2-crawler.sh SITE-MAP-URL'' +
- +
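The download, permission, and run steps above could be scripted roughly as follows. This is only a sketch: ''run_step'', ''fetch_and_crawl'', and the ''DRY_RUN'' variable are hypothetical helpers introduced here for illustration and are not part of the crawler script. It also assumes ''curl'' is installed.

```shell
#!/usr/bin/env bash
# Sketch of the three steps: download, make executable, run.
# run_step, fetch_and_crawl and DRY_RUN are hypothetical names.

CRAWLER_URL="https://www.litespeedtech.com/packages/litemage2.0/M2-crawler.sh"

# Print the command when DRY_RUN=1, otherwise execute it.
run_step() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# $1: sitemap URL, e.g. http://magento2.com/sitemap.xml
fetch_and_crawl() {
  local sitemap_url="$1"
  # -f: fail on HTTP errors, -L: follow redirects, -O: keep the remote file name
  run_step curl -fsSLO "$CRAWLER_URL" &&
    run_step chmod +x M2-crawler.sh &&
    run_step bash M2-crawler.sh "$sitemap_url"
}
```

Calling ''DRY_RUN=1 fetch_and_crawl SITE-MAP-URL'' prints the three commands without running them, which may be useful for checking the sitemap URL before a real crawl.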
-==== More Options==== +
-  * ''​-h,​ --help'':​ Show this message and exit. +
-  * ''​-m,​ --with-mobile'':​ Crawl mobile view in addition to default view. +
-  * ''​-c,​ --with-cookie'':​ Crawl with site's cookies. +
-  * ''-b, --black-list'': Add a page to the blacklist if it returns an HTTP error status and is not cached. The next run will skip blacklisted pages. +
-  * ''​-g,​ --general-ua'':​ Use general user-agent instead of lscache_runner for desktop view. +
-  * ''​-i,​ --interval'':​ Change request interval. ''​-i 0.2''​ changes from default 0.1 second to 0.2 seconds. +
-  * ''-v, --verbose'': Write the complete response headers to ''/tmp/crawler.log''. +
-  * ''-d, --debug-url'': Test one URL directly, as in ''bash M2-crawler.sh -v -d http://example.com/test.html''. +
-  * ''-qs, --crawl-qs'': Crawl the sitemap, including URLs with query strings. +
-  * ''​-r,​ --report'':​ Display total count of crawl result. +
- +
-Example commands:  +
-  * To get help: ''​bash M2-crawler.sh -h''​ +
-  * To change the request interval from the default 0.1s to a custom value NUM: ''bash M2-crawler.sh SITE-MAP-URL -i NUM'' +
-  * To crawl with cookie set: ''​bash M2-crawler.sh -c SITE-MAP-URL''​ +
-  * To store log in ''/​tmp/​crawler.log'':​ ''​bash M2-crawler.sh -v SITE-MAP-URL''​ +
-  * To debug one URL and output on screen: ''​bash M2-crawler.sh -d SITE-URL''​ +
-  * To display total count of crawl result: ''​bash M2-crawler.sh -r SITE-MAP-URL''​ +
- +
-NOTE: Multiple options can be combined in a single command. +
-===== How to Generate a Sitemap===== +
-Magento 2 has a built-in module for generating a sitemap, and it is fast. +
- +
-==== Enable sitemap ==== +
-Navigate to **Magento Admin > Stores > Settings > Configuration > Catalog > XML Sitemap** +
-{{:​litespeed_wiki:​cache:​litemage2:​m2-4.png?​600|}} +
- +
-Set **Generation Settings > Enabled** to ''​Yes''​ +
-{{:​litespeed_wiki:​cache:​litemage2:​m2-5.png?​600|}} +
- +
-==== Configuring a Single Sitemap for All Storefronts ==== +
-Navigate to **Magento Admin > Marketing > SEO & Search > Sitemap** +
-  - Click the **Add Sitemap** button +
-  - Enter values +
-    * **Filename**:​ ''​sitemap.xml''​ +
-    * **Path**: ''/''​ +
-  - Click the **Save & Generate** button +
- +
-{{:​litespeed_wiki:​cache:​litemage2:​m2-2.png?​600|}} \\ +
-If all went well, a ''​sitemap.xml''​ file will have been generated in your Magento 2 document root. +
- +
-===== Crawl Interval ===== +
-How often do you want to re-initiate the crawling process? This depends on how long it takes to crawl your site and what you set for Public Cache TTL. +
- +
-The default TTL is one day (24 hours). If you want pages refreshed before they expire, you could, for example, run the script from a cron job every 12 hours. +
- +
-E.g., this runs twice a day, at 3:30 and 15:30: ''30 3,15 * * * path_to_script/M2-crawler.sh SITE-MAP-URL -m -i 0.2'' +
- +
-Note: You can also use an [[https://crontab.guru/|online crontab tool]] to verify the schedule. +
- +
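If you prefer to add the schedule from the command line rather than an editor, something like the following may work. It is a sketch under assumptions: ''cron_line'' is a hypothetical helper, ''CRAWLER_PATH'' is a placeholder for wherever you saved the script, and the hour field uses the comma-separated list ''3,15'' to mean 3:30 and 15:30.

```shell
#!/usr/bin/env bash
# Placeholder location of the crawler script; adjust to your own path.
CRAWLER_PATH="/path/to/script/M2-crawler.sh"

# Print a crontab entry that crawls the given sitemap at 3:30 and 15:30,
# including the mobile view (-m) with a 0.2 second request interval (-i 0.2).
# cron_line is a hypothetical helper, not part of the crawler script.
cron_line() {
  printf '30 3,15 * * * %s %s -m -i 0.2\n' "$CRAWLER_PATH" "$1"
}
```

To append the entry to the current user's crontab, one common idiom is ''(crontab -l 2>/dev/null; cron_line SITE-MAP-URL) | crontab -''. Review the output of ''crontab -l'' afterwards to confirm that the entry appears only once.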
-===== Run crawler after any product update on Magento 2 ===== +
-In Magento 2, any product update purges all caches. This is by design in Magento 2, and LiteMage has no control over it. As a result, you may find uncached pages even if you run the crawler at an interval shorter than the TTL. This does not mean that LiteMage 2 or the crawler is malfunctioning; it is simply how Magento 2 is designed. +
- +
-To avoid this situation, we recommend scheduling a specific window of time for product changes in the Magento admin, for example, two off-peak hours from 6:00pm to 8:00pm. Run the crawler immediately after the changes, and the likelihood of users encountering uncached pages is kept to a minimum. +
-===== How to Verify the Crawler is Working ===== +
-Using [[https://developers.google.com/web/tools/chrome-devtools/|the browser developer tools]], load a page that was not cached before the crawler ran. You should see ''X-LiteSpeed-Cache: hit,litemage'' in the response headers on the first view. +
- +
-{{:​litespeed_wiki:​cache:​litemage2:​m2-3.png?​600|}}+
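The same check can be done from the command line. The sketch below assumes ''curl'' is available; ''is_cache_hit'' is a hypothetical helper that reads response headers on standard input and succeeds when the ''X-LiteSpeed-Cache'' header reports a hit.

```shell
#!/usr/bin/env bash
# Succeeds (exit status 0) if the headers on stdin contain an
# X-LiteSpeed-Cache header whose value includes "hit".
# is_cache_hit is a hypothetical helper, not part of LiteMage.
is_cache_hit() {
  grep -i '^x-litespeed-cache:' | grep -qi 'hit'
}
```

For example, ''curl -s -o /dev/null -D - http://example.com/test.html | is_cache_hit && echo cached'' sends a GET request, discards the body, and passes only the headers to the helper. The URL here is a placeholder.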
  • Last modified: 2020/07/08 19:35
  • by Lisa Clarke