Differences

This shows you the differences between two versions of the page.

litespeed_wiki:cache:litemage2:crawler [2018/07/27 15:37]
Eric Leu created
litespeed_wiki:cache:litemage2:crawler [2019/10/23 16:02]
Eric Leu [More Options]
Line 8: Line 8:
   - SiteMap: Prepare your site's sitemap, e.g. ''<​nowiki>​http://​magento2.com/​sitemap.xml</​nowiki>''​   - SiteMap: Prepare your site's sitemap, e.g. ''<​nowiki>​http://​magento2.com/​sitemap.xml</​nowiki>''​
  
-===== How to Use Crawl script===== +===== How to Use the Crawler Script===== 
-[[ | Download from here]] +  -[[https://​www.litespeedtech.com/​packages/​litemage2.0/​M2-crawler.sh ​| Download from here]] 
- +  - Change the permissions so that the file is executable: ''chmod +x M2-crawler.sh''
-Change the permissions so that the file is executable: +  - Run the script: ​''​bash M2-crawler.sh SITE-MAP-URL''​
-''​chmod +x cachecrawler.sh''​ +
- +
-==== Crawl Desktop&​mobile share same theme==== +
-''​sh M2-crawler.sh SITE-MAP-URL''​+
  
 ==== More Options==== ==== More Options====
-  * To get help: ''​sh M2-crawler.sh -h''​ +  ​* ''​-h,​ --help'' ​         Show this message and exit 
-  * To change default interval request from 0.1s to custom NUM value: ''​sh M2-crawler.sh SITE-MAP-URL -i NUM''​+  * ''​-m,​ --with-mobile'' ​  Crawl mobile view in addition to default view 
 +  * ''​-c,​ --with-cookie'' ​  Crawl with site's cookies 
 +  * ''-b, --black-list''    Add a page to the blacklist if it returns an HTTP error status and is not cached. The next run will bypass that page
 +  * ''​-g,​ --general-ua'' ​   Use general user-agent instead of lscache_runner for desktop view 
 +  * ''​-i,​ --interval'' ​     Change request interval. "-i 0.2" changes from default 0.1s to 0.2s 
 +  * ''​-v,​ --verbose'' ​      Show complete response header under /​tmp/​crawler.log 
 +  * ''​-d,​ --debug-url'' ​    Test one URL directly. "sh M2-crawler.sh -v -d http://​example.com/​test.html"​ 
 +  * ''-qs,--crawl-qs''      Crawl sitemap, including URLs with query strings
 +  * ''​-r,​ --report'' ​       Display total count of crawl result 
 +Example commands:
 +  ​* To get help: ''​bash M2-crawler.sh -h''​ 
 +  * To change default interval request from 0.1s to custom NUM value: ''​bash M2-crawler.sh SITE-MAP-URL -i NUM''​ 
 +  * To crawl with cookie set: ''​bash M2-crawler.sh -c SITE-MAP-URL''​ 
 +  * To store log in ''/​tmp/​M2-crawler.log'':​ ''​bash M2-crawler.sh -v SITE-MAP-URL''​ 
 +  * To debug one URL and output on screen: ''​bash M2-crawler.sh -d SITE-URL''​ 
 +  * To display total count of crawl result: ''​bash M2-crawler.sh -r SITE-MAP-URL''​ 
 +  * Multiple parameters can be used at the same time
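
As a rough illustration of combining options, the sketch below only prints a crawl command so it can be reviewed before it is run. The helper name ''crawl_cmd'' is hypothetical and not part of the crawler script. It assumes ''M2-crawler.sh'' is in the current directory and uses only the ''-m'', ''-i'' and ''-r'' options listed above.

```shell
# Hypothetical helper (not part of M2-crawler.sh): print a crawl command
# that combines several of the documented options.
#   $1 = sitemap URL
#   $2 = request interval in seconds (optional, defaults to 0.1)
crawl_cmd() {
  # -m: also crawl the mobile view, -i: request interval, -r: print totals
  echo "bash M2-crawler.sh $1 -m -i ${2:-0.1} -r"
}
```

For example, ''crawl_cmd http://magento2.com/sitemap.xml 0.2'' prints a command line that can be pasted into a terminal once it looks right.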
  
 ===== How to Generate a Sitemap===== ===== How to Generate a Sitemap=====
-The Sitemap module is build-in for generating a sitemap in Magento 2, and it's fast. +Magento 2 has a built-in module for generating a sitemap, and it's fast.
  
 ==== Enable sitemap ==== ==== Enable sitemap ====
-Navigate to Magento ​admin page -> Stores ​-> Settings ​-> Configuration ​-> Catalog ​-> XML Sitemap ​\\ +Navigate to **Magento ​Admin > Stores > Settings > Configuration > Catalog > XML Sitemap** 
-{{:​litespeed_wiki:​cache:​litemage2:​m2-4.png?​600|}} ​\\+{{:​litespeed_wiki:​cache:​litemage2:​m2-4.png?​600|}}
  
-Set Generation Settings Enabled to ''​Yes'' ​\\+Set **Generation Settings ​Enabled** to ''​Yes''​
 {{:​litespeed_wiki:​cache:​litemage2:​m2-5.png?​600|}} {{:​litespeed_wiki:​cache:​litemage2:​m2-5.png?​600|}}
  
-==== Configuring a single sitemap ​for all storefronts ​==== +==== Configuring a Single Sitemap ​for All Storefronts ​==== 
-Navigate to Magento admin page -> Marketing -> Seo & Search -> Sitemap +Navigate to **Magento Admin > Marketing > SEO & Search > Sitemap**
-  - Click **Add Sitemap** button +  - Click the **Add Sitemap** button 
-  - Enter value +  - Enter values 
-    * Filename: ''​sitemap.xml''​ +    ​* **Filename**: ''​sitemap.xml''​ 
-    * Path: ''/''​ +    ​* **Path**: ''/''​ 
-  - Click **Save & Generate** button+  - Click the **Save & Generate** button
  
 {{:​litespeed_wiki:​cache:​litemage2:​m2-2.png?​600|}} \\ {{:​litespeed_wiki:​cache:​litemage2:​m2-2.png?​600|}} \\
-If all went well, a sitemap.xml file will generated in your magento 2 document root. +If all went well, a ''sitemap.xml'' file will have been generated in your Magento 2 document root.
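
One quick way to sanity-check the generated file is to count its ''<loc>'' entries. The sketch below is an assumption-laden example, not part of LiteMage: the helper name ''count_sitemap_urls'' is hypothetical, and it assumes a plain (non-index) sitemap read from standard input.

```shell
# Hypothetical helper: count the <loc> entries in a sitemap read from stdin.
count_sitemap_urls() {
  # grep -o prints each match on its own line, so wc -l counts the URLs;
  # tr strips the padding some wc implementations add.
  grep -o '<loc>' | wc -l | tr -d ' '
}
```

For example, ''curl -s http://magento2.com/sitemap.xml | count_sitemap_urls'' would print the number of URLs the crawler has to visit, using the example sitemap address from earlier on this page.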
 + 
 +===== Crawl Interval ===== 
 +How often do you want to re-initiate the crawling process? This depends on how long it takes to crawl your site and what you set for Public Cache TTL. 
 + 
 +The default TTL is one day (24 hours). You might, for example, run the script from a cron job every 12 hours instead.
 + 
 +E.g., this will run twice a day, at 03:30 and 15:30: ''30 3,15 * * * path_to_script/M2-crawler.sh SITE-MAP-URL -m -i 0.2''
 + 
 +Note: You can also use an [[https://crontab.guru/|online crontab tool]] to help verify the time settings.
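
To get a feel for how long one crawl takes, a lower bound can be estimated from the number of URLs and the request interval. This is only a sketch: the helper name ''estimate_crawl_seconds'' is hypothetical, and it assumes requests are sent one after another and ignores server response time, so a real crawl will take longer.

```shell
# Hypothetical helper: lower bound on crawl duration in seconds.
#   $1 = number of URLs in the sitemap
#   $2 = request interval in seconds (the value passed to -i)
estimate_crawl_seconds() {
  # awk handles the fractional interval; the result is rounded to whole seconds
  awk -v n="$1" -v i="$2" 'BEGIN { printf "%.0f\n", n * i }'
}
```

If the estimate comes close to the time between scheduled runs, a longer schedule or a shorter interval may be worth considering.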
  
-===== How to Verify ===== +===== How to Verify ​the Crawler is Working ​===== 
-By using [[https://​developers.google.com/​web/​tools/​chrome-devtools/​ | the browser developer tool]], ​you should see ''​X-LiteSpeed-Cache:​ hit,​litemage'' ​at the first view \\+When using [[https://​developers.google.com/​web/​tools/​chrome-devtools/​|the browser developer tool]], ​load a previously uncached page. You should see ''​X-LiteSpeed-Cache:​ hit,​litemage'' ​on the first view.
  
 {{:​litespeed_wiki:​cache:​litemage2:​m2-3.png?​600|}} {{:​litespeed_wiki:​cache:​litemage2:​m2-3.png?​600|}}
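
The same header can also be checked from the command line. The sketch below is a minimal example rather than an official tool: the helper name ''lscache_status'' is hypothetical, and it simply extracts the ''X-LiteSpeed-Cache'' value from response headers read on standard input.

```shell
# Hypothetical helper: print the X-LiteSpeed-Cache value found in
# HTTP response headers read from stdin (header name matched case-insensitively).
lscache_status() {
  awk -F': *' 'tolower($1) == "x-litespeed-cache" {
    sub(/\r$/, "", $2)   # drop the trailing carriage return from the header line
    print $2
  }'
}
```

For example, ''curl -s -o /dev/null -D - http://magento2.com/ | lscache_status'' dumps the response headers of a normal GET request and prints only the cache status. The URL here is the example domain used earlier on this page.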
  • Last modified: 2020/07/08 19:35
  • by Lisa Clarke