Differences
This shows you the differences between two versions of the page.
litespeed_wiki:cache:litemage2:crawler [2019/10/23 16:02] Eric Leu [More Options] |
litespeed_wiki:cache:litemage2:crawler [2020/06/01 15:34] Jackson Zhang [Crawl Interval] |
||
---|---|---|---|
Line 14: | Line 14: | ||
==== More Options==== | ==== More Options==== | ||
- | * ''-h, --help'' Show this message and exit | + | * ''-h, --help'': Show this message and exit. |
- | * ''-m, --with-mobile'' Crawl mobile view in addition to default view | + | * ''-m, --with-mobile'': Crawl mobile view in addition to default view. |
- | * ''-c, --with-cookie'' Crawl with site's cookies | + | * ''-c, --with-cookie'': Crawl with site's cookies. |
- | * ''-b, --black-list'' Page will be added to black list if html status error and no cache. Next run will bypas page | + | * ''-b, --black-list'': A page is added to the blacklist if it returns an HTML status error and is not cached. The next run will bypass that page. |
- | * ''-g, --general-ua'' Use general user-agent instead of lscache_runner for desktop view | + | * ''-g, --general-ua'': Use general user-agent instead of lscache_runner for desktop view. |
- | * ''-i, --interval'' Change request interval. "-i 0.2" changes from default 0.1s to 0.2s | + | * ''-i, --interval'': Change request interval. ''-i 0.2'' changes from default 0.1 second to 0.2 seconds. |
- | * ''-v, --verbose'' Show complete response header under /tmp/crawler.log | + | * ''-v, --verbose'': Show complete response header under ''/tmp/crawler.log''. |
- | * ''-d, --debug-url'' Test one URL directly. "sh M2-crawler.sh -v -d http://example.com/test.html" | + | * ''-d, --debug-url'': Test one URL directly, as in ''sh M2-crawler.sh -v -d http://example.com/test.html''. |
- | * ''-qs,--crawl-qs'' Crawl sitemap, including URLS with query strings | + | * ''-qs, --crawl-qs'': Crawl sitemap, including URLs with query strings. |
- | * ''-r, --report'' Display total count of crawl result | + | * ''-r, --report'': Display total count of crawl result. |
- | Example command: | + | |
+ | Example commands: | ||
* To get help: ''bash M2-crawler.sh -h'' | * To get help: ''bash M2-crawler.sh -h'' | ||
* To change default interval request from 0.1s to custom NUM value: ''bash M2-crawler.sh SITE-MAP-URL -i NUM'' | * To change default interval request from 0.1s to custom NUM value: ''bash M2-crawler.sh SITE-MAP-URL -i NUM'' | ||
* To crawl with cookie set: ''bash M2-crawler.sh -c SITE-MAP-URL'' | * To crawl with cookie set: ''bash M2-crawler.sh -c SITE-MAP-URL'' | ||
- | * To store log in ''/tmp/M2-crawler.log'': ''bash M2-crawler.sh -v SITE-MAP-URL'' | + | * To store log in ''/tmp/crawler.log'': ''bash M2-crawler.sh -v SITE-MAP-URL'' |
* To debug one URL and output on screen: ''bash M2-crawler.sh -d SITE-URL'' | * To debug one URL and output on screen: ''bash M2-crawler.sh -d SITE-URL'' | ||
* To display total count of crawl result: ''bash M2-crawler.sh -r SITE-MAP-URL'' | * To display total count of crawl result: ''bash M2-crawler.sh -r SITE-MAP-URL'' | ||
- | * Use multiple parameters at the same time is allowed | ||
+ | NOTE: Using multiple parameters at the same time is allowed.
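As a minimal sketch of combining several options in one run, the helper below only prints the command line rather than executing it. The function name ''build_crawl_cmd'' and the sitemap URL are hypothetical, and the argument order (flags before the sitemap URL, ''-i'' after it) simply mirrors the example commands above; it is an assumption that the script accepts this exact combination.

```shell
# Hypothetical helper: print a crawl command that combines -c, -m, -r and -i.
# $1 = sitemap URL (placeholder), $2 = request interval in seconds
build_crawl_cmd() {
  printf 'bash M2-crawler.sh -c -m -r %s -i %s\n' "$1" "$2"
}

# Dry run with a placeholder sitemap URL and a 0.2 second interval.
build_crawl_cmd "http://example.com/sitemap.xml" 0.2
```

Review the printed command before running it against a real sitemap.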
===== How to Generate a Sitemap===== | ===== How to Generate a Sitemap===== | ||
Magento 2 has a built-in module for generating a sitemap and it's fast. | Magento 2 has a built-in module for generating a sitemap and it's fast. | ||
Line 63: | Line 64: | ||
Note: You can also use an [[https://crontab.guru/|online crontab tool]] to help you verify the time settings. | Note: You can also use an [[https://crontab.guru/|online crontab tool]] to help you verify the time settings. | ||
+ | ===== Run the Crawler After Any Product Update in Magento 2 ===== | ||
+ | In Magento 2, any product update purges all caches by design, and LiteMage has no control over this behavior. As a result, you may find pages uncached even if the crawl interval above is shorter than the TTL. This does not mean that LiteMage 2 or the crawler is malfunctioning. It is simply a consequence of how Magento 2 is designed. | ||
+ | |||
+ | To avoid this situation, we recommend scheduling a specific window for making product changes through the Magento admin, for example, two off-peak hours from 6:00pm to 8:00pm. Then run the crawler immediately after the changes are complete. | ||
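One way to run the crawler right after such a window is a cron entry timed for the end of it. The fragment below is only a sketch: it assumes the 6:00pm to 8:00pm window from the example above, and both ''/path/to/M2-crawler.sh'' and the sitemap URL are placeholders to replace with your own values.

```
# Crontab fragment (assumed schedule): start the crawler daily at 20:00,
# when the example product-update window closes.
# minute hour day-of-month month day-of-week command
0 20 * * * bash /path/to/M2-crawler.sh -c http://example.com/sitemap.xml
```

If product changes sometimes run past the window, a later start time is the safer choice.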
===== How to Verify the Crawler is Working ===== | ===== How to Verify the Crawler is Working ===== | ||
When using [[https://developers.google.com/web/tools/chrome-devtools/|the browser developer tool]], load a previously uncached page. You should see ''X-LiteSpeed-Cache: hit,litemage'' on the first view. | When using [[https://developers.google.com/web/tools/chrome-devtools/|the browser developer tool]], load a previously uncached page. You should see ''X-LiteSpeed-Cache: hit,litemage'' on the first view. | ||
{{:litespeed_wiki:cache:litemage2:m2-3.png?600|}} | {{:litespeed_wiki:cache:litemage2:m2-3.png?600|}} |
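The same check can be done from the command line. The sketch below defines a small hypothetical helper, ''cache_status'', that reads raw response headers on standard input and prints the value of the ''X-LiteSpeed-Cache'' header. The sample URL in the usage comment is a placeholder.

```shell
# Hypothetical helper: extract the X-LiteSpeed-Cache value from response headers.
cache_status() {
  # Match the header case-insensitively, drop the "Name:" prefix and
  # any leading whitespace, then strip the trailing carriage return.
  grep -i '^x-litespeed-cache:' | sed 's/^[^:]*:[[:space:]]*//' | tr -d '\r'
}

# Usage (placeholder URL): fetch headers only and pass them to the helper.
# curl -sI http://example.com/test.html | cache_status
```

A value of ''hit,litemage'' on the first request for a previously uncached page suggests that the crawler has already warmed it.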