Differences

This shows you the differences between two versions of the page.

litespeed_wiki:cache:lscwp:crawler [2018/03/29 20:13]
Lisa Clarke Adjusted title
litespeed_wiki:cache:lscwp:crawler [2020/05/04 13:45]
Shivam Saluja
Line 1: Line 1:
 ====== LiteSpeed Cache for WordPress: Crawler ======
 +**Please Note**: This wiki is valid for v2.9.x and below of the LiteSpeed Cache Plugin for WordPress. If you are using v3.0 or above, please see [[https://docs.litespeedtech.com/lscache/lscwp/overview/|the new documentation]].
 +
  
 The crawler travels through your site, refreshing pages that have expired in the cache. This makes it less likely that your visitors will encounter uncached pages.
Line 9: Line 11:
  
 ===== Getting Ready to Run =====
-{{:litespeed_wiki:cache:lscwp:lscwp-crawler-annotated.png?direct&800|}}
+{{:litespeed_wiki:cache:lscwp:lscwp-crawler-annotated.png?nolink|}}
  
   - If you don't already have a sitemap file, you can generate one here by pressing the **Generate Crawler File** button. (If you already have an XML sitemap, you can enter the URL on the Crawler settings tab.)
Line 17: Line 19:
  
 ===== Running the Crawler =====
-{{:litespeed_wiki:cache:lscwp:lscwp-crawler-watch.png?direct&800|}}
+{{:litespeed_wiki:cache:lscwp:lscwp-crawler-watch.png?nolink|}}
 + 
 +If you've opted to watch the crawler status, your screen will look something like the image above. The messages in the status window will vary from these, as this screenshot was grabbed from a small installation with few pages to crawl. 
 + 
 +Here's an example of a watch screen from a crawler running on a larger site: 
 + 
 +{{:litespeed_wiki:cache:lscwp:troubleshooting:lscwp-crawler2.jpg?nolink|}}
  
-If you've opted to watch the crawler status, your screen will look something like this. The messages in the status window will vary from these, as this screenshot was grabbed from a small installation with few pages to crawl.
+And here is an explanation of some of the terms:
 +  * ''Size'': The number of URLs in the sitemap. This example has 181.
 +  * ''Crawler'': Indicates which crawler number you are watching. It's number 1 in this example. There could be multiple crawlers working, depending on your settings.
 +  * ''Position'': The URL number currently being fetched from the sitemap list.
 +  * ''Threads'': Indicates the number of threads currently being used to fetch URLs. There may be multiple threads fetching. The crawler adjusts this number automatically based on your load [[litespeed_wiki:cache:lscwp:configuration:crawler|settings]].
 +  * ''Status'': Indicates the current crawler status. In this example, ''Stopped due to reset meta position'' means that the site was purged or the sitemap changed while the crawler was running, so the crawler will restart from the top.
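
To make the relationship between these fields concrete, here is a minimal Python sketch. The ''CrawlerStatus'' class, its field names, and the ''progress'' and ''reset'' helpers are hypothetical, chosen only to mirror the labels on the watch screen; they are not part of the plugin, and whether ''Position'' counts from 0 or 1 is an assumption.

```python
from dataclasses import dataclass


@dataclass
class CrawlerStatus:
    """Hypothetical model of the watch-screen fields (not the plugin's API)."""
    size: int       # number of URLs in the sitemap
    crawler: int    # which crawler is being watched
    position: int   # number of the URL currently being fetched
    threads: int    # threads currently fetching URLs
    status: str     # free-text status message

    def progress(self) -> float:
        """Return the fraction of the sitemap reached so far (0.0 to 1.0)."""
        if self.size <= 0:
            return 0.0
        # Clamp so an out-of-range position never yields a value outside 0..1.
        clamped = max(0, min(self.position, self.size))
        return clamped / self.size

    def reset(self) -> None:
        """Model a restart from the top, e.g. after a purge or sitemap change."""
        self.position = 0


if __name__ == "__main__":
    watch = CrawlerStatus(size=181, crawler=1, position=90, threads=3,
                          status="Running")
    print(f"{watch.progress():.1%} of the sitemap reached")
    watch.reset()
    print(f"after reset: {watch.progress():.1%}")
```

The sketch only illustrates how ''Size'' and ''Position'' can be read together; the real crawler may track its position differently.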
  
 If you wish to keep a particular path from being crawled, you may enter it in the **Sitemap Generation Blacklist** box and press **Save**. After the crawler has run for the first time, if it encounters any pages marked ''do-not-cache'' they will be added to this Blacklist automatically.
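
The effect of a path blacklist on a list of sitemap URLs can be sketched in Python as follows. This is an illustration only: the ''filter_sitemap'' function is hypothetical, and the assumption that a blacklist entry matches as a prefix of the URL path is ours, not something the plugin documentation states.

```python
from urllib.parse import urlsplit


def filter_sitemap(urls, blacklist):
    """Return the URLs whose path does not start with any blacklisted path.

    Prefix matching is an assumption made for this sketch.
    """
    # Ignore empty or whitespace-only blacklist entries.
    prefixes = [entry.strip() for entry in blacklist if entry.strip()]
    kept = []
    for url in urls:
        path = urlsplit(url).path or "/"
        if not any(path.startswith(prefix) for prefix in prefixes):
            kept.append(url)
    return kept


if __name__ == "__main__":
    sitemap = [
        "https://example.com/",
        "https://example.com/cart/",
        "https://example.com/blog/post",
    ]
    for url in filter_sitemap(sitemap, ["/cart"]):
        print(url)
```

With the example data, the ''/cart/'' URL is dropped and the other two are kept.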
  
  • Last modified: 2020/11/14 15:32
  • by Lisa Clarke