[Solved] How to warm up LiteSpeed LScache of URLs that are not in the sitemap

serpent_driver

Well-Known Member
#1
How to warm up the cache of URLs that are not in the sitemap.

If you use one of the LiteSpeed Cache plugins for WordPress, OpenCart, PrestaShop or another platform, you are certainly familiar with the problem that the included cache crawlers can only warm up the cache for URLs that are in the sitemap. Since the sitemap only contains the URLs required for SEO purposes, it inevitably misses a lot of URLs, such as those for pagination, filters, or anything else with a GET parameter.

For LScache, it doesn't matter whether a URL comes from the sitemap. As long as the URL is a dynamically generated PHP source, the cache can be used for any type of URL or page. However, if a lot of URLs are missing from the sitemap, the crawler never requests them, and you get no benefit from the cache warmup for those pages. After all, a page must first be requested in order to be cached.

However, this significant problem can be solved very easily.

Before I present the solution, I first want to describe another, closely related problem.

Almost every user of LiteSpeed LScache believes that it is necessary to warm up the cache of all URLs from the sitemap. Unfortunately, this is a common misconception, and analyzing your site's traffic easily disproves it. Such an analysis quickly shows that up to 70% of all URLs listed in the sitemap are never or only very rarely requested. This applies not only to human visitors, but also to bots, especially Googlebot. Google crawls on demand. This means that Google does not crawl all URLs of a site and does not index all URLs from the sitemap, but only crawls the URLs for which there is interest. Efficiency is what counts at Google, so Google is able to tell from the URL itself what the content is about. This means Google doesn't have to crawl a URL and analyze the content first to determine whether it is worth including in the search index.
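If you want to check this on your own site, the comparison can be sketched in a few lines of Python. This is only a minimal illustration, not part of any LiteSpeed tool: the function name `unrequested_share` is made up for this example, and it assumes you have already extracted the sitemap URLs and the requested URLs (for example from your access log) into two lists.

```python
def unrequested_share(sitemap_urls, requested_urls):
    """Return the fraction of sitemap URLs that never appear in the requests.

    Both arguments are plain iterables of URL strings (hypothetical inputs,
    e.g. parsed from sitemap.xml and from an access log).
    """
    sitemap = set(sitemap_urls)
    if not sitemap:
        return 0.0
    # URLs listed in the sitemap that no visitor or bot ever requested
    never_requested = sitemap - set(requested_urls)
    return len(never_requested) / len(sitemap)


if __name__ == "__main__":
    sitemap = ["/a", "/b", "/c", "/d"]
    requests = ["/a", "/a", "/?page=2"]
    print(f"{unrequested_share(sitemap, requests):.0%} of sitemap URLs were never requested")
```

Note that requested URLs outside the sitemap, such as `/?page=2` here, are ignored by this check; they are exactly the URLs the default crawler misses.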

Why waste resources on cache warmup if no one benefits from it?

Given this, the logical conclusion is that warming up the cache of URLs is a waste of resources if users and bots never or only very rarely request them. You don't have to reinvent the wheel to solve this problem. It is enough to copy Google's methodology and apply it to the cache warmup strategy. The result is a faster, more resource-efficient cache warmup without any disadvantages.

After all, the cache warmup is not a harmless process; it puts a lot of strain on the server, on shared hosting in particular. Even on a dedicated server, the cache warmup is a problem because it simply takes too long if all URLs have to be crawled every time.

So what is the solution?

The principle of this solution is to warm up the cache of only the URLs that are regularly visited by users. But that would require that you monitor and save the requested URLs and generate a custom sitemap from them. Technically speaking, this isn't a problem, but if you're not a programmer, this solution won't help you.
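For those who do program, the idea can be sketched roughly as follows. This is a minimal illustration under my own assumptions, not how any existing plugin works: the function name `build_sitemap` and the `min_hits` threshold are made up for this example, and it assumes you already collect the requested URLs somewhere (for example from the access log).

```python
from collections import Counter
from xml.sax.saxutils import escape


def build_sitemap(requested_urls, min_hits=2):
    """Build sitemap XML from URLs requested at least `min_hits` times.

    `requested_urls` is an iterable of absolute URL strings, one entry per
    request (hypothetical input). The threshold keeps one-off requests out
    of the warmup list.
    """
    counts = Counter(requested_urls)
    wanted = sorted(url for url, hits in counts.items() if hits >= min_hits)

    lines = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ]
    for url in wanted:
        # escape() turns "&" in query strings into "&amp;" so the XML stays valid
        lines.append(f"  <url><loc>{escape(url)}</loc></url>")
    lines.append("</urlset>")
    return "\n".join(lines)


if __name__ == "__main__":
    log = [
        "https://example.com/shop/?page=2&sort=price",
        "https://example.com/shop/?page=2&sort=price",
        "https://example.com/rarely-visited/",
    ]
    print(build_sitemap(log, min_hits=2))
```

The output can be written to a file and served as a custom sitemap. Unlike the default sitemap, it can contain pagination and filter URLs, because it is built from real requests rather than from the SEO URL list.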

If you're not a programmer, you don't have to despair. There is a "URLs Most Wanted Plugin", UMW plugin for short, for WordPress and the LiteSpeed LScache plugin. The name says what it does. It takes advantage of LiteSpeed and LScache in a way that is unknown to most: it records the URLs that visitors request, saves them, and generates a custom sitemap from the saved URLs, which can then be entered in the LScache plugin for WP. To ensure that a URL requested by a user is not saved multiple times, and to prevent URLs from being incorrectly recognized as "most wanted", the UMW plugin uses a cache technique that is never or only rarely used in practice. This technique ensures that recording the URLs does not create any additional server load. The UMW plugin is an "Install and Forget Plugin". This means that it does not require any administration, as after installation it only requires a system cron job and storing the custom sitemap URL in the cache plugin for WordPress. The custom sitemap can be used for cache warmup in addition to the default sitemap, but the custom sitemap makes the default sitemap unnecessary.

The UMW plugin for the LiteSpeed Cache plugin is free, licensed under the GPL and can be downloaded directly from cachecrawler.com.

https://www.cachecrawler.com/WP-Plugins/WP-Plugin-URLs-Most-Wanted::6571.html


However, the UMW plugin only offers basic functions. If you want even more convenience, even more efficiency and an even faster cache warmup, then use the Kitt Cache Crawler. Kitt is a cache warmer programmed specifically for LScache. It is not only significantly faster than any of the LiteSpeed crawlers, but also uses only up to half the resources for the cache warmup. The list of features of this crawler is almost endless. You can find details about this crawler at https://www.cachecrawler.com. It is important to note that there is a specially adapted version of the Kitt Cache Crawler for each CMS.

Kitt also includes a UMW function, with additional features.

Is the UMW plugin also available for OpenCart, PrestaShop, Magento or Shopware?

The UMW function is also available for OpenCart, PrestaShop, Magento and Shopware. However, it does not come as a plugin or extension; it is part of the respective Kitt Cache Crawler version.


 
#2
This means that it does not require any administration, as after installation it only requires a system cron job and storing the custom sitemap URL in the cache plugin for WordPress. The custom sitemap can be used for cache warmup in addition to the default sitemap, but the custom sitemap makes the default sitemap unnecessary.

Could you post your own research here, using the example of a high-traffic site that does not use the software you advertise?

Ideally as a table of metrics: before installing the plugin, and again some time after installation.

Thanks for understanding.
I used an automatic translator.
 

serpent_driver

Well-Known Member
#3
@anna_ch

The difference between running with and without the UMW plugin follows from the way the plugin works.

Example:
  • If you have 1000 URLs in the sitemap, then 1000 URLs will be crawled. (without UMW Plugin)
  • If only half of the 1000 URLs are accessed by visitors, then only 500 URLs need to be crawled. (with UMW Plugin)
 