CACHE

JOHN GALT · Jul 30, 2023

My paging is not in wordpress.. The paging happens dynamically on a single wordpress page.

serpent_driver · Jul 30, 2023

What is the difference between "is not in wordpress" and a "single Wordpress page"? Can you please describe that in a bit more detail?

AndreyPopov · Jul 30, 2023

JOHN GALT said:

all depends from what CMS you use.

crawler from plugin for some CMS can build sitemap and recache it
some CMS plugin require generated sitemap and provide link to it
third party crawler also require generated sitemap and provide link to it

some crawlers can recache &page=... , but usually required manually recache for pagination

my example:
https://www.litespeedtech.com/suppo...rawler-for-recache-some-ideas-and-code.19763/

I modified internal crawler

some ideas developers implement to crawler, some ideas stay non-implemented

AndreyPopov · Jul 30, 2023

serpent_driver said:

yes, basically HTML is a text file, but NOT plain text!

it rendered by the browser not only because of the file extension!
if you change extension of plain text file to html - it NOT rendered because NOT contain HTML formatting tags!
and most of SEO named pages NOT contain extension, but HOW browser render they?

JOHN GALT · Jul 30, 2023

In wordpress if you build a page with dynamic data the data will automatically build and pagination will be added to a single page. You dont have pages in Wordpress abc.com, abc.com/p1, abc.com/p2. Its a single page. On that single page you could have 3 pages 15 pages 40 pages. Its all based on the dynamic data. How do you tell Lite-speed Cache to follow the paging so it can fetch all of those pages and cache them.

JOHN GALT · Jul 30, 2023

Another way to do things is if you hit the main page, you have a way to trigger the other pages on the fly. So another words you have abc.com in the sitemap. When a user hits that page it then triggers off a way to cache the rest of the pages in the background. That would wok as well

serpent_driver · Jul 31, 2023

JOHN GALT said:

There's nothing you can tell LScache about. LScache is not a bot that follows every link. LScache is just a cache engine.

But I understand what you are talking about and there are solutions for that, but these solutions are not part of LScache or a plugin or Wordpress. Either you have such a solution programmed for you or you use ready-made software that is not actually intended for this, but can be misused. What I mean by that is a crawler script that works similar to a search engine bot and automatically follows every link and generates a sitemap from the crawl result, which you can then use in the LIteSpeed Cache plugin.

serpent_driver · Jul 31, 2023

JOHN GALT said:

But only in theory.

There is another and much better solution. As a website operator, you mistakenly assume that all of your pages are requested by users. If you use tracking software and deal extensively with the evaluation of the tracking analysis, you will find that a very high proportion of your pages are either very rarely or never requested by users. So the question inevitably arises, why spend resources on the cache warmup if no one is requesting these pages?

And that against the background that the cache warmup can sometimes take a very long time and consumes a lot of resources or can generate a high load. This is particularly critical for shop pages, because with most cache plugins the cache is purged after changes to a product or when a product is purchased. It is often the case that the crawler for the cache warmup has not yet finished crawling, but in the meantime the cache of pages that have already been crawled has already been purged again. This way of working is therefore not very economical.

The solution is to track the URLs that users request, so you know which URLs to warm up the cache for. In my case and on my website, that's less than 10% of all available URLs. That's why the cache process takes me just 10 minutes and not hours or even a whole day. However, I don't use the LiteSpeed crawler, which unfortunately wasn't programmed very carefully. I have my own custom solution that is x times faster and generates only half the load.

serpent_driver · Jul 31, 2023

AndreyPopov said:

How did you come up with this idea? If I write HTML code in a .txt file and change the file extension to .html, what happens when I request that file in the browser?

A plain text file is not defined by the content but by the http header and with this header I tell the browser how to handle the file.

serpent_driver · Jul 31, 2023

@AndreyPopov

Have a look at the two attachments. These are zipped cache files from static sources (CSS and JS), cached by LScache. So forget the fairy tale that cache file is a "HTML screenshot".

JOHN GALT · Jul 31, 2023

How do I download it. I would like to check it out

serpent_driver said:

But only in theory.

There is another and much better solution. As a website operator, you mistakenly assume that all of your pages are requested by users. If you use tracking software and deal extensively with the evaluation of the tracking analysis, you will find that a very high proportion of your pages are either very rarely or never requested by users. So the question inevitably arises, why spend resources on the cache warmup if no one is requesting these pages?

And that against the background that the cache warmup can sometimes take a very long time and consumes a lot of resources or can generate a high load. This is particularly critical for shop pages, because with most cache plugins the cache is purged after changes to a product or when a product is purchased. It is often the case that the crawler for the cache warmup has not yet finished crawling, but in the meantime the cache of pages that have already been crawled has already been purged again. This way of working is therefore not very economical.

The solution is to track the URLs that users request, so you know which URLs to warm up the cache for. In my case and on my website, that's less than 10% of all available URLs. That's why the cache process takes me just 10 minutes and not hours or even a whole day. However, I don't use the LiteSpeed crawler, which unfortunately wasn't programmed very carefully. I have my own custom solution that is x times faster and generates only half the load.

View attachment 3179

View attachment 3181

serpent_driver · Jul 31, 2023

JOHN GALT said:

Coming soon. If released you can download it at https://www.cachecrawler.com. The current version is for my Kitt Cache Crawler and still have to modify it for native usage without Kitt Cache Crawler.

AndreyPopov · Jul 31, 2023

serpent_driver said:

@serpent_driver
I look my lsache folder and find ONLY HTML shapshots

and that's why only you talk fairy tales.

AndreyPopov · Jul 31, 2023

serpent_driver said:

but these pages already cached by user often requests. why need to crawl they again?

serpent_driver · Jul 31, 2023

Of course you need a corresponding PHP function that generates CSS or JS files. For me, no CSS or JS files are generated, but combined. Only then is it possible to cache static sources. However, to be fair, it has to be said that there is no advantage to doing so, because when the respective source is loaded by the browser, it is in the browser cache. The static source caching example was only meant to show you that it's not just about HTML.

That's why you're still someone who likes to tell fairy tales.

AndreyPopov · Jul 31, 2023

serpent_driver said:

you by yourself answer for your stupid "plain text" words.

serpent_driver · Jul 31, 2023

AndreyPopov said:

It's not about crawling the URLs again. You have to read it properly and try to understand it. The first thing to do is to record (track) the URLs that are requested by users and generate a sitemap based on this, so that the crawler only crawls the URLs that are actually requested. By the way, search engines are excluded from this.

serpent_driver · Jul 31, 2023

AndreyPopov said:

For once I agree with you. The term plain text is wrongly chosen in this context. But that doesn't change the fact that an appropriate header is needed to tell the browser what content-type it is.

AndreyPopov · Jul 31, 2023

JOHN GALT said:

infinitive scroll and technics like "Lazy Load" cannot be cached by crawler by default.
you need manually or by third party code build links for recache.

serpent_driver · Jul 31, 2023

AndreyPopov said:

You don't need anything at all for lazy load because the loading="lazy" attribute has been a standard feature of almost every modern browser for quite some time. A so-called 1-pager can of course be cached. What you probably mean is the occasionall oading of content when the content to be loaded comes into the viewport, which serves to make pagination superfluous.

CACHE

Active Member

Well-Known Member

Well-Known Member

Well-Known Member

Active Member

Active Member

Well-Known Member

Well-Known Member

Well-Known Member

Well-Known Member

Attachments

Active Member

Well-Known Member

Well-Known Member

Well-Known Member

Well-Known Member

Well-Known Member

Well-Known Member

Well-Known Member

Well-Known Member

Well-Known Member