Context without trailing slash is ignored

Marcus

Well-Known Member
#1
Hi,

I have defined a context /login, with some settings and rewrite rules inside. When I try to access the page at

http://mysite.com/login

the URL is checked in the rewrite rules at the VHOST level, but not the context level. A redirection is then sent to

http://mysite.com/login/

and the rewrite rules are checked again, but since my context is defined without a trailing slash (/login, not /login/), the context is ignored again.

These are the rewrite entries in my log

Code:
2009-01-09 05:41:47.279 [INFO] [127.0.0.1:53733-0#simpl-ssl] [REWRITE] Rule: Match '/login' with pattern '^/admin(.*)$', result: -1
2009-01-09 05:41:47.284 [INFO] [127.0.0.1:53733-1#simpl-ssl] [REWRITE] Rule: Match '/login/' with pattern '^/admin(.*)$', result: -1
2009-01-09 05:41:47.284 [INFO] [127.0.0.1:53733-1#simpl-ssl] [REWRITE] strip base: '/login/' from URI: '/login/'
This doesn't seem right to me. I've looked for a setting to disable automatically adding directory slashes, but it seems to me that the context /login has been ignored when it shouldn't have.

Is this the correct behaviour, or is it a bug?

I've just downloaded the latest 4.0b3, and the above behaviour is what I experienced.
 

Marcus

Well-Known Member
#2
The rewrite rules for context /login are:

Code:
RewriteCond    %{HTTP_COOKIE}    lang=(en|fr|tr)
RewriteRule    ^(.*)$            $1/%1.shtml        [L,NE]

RewriteRule    ^(.*)$            $1/en.shtml        [L,NE]
and the full log entries are:

Code:
2009-01-09 06:20:08.174 [INFO] [127.0.0.1:39319-0#simpl-ssl] [REWRITE] Rule: Match '/login' with pattern '^/admin(.*)$', result: -1
2009-01-09 06:20:08.180 [INFO] [127.0.0.1:39319-1#simpl-ssl] [REWRITE] Rule: Match '/login/' with pattern '^/admin(.*)$', result: -1
2009-01-09 06:20:08.180 [INFO] [127.0.0.1:39319-1#simpl-ssl] [REWRITE] strip base: '/login/' from URI: '/login/'
2009-01-09 06:20:08.180 [INFO] [127.0.0.1:39319-1#simpl-ssl] [REWRITE] Rule: Match '' with pattern '^(.*)$', result: 2
2009-01-09 06:20:08.180 [INFO] [127.0.0.1:39319-1#simpl-ssl] [REWRITE] Cond: Match 'lang=en' with pattern 'lang=(en|fr|tr)', result: 2
2009-01-09 06:20:08.180 [INFO] [127.0.0.1:39319-1#simpl-ssl] [REWRITE] Source URI: '' => Result URI: '/en.shtml'
2009-01-09 06:20:08.180 [INFO] [127.0.0.1:39319-1#simpl-ssl] [REWRITE] Last Rule, stop!
Upon closer inspection, it appears that after the redirection to /login/ the /login context is used and the URL is rewritten. However, neither the prefix /login nor /login/ is added back after being stripped (as it is with other rewritten URLs).
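
To make the behaviour in the log easier to follow, here is a minimal Python sketch of the strip-base step as I understand it from the log lines above. It is not LSWS code: the function name rewrite_in_context and the restore_base flag are hypothetical, and the "re-add the base" branch is only what I expected to happen, not something the server is confirmed to do.

```python
import re

def rewrite_in_context(uri, base, pattern, substitution, restore_base):
    """Strip the context base, apply one rule, optionally re-add the base.

    Hypothetical model of the log above, not the server's implementation.
    """
    # Mirrors the "strip base" log line: remove the context prefix first.
    relative = uri[len(base):] if uri.startswith(base) else uri
    match = re.match(pattern, relative)
    if match is None:
        return uri
    # Translate the mod_rewrite-style "$1" back-reference to Python's "\1".
    rewritten = match.expand(substitution.replace("$1", r"\1"))
    # Assumption: "restoring" the base means prefixing it without its slash.
    return base.rstrip("/") + rewritten if restore_base else rewritten

# What the log shows: '' is rewritten to '/en.shtml' and the base is lost.
observed = rewrite_in_context("/login/", "/login/", r"^(.*)$", "$1/en.shtml", False)
# What I expected: the stripped base is put back in front of the result.
expected = rewrite_in_context("/login/", "/login/", r"^(.*)$", "$1/en.shtml", True)
```

With restore_base set to False the sketch reproduces the '/en.shtml' result from the log; with it set to True the result keeps the /login prefix.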
 

Marcus

Well-Known Member
#3
Regex contexts also don't seem to work

I've created a context

exp:^/admin

but the rewrite rules inside this context don't seem to be processed at all, though they are if they're defined at the VHOST level.
 

mistwang

LiteSpeed Staff
#4
How is the /login context defined? Static? Where does it point to: a directory or a file?

I think you should point it to a file instead of a directory to avoid adding the '/'.

The order of regex contexts is important; move it to the top by giving it a lower sequence number.
 

Marcus

Well-Known Member
#5
The context is defined as 'static'.

Eventually it will point to a file, but I want to use the 'nice' URL of /login rather than /login.php or something like that.

There is a directory /login/, but that just contains the SHTML files. I therefore can't point /login directly to a file, because of the 'directories are files' system on POSIX.

I could just leave the rewrite rules at the VHOST level, but there are reasons why I don't want to do that - I have a number of rules which do things like setting the language on a context-level. Because of the limitations of Apache's rewrite system, there are some things I can't do without re-reading the URL (which is a waste of processing power), which within a 'context' would not be necessary.

Regardless, though, I would have thought that if you define a context as exp:^/admin or /login, then whenever the URL matches, the settings contained within that context should be processed. The adding of directory slashes should happen after all other routes (e.g. context/rewrite) have been checked, not before.
 

mistwang

LiteSpeed Staff
#6
It is a bad idea to have a context overlap with a directory. For example, if you have a URL like "/login/password.shtml", should LSWS map it to a file, or "/login" context with PATH_INFO "/password.shtml"? Both are valid. So, you should avoid that.

If you define the "/login" context without specifying a "location", LSWS will use the "login" directory under the document root as the "Location" by default. That is why LSWS adds the trailing slash; if you set "location" to a file, LSWS will not add it.
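
The ambiguity described above can be sketched in a few lines of Python. This is only an illustration of the two readings, not how LSWS resolves them; the function name candidate_mappings and the document root /var/www are made up for the example.

```python
def candidate_mappings(uri, context_uri, doc_root):
    """Return both readings of a URI that overlaps a context and a directory."""
    readings = []
    if uri.startswith(context_uri + "/"):
        # Reading 1: a plain file beneath the directory of the same name.
        readings.append(("file", doc_root + uri))
        # Reading 2: the context handles it, with the remainder as PATH_INFO.
        readings.append(("context", context_uri, uri[len(context_uri):]))
    return readings

# "/var/www" is a hypothetical document root used only for illustration.
readings = candidate_mappings("/login/password.shtml", "/login", "/var/www")
```

Both entries in the returned list are valid interpretations of the same request, which is why overlapping a context with a directory is best avoided.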
 

Marcus

Well-Known Member
#7
I didn't realise you needed to specify the location as a file. It doesn't seem right to me that I should have to, though. It should be purely on the basis of the URL (IMHO).

If you say that the following are true

- /login matches exactly the URL /login and nothing else (though maybe including the query string)
- /login/ matches the directory /login/ and files below it, and
- exp:^/login matches URLs that begin with /login (including /login, /login/, /loginpage ...)

Then it's up to the site developer to choose how they implement things, giving lots of flexibility.
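
As a rough illustration of the three rules I'm proposing (this is my suggestion, not how LSWS currently behaves), here is a small Python sketch. The function name context_matches is hypothetical, and it assumes the query string has already been removed from the path.

```python
import re

def context_matches(context, url_path):
    """Apply the three proposed matching rules to a URL path.

    Assumes url_path carries no query string.
    """
    if context.startswith("exp:"):
        # Regex context: match anywhere the pattern allows.
        return re.search(context[4:], url_path) is not None
    if context.endswith("/"):
        # Directory context: the directory itself and everything below it.
        return url_path.startswith(context)
    # Plain context without a trailing slash: exact match only.
    return url_path == context
```

Under these rules /login matches only /login, /login/ matches /login/ and /login/password.shtml, and exp:^/login matches /login, /login/ and /loginpage.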

I know that rewrite rules can be applied at the VHOST level, but as mentioned before, Apache's rewrite system is inefficient in some places, which can result in needing to do the same check over and over (although some pseudo if/elseif/else statements are possible, they are limited and that exact concept doesn't work). There are aesthetic reasons why I want to use /login and not /login/. Since the URLs are rewritten and mapped somewhere else, it shouldn't matter what the URL is, even if it does 'overlap' with a directory.

I understand that you may feel that certain URL schemas may be bad practice, but that's your choice. I personally don't consider the above to be bad practice, so long as the contexts are handled sensibly, and at the moment, it doesn't seem that they are - at least the ones defined without trailing slashes (but not pointing to a file), and those defined by regular expressions.

I think that LSWS should be deciding how to deal with URLs purely on that basis, and you shouldn't need to necessarily map the location to somewhere on the filesystem to get things to work properly.
 

Marcus

Well-Known Member
#8
mistwang said:
It is a bad idea to have a context overlap with a directory. For example, if you have a URL like "/login/password.shtml", should LSWS map it to a file, or "/login" context with PATH_INFO "/password.shtml"? Both are valid. So, you should avoid that.
I feel that /login/password.shtml shouldn't be mapped to /login, but it should be mapped to /login/ and exp:^/login (or any other sensible regexes).

I just checked, and see that if you don't specify a location for regexes and if the specified location doesn't exist at the time of loading the configuration, the context configuration isn't loaded. I feel this is a bad idea. What if you specify a valid location, but for some reason a directory gets deleted or renamed accidentally or temporarily? Later, it is restored, but in the meantime the server has been restarted and, unknown to the admin staff, the context settings are not being used. This could easily cause problems, and more problems than, say, just putting a [WARNING] entry in the log while still loading the configuration and displaying a 404 when the URL isn't mapped.
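
To show what I mean by "warn but still load", here is a minimal Python sketch. It is only my suggestion, not LSWS behaviour; load_context and the registry dictionary are hypothetical names.

```python
import logging
import os

logger = logging.getLogger("context_loader")

def load_context(name, location, registry):
    """Register a context even when its location is missing on disk."""
    if not os.path.exists(location):
        # Suggested behaviour: log a warning instead of dropping the context.
        logger.warning("context %s: location %s does not exist", name, location)
    # The context is kept either way; unmapped URLs would simply return 404.
    registry[name] = location
    return registry

# A deliberately missing, made-up path used only for illustration.
registry = load_context("/login", "/nonexistent/docroot/login", {})
```

This way the context settings come back into effect as soon as the directory is restored, without another restart.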

I don't feel the current procedure would be natural to most people.
 

mistwang

LiteSpeed Staff
#9
LiteSpeed always uses the first context matched. In our case, without a location specified, the "/login" context was associated with the login/ directory, and the trailing slash was added automatically because of that.

When there are multiple possible choices, LSWS always uses the first match, and Apache compatibility plays a role as well. Some users may want it one way while others want the other, and it is not possible to make everyone happy. I strongly recommend avoiding this kind of situation when possible; if that is not possible, the user needs to adjust the configuration by following the priorities assigned to the different types of configuration.
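
The first-match rule can be illustrated with a short Python sketch. It only shows why ordering matters; it does not model the actual priorities LSWS assigns to different context types, and first_matching_context and the example context list are hypothetical.

```python
import re

def first_matching_context(contexts, url_path):
    """Return the name of the first context whose matcher accepts the path."""
    for name, matcher in contexts:
        if matcher(url_path):
            return name
    return None

# Two overlapping contexts, listed in evaluation order.
contexts = [
    ("/login/ directory", lambda p: p.startswith("/login/")),
    ("exp:^/login", lambda p: re.search(r"^/login", p) is not None),
]

chosen = first_matching_context(contexts, "/login/en.shtml")
chosen_reversed = first_matching_context(list(reversed(contexts)), "/login/en.shtml")
```

The same URL is handled by a different context once the order is reversed, which is why the sequence of overlapping contexts has to be adjusted deliberately.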

Your point regarding how context configuration should work with missing directories is valid; we will consider making the change in a future release.
 

Marcus

Well-Known Member
#10
I fully understand that one of your aims is to be Apache-interchangeable, and if Apache follows the same method regarding directories, then it is sensible for you to stay with your current approach. Also, I understand you won't want to change things if it is likely or possible to cause problems (albeit short-term ones) for any of your customers who upgrade.

Thank you for considering making some changes.
 

Tony

Well-Known Member
#11
The big reason I use LiteSpeed is that its compatibility is very close to Apache in every way. So anything that goes against Apache is tough for me, even if Apache's way makes no sense whatsoever.
 