Re: several fixes from Terje Bless on 2001-04-17 (www-validator@w3.org from April 2001)

From: Terje Bless <link@tss.no>
Date: Wed, 18 Apr 2001 01:48:45 +0200
To: Bjoern Hoehrmann <derhoermi@gmx.net>
Cc: www-validator@w3.org
Message-ID: <20010418015126-b01010701-dfaf7615@192.168.1.6>

On 17.04.01 at 23:17, Bjoern Hoehrmann <derhoermi@gmx.net> wrote:

>* Terje Bless wrote:
>>>  * don't claim to be redirected from http://host or http://host:80/ to
>>>    http://host/ -> uses URI::eq()
>>
>>But lack of a trailing slash _is_ a redirect!
>
>http://host and http://host/ are equivalent, the trailing slash may be
>omitted here, see RFC 2616.

Oh, right. RFC 2616[0] sez:

# 3.2.3 URI Comparison
#
#   When comparing two URIs to decide if they match or not [...]
#
#        - An empty abs_path is equivalent to an abs_path of "/".

Thanks for finding this!
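
For illustration, the comparison rules quoted above can be sketched roughly as follows. This is not the validator's code and not how URI::eq is implemented; it is a minimal Python stand-in that assumes only the RFC 2616 section 3.2.3 rules (case-insensitive scheme and host, default port, empty abs_path equal to "/") and ignores percent-encoding equivalence. The names http_uri_key and http_uri_equal are hypothetical.

```python
from urllib.parse import urlsplit

# Default ports per scheme (assumption: only http/https matter here).
DEFAULT_PORTS = {"http": 80, "https": 443}

def http_uri_key(uri):
    """Reduce an HTTP URI to a tuple that can be compared directly."""
    parts = urlsplit(uri)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    # A missing or empty port is equivalent to the scheme's default port.
    port = parts.port or DEFAULT_PORTS.get(scheme)
    # An empty abs_path is equivalent to an abs_path of "/".
    path = parts.path or "/"
    return (scheme, host, port, path, parts.query)

def http_uri_equal(a, b):
    """True if the two URIs match under the rules sketched above."""
    return http_uri_key(a) == http_uri_key(b)
```

Under these assumptions http://host, http://host/ and http://host:80/ all compare equal, while a path with and without a trailing slash (such as /TR versus /TR/) still counts as different.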


>  http://www.w3.org/TR redirects to http://www.w3.org/TR/
>
>URI::eq returns true for the first case, false for the second.

Yeah, we really should take advantage of more of the CGI/LWP/URI libraries.



>>>  * un-break /referer;ss etc. -> lets CGI.pm parse the trailing part
>>
>>Hmm, this was dropped back when we first moved to CGI.pm partially to
>>discourage that syntax as it's indescribably ugly IMO. Wanna "think
>>aloud" about why you think it's a good idea?
>
>E.g. if you want to present the outline of the document, specify a
>doctype, show the source, etc.pp.

Granted that it might be useful, but I don't like it. Maybe if you managed
to get lstein to handle it transparently in CGI.pm[1], but I really (really
really) don't want to parse CGI in core. One reason for that is that I
don't really feel certain what the correct interpretation is in various
cases. Another is that "/referer" is a kludge left in for backwards
compatibility; it should have been moved to a CGI parameter or made into a
"magic" URL. I certainly don't want to _add_ features to it.

I'm open to being convinced otherwise -- and Gerald may feel differently --
but for right now I'm against it.


>>http://foo.com --> http://foo.com/
>
>Yes, that trailing slash can be omitted, I don't see any good reason to
>add the slash here,

Sold! The extra slash is history. :-)


>The above can be replaced with URI::Heuristic by the way. In general,
>I'd rather use one URI object instead of all those $q->param('uri')
>calls.

The CGI object should be destroyed as soon as we're finished with it. All
parameters and such should be put in a hash (or similar) with complex types
implemented as lightweight objects (such as URI).
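
As a rough illustration of that idea (parse once, copy everything into a plain hash, wrap complex values in small objects, then drop the request object), here is a Python sketch. It is a stand-in only: the validator is Perl and uses CGI.pm and URI, and the function name extract_params and the example parameter "ss" are assumptions made for this example.

```python
from urllib.parse import parse_qs, urlsplit

def extract_params(query_string):
    """Copy request parameters into a plain dict; parse 'uri' once."""
    raw = parse_qs(query_string)
    # Keep only the first value of each parameter (a simplification).
    params = {name: values[0] for name, values in raw.items()}
    if "uri" in params:
        # Store a parsed URI object instead of re-reading the raw string.
        params["uri"] = urlsplit(params["uri"])
    return params

# After this call nothing else needs the original request object.
params = extract_params("uri=http%3A%2F%2Fhost%2Fdoc&ss=1")
```

The rest of the code would then read params["uri"].hostname or params["uri"].path rather than asking the request object for the same string repeatedly.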


>>>  * prevent caching of /referer documents
>>>    -> what about using an HTTP::Headers object for the header instead of
>>>       a simple string?
>>
>>I'm not following you here. Where do you want to use HTTP::Headers, why,
>>how, and for what?
>
>Currently the header is saved like
>
>my $header = <<"EOF";
>Content-Type: text/html; charset=utf-8
>
>....
>
>and printed where needed, if we want to add caching headers, it would be
>better to use a HTTP::Headers object here instead of a plain scalar.

It's not really the HTTP header we're storing in $header; it's the _HTML_
header. The HTTP header is just convenient to place there. This all is
going away when we throw in a template system. At about that time it would
make sense to use HTTP::Headers objects to make HTTP headers intelligent and
convenient.
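
To make the string-versus-object point concrete, here is a small Python sketch of keeping headers as structured data and serializing them only at output time. It is not HTTP::Headers and not the validator's code; render_headers is a hypothetical helper, and the Cache-Control value is an assumption, since the thread does not say which caching header would be used for /referer documents.

```python
def render_headers(headers):
    """Serialize a dict of header fields into an HTTP header block."""
    lines = [f"{name}: {value}" for name, value in headers.items()]
    # Fields are separated by CRLF; a blank line ends the header block.
    return "\r\n".join(lines) + "\r\n\r\n"

# Start from the one field the current string-based header carries.
headers = {"Content-Type": "text/html; charset=utf-8"}

# Adding a field later is a plain assignment, not string surgery.
# (Assumed header and value, for illustration only.)
headers["Cache-Control"] = "no-cache"
```

With a plain scalar, adding the second field means editing or re-splitting the string; with a structured object it is one assignment made wherever the decision is known.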


>
>>>  * what about using an XML validator without the limitations of nsgmls
>>>    to validate XML documents?
>>
>>Such as? :-)
>
>http://www.stg.brown.edu/service/xmlvalid/dist/
>http://www.cogsci.ed.ac.uk/~richard/rxp.html

I'm having some trouble wrapping my head around the Python, but rxp is on
the list of stuff to check out to get Schema support. I just haven't gotten
around to it yet and there is some infrastructure support that needs to be
added first.


>XML::LibXML

Doesn't do what we need and libxml2 is incomplete.


>XML::Xerces

YM Xerces-p; not ready for prime-time. I've scheduled a reevaluation when
it gets Schema support ported from Xerces-c (which is being ported from
Xerces-j ;D) sometime this fall (hopefully).


I'm going to look at XML::Parser again at some point. It should be possible
to write a (general purpose) validating parser on top of it and then use
that in the validator, but time has been the constraint so far.


>>>-    Or add a "show source iff errors" option?
>>>+    Or add a "show source if errors" option?
>>
>>Not sure this is a typo. "iff" == "if and only if"
>
>Well, this wasn't covered by any of the 3 dictionaries I searched
>through...

I'm just guessing because I've always assumed that's why it's there --
using "iff" in that way is commonplace to me as technical jargon -- but
it's Gerald that wrote it so he'll have to clear that one up. :-)

Received on Tuesday, 17 April 2001 19:51:30 UTC