Re: "Web Address processing" (ABNF, processing proposal)

Anne van Kesteren wrote:
> On Fri, 12 Feb 2010 17:24:37 +0100, Julian Reschke 
> <julian.reschke@gmx.de> wrote:
>> Erik van der Poel wrote:
>>> Hi Julian,
>>>  I believe specs should not only be written for content producers
>>> (authors and authoring tool developers), but also for content
>>> consumers, such as browser developers. So, yes, these specs are
>>> needed. The question may be whether these items should be "moved" from
>>> URL/URI/IRI to HTML5.
>>>  Erik
>>
>> Yes.
>>
>> I think extracting a Web Address from HTML could be described in the 
>> HTML spec. That would include stripping leading and trailing whitespace.
>>
>> The result from that should be processable using *something* in the 
>> IRI spec.
> 
> Stripping also needs to be done in XMLHttpRequest, CSS, etc.

"needs"? It could, but it doesn't "need" to.

I just checked CSS for the syntax, and found:

"The format of a URI value is 'url(' followed by optional white space 
followed by an optional single quote (') or double quote (") character 
followed by the URI itself, followed by an optional single quote (') or 
double quote (") character followed by optional white space followed by 
')'. The two quote characters must be the same." -- 
<http://www.w3.org/TR/CSS2/syndata.html#value-def-uri>

So trimming the IRI certainly does not "need" to be done in the IRI 
spec; it's simply your preference.
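
For illustration, here is a rough Python sketch of how a CSS consumer 
might read that production. The name extract_css_uri and the regular 
expression are my own assumptions, not anything taken from the CSS spec; 
escapes and the other restrictions on unquoted URIs are ignored. The 
point is only that the white space around the URI is absorbed by the 
url() syntax itself, before any URI/IRI processing starts.

```python
import re

# Hypothetical, simplified reading of the CSS 2 prose quoted above.
# Assumption: the URI contains no quote characters and no escapes.
_URL_VALUE = re.compile(
    r"""url\(
        [ \t\r\n\f]*        # optional white space
        (?P<quote>['"]?)    # optional single or double quote
        (?P<uri>[^'"]*?)    # the URI itself
        (?P=quote)          # the same quote character, if one was opened
        [ \t\r\n\f]*        # optional white space
        \)""",
    re.VERBOSE,
)


def extract_css_uri(value):
    """Return the URI inside a url(...) value, or None if it does not match."""
    match = _URL_VALUE.fullmatch(value)
    return match.group("uri") if match else None
```

With this, extract_css_uri("url( 'http://example.org/a' )") yields the 
URI without the surrounding white space, and a value with mismatched 
quotes is rejected.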

We discussed earlier the use of whitespace-separated lists of IRIs in 
HTML5 (I think on IRC). Could you please elaborate on why you think it's 
acceptable for the spec to mandate "split on whitespace", but 
unacceptable for it to say "trim whitespace"? As far as I can tell, both 
fall into the same category of preprocessing input.
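
To make the comparison concrete, a minimal Python sketch follows. The 
function names are hypothetical, and I'm assuming "whitespace" means 
HTML5's space characters (space, tab, LF, FF, CR). Both steps run on 
the raw attribute value before anything is handed to URI/IRI 
processing.

```python
import re

# Assumption: HTML5's "space characters" (space, tab, LF, FF, CR).
HTML_SPACE = " \t\n\f\r"


def trim_whitespace(value):
    """Strip leading and trailing space characters from a single value."""
    return value.strip(HTML_SPACE)


def split_on_whitespace(value):
    """Split a whitespace-separated list into its non-empty tokens."""
    return [token for token in re.split(r"[ \t\n\f\r]+", value) if token]
```

Note that splitting a value that holds a single address gives the same 
result as trimming it: split_on_whitespace(" http://example.org/ ") is 
a one-element list containing trim_whitespace(" http://example.org/ ").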

As a matter of fact, HTML5 already defines processing for many 
attributes that allows leading/trailing whitespace (see 
<http://dev.w3.org/html5/spec/Overview.html#skip-whitespace>). I'm not 
sure why this is OK in some cases but not in others.


Best regards, Julian

Received on Friday, 12 February 2010 20:23:51 UTC