
Re: host-meta file format comments (draft-nottingham-site-meta-01)

From: Mark Nottingham <mnot@yahoo-inc.com>
Date: Wed, 11 Feb 2009 12:18:43 +1100
Cc: <www-talk@w3.org>, Eran Hammer-Lahav <blade@yahoo-inc.com>, <discuss@apps.ietf.org>
Message-Id: <CB0A2D28-3E76-412B-BC67-4C78284BF91A@yahoo-inc.com>
To: Thomas Roessler <tlr@w3.org>


On 11/02/2009, at 12:05 PM, Thomas Roessler wrote:

> (diverting to www-talk, too...)
>
> On 11 Feb 2009, at 01:20, Mark Nottingham wrote:
>
>> Yeah, I'm not completely happy with it yet. The thought was that  
>> since blank lines don't introduce ambiguity here, they're not  
>> harmful. OTOH one of my goals for the format is to allow existing  
>> HTTP header and MIME parsers (e.g., in Python) to be used on the  
>> format, and they very well may barf on a blank line.
>
> Well, they'll barf on blank lines and declare the header over;  
> changing that within the parser (or just restarting it on the rest  
> of the file) should be relatively cheap.

This assumes that people will be comfortable modifying libraries. IME  
people tend to treat them as magical black boxes that shouldn't be  
opened (or even questioned) under any circumstances...
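
As a rough illustration of the behaviour under discussion, here is a minimal sketch using Python's standard `email.parser.HeaderParser`. The field names `Example-One` and `Example-Two` are made up for the example and are not taken from the draft. The helper `parse_all_blocks` is likewise hypothetical; it shows the "restart on the rest of the file" workaround without touching the library itself.

```python
from email.parser import HeaderParser


def parse_all_blocks(text):
    """Re-run the stock header parser on whatever follows each blank line."""
    parser = HeaderParser()
    fields = []
    while text.strip():
        msg = parser.parsestr(text)
        fields.extend(msg.items())
        rest = msg.get_payload()
        if rest == text:
            # No progress (e.g. a line that is not a header); stop here.
            break
        text = rest
    return fields


# Hypothetical host-meta content with a blank line in the middle.
SAMPLE = (
    "Example-One: a\n"
    "\n"
    "Example-Two: b\n"
)

# A single pass treats the blank line as the end of the header block.
first_pass = HeaderParser().parsestr(SAMPLE)
print(first_pass.items())
print(parse_all_blocks(SAMPLE))
```

The wrapper leaves the library untouched, which may matter to people who are reluctant to modify it.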


> BTW, I notice that this draft is silent on the HTTP header syntax's  
> combining feature for multiple occurrences of the same field (last  
> paragraph of 4.2, RFC 2616); I suspect that to be one of the more  
> likely causes for surprises if HTTP header parsers are re-used.  (No  
> such risk with MIME parsers.)

I'll add a note.
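
For readers unfamiliar with the combining rule, a minimal sketch follows. RFC 2616 section 4.2 allows multiple fields with the same name to be merged into one field whose value is the comma-separated list of the originals, in order. The function `combine_fields` and the field name `Example-Field` are hypothetical and only illustrate where the surprise comes from.

```python
def combine_fields(pairs):
    """Merge repeated field names the way an HTTP header parser may do."""
    combined = {}
    for name, value in pairs:
        key = name.lower()  # HTTP field names are case-insensitive
        if key in combined:
            combined[key] = combined[key] + ", " + value
        else:
            combined[key] = value
    return combined


# Two occurrences of a hypothetical field, one of which contains a comma.
parsed = [
    ("Example-Field", "first, with comma"),
    ("example-field", "second"),
]
merged = combine_fields(parsed)
print(merged)
```

After merging, a consumer that splits on commas sees three values rather than two, so a field whose value may itself contain a comma cannot be recovered reliably.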


> Finally, why disallow whitespace stuffed folding?  It's pretty  
> useful to make long lines editable, and I suspect that we're  
> assuming /host-meta to be the product of some human with emacs in  
> their hands. ;-)  Implementing it is easy, and a given if existing  
> parsers are used.

Not necessarily; it's not very widely supported, IME.
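
To make the feature concrete, here is a minimal sketch of unfolding, assuming the usual convention that a line starting with a space or tab continues the previous line. The function `unfold` and the field names are hypothetical; a parser without such a step would see the continuation as a separate, malformed line.

```python
def unfold(text):
    """Join whitespace-prefixed continuation lines onto the preceding line."""
    lines = []
    for raw in text.splitlines():
        if raw[:1] in (" ", "\t") and lines:
            # Continuation: append to the previous logical line.
            lines[-1] += " " + raw.strip()
        else:
            lines.append(raw)
    return lines


# Hypothetical folded content, as someone might write it in an editor.
FOLDED = "Example-Long: part one\n  part two\nExample-Short: x\n"
logical_lines = unfold(FOLDED)
print(logical_lines)
```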


>> So, the right thing to do might be to explicitly disallow them,  
>> both in BNF and prose. Eran, thoughts?
>
> I'd just prefer to not have the BNF say "no empty lines", and then  
> have prose that says the opposite, but with a SHOULD.
>
>>>> 5. Minting New meta-fields
>>>
>>>> Applications that wish to mint new meta-fields for use in the  
>>>> host-meta format MUST register them in the host-meta  
>>>> field-registry, following the procedures in Section 7.2. Field-names  
>>>> MUST conform to the field-name ABNF Section 3, and field-value  
>>>> syntax MUST be well-defined (e.g., using ABNF, or a reference to  
>>>> the syntax of an existing header field-value). Field-values  
>>>> SHOULD use the ISO-859-1 character encoding. If a field-value  
>>>> applies to a scope other than the entire authority, that scope  
>>>> MUST be well-defined.
>>>
>>> Editorial nit: ISO-8859-1 is missing an 8 here.
>>
>> That one always gets me, thanks.
>>
>>> More substantially, is there any particular reason to not just go  
>>> with utf-8 here?  After all, the content type is  
>>> *application*/host-meta anyway.
>>
>> Same as above; allowing existing parsers and serialisation  
>> libraries to be used. That said, there have been many arguments in  
>> HTTPbis that existing libraries won't harm non-ASCII characters in  
>> transit, but IIRC no one has actually gone out and surveyed what  
>> they do...
>
> That suggests that it's a coin toss, unless the mythical "someone"  
> does that work.  May I, in that event, suggest that we use a coin  
> biased in favor of broader internationalization, i.e., UTF-8?

Well, the other side of the coin is interoperability, something that  
is also close to our collective hearts.

OTOH we're talking about a SHOULD here. Maybe it just needs more  
careful guidance; i.e., that you should stick to ASCII unless you're  
conveying elements for presentation to end users.


--
Mark Nottingham       mnot@yahoo-inc.com
Received on Wednesday, 11 February 2009 01:19:34 GMT
