Re: host-meta file format comments (draft-nottingham-site-meta-01)

(diverting to www-talk, too...)

On 11 Feb 2009, at 01:20, Mark Nottingham wrote:

> Yeah, I'm not completely happy with it yet. The thought was that  
> since blank lines don't introduce ambiguity here, they're not  
> harmful. OTOH one of my goals for the format is to allow existing  
> HTTP header and MIME parsers (e.g., in Python) to be used on the  
> format, and they very well may barf on a blank line.

Well, they'll barf on blank lines and declare the header over;  
changing that within the parser (or just restarting it on the rest of  
the file) should be relatively cheap.
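
To make the "restart it on the rest of the file" idea concrete, here is a rough sketch using Python's standard email.parser. The function name parse_host_meta and the sample field values are made up for illustration; nothing here comes from the draft, and it assumes the fields are plain "name: value" lines.

```python
from email.parser import Parser

def parse_host_meta(text):
    """Collect (name, value) pairs, restarting the parser after blank lines."""
    fields = []
    parser = Parser()
    rest = text
    while rest.strip():
        # A MIME parser ends the header block at the first blank line and
        # treats everything after it as the body.
        msg = parser.parsestr(rest.lstrip("\r\n"))
        if not msg.items():
            break  # no header parsed: stop rather than loop forever
        fields.extend(msg.items())
        # Restart on the remainder, which the parser handed back as the body.
        rest = msg.get_payload()
    return fields

sample = "Link: </a>; rel=x\n\nLink: </b>; rel=y\n"
print(parse_host_meta(sample))
```

The loop is the whole cost of tolerating blank lines: each pass consumes one block of fields and hands the remainder back.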

BTW, I notice that this draft is silent on the HTTP header syntax's  
combining feature for multiple occurrences of the same field (last  
paragraph of 4.2, RFC 2616); I suspect that to be one of the more  
likely causes for surprises if HTTP header parsers are re-used.  (No  
such risk with MIME parsers.)
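
For illustration, the combining rule in question (repeated fields with the same name may be merged into one, with values joined by commas in order) might look like this in a reused HTTP-style parser. The helper name combine_fields and the sample values are hypothetical.

```python
def combine_fields(pairs):
    """Merge repeated field names the way RFC 2616 section 4.2 permits."""
    combined = {}
    for name, value in pairs:
        key = name.lower()  # HTTP field names are case-insensitive
        if key in combined:
            # Later occurrences are appended, separated by a comma.
            combined[key] += ", " + value
        else:
            combined[key] = value
    return combined

print(combine_fields([("Link", "</a>; rel=x"), ("link", "</b>; rel=y")]))
```

A field-value that itself contains commas can no longer be split back into its original occurrences after this step, which is where the surprise would come from.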

Finally, why disallow whitespace-stuffed folding?  It's pretty useful
to make long lines editable, and I suspect that we're assuming
/host-meta to be the product of some human with emacs in their
hands. ;-)  Implementing it is easy, and a given if existing parsers
are used.
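
As a rough indication of how little code unfolding takes, here is a sketch that joins continuation lines (those starting with a space or tab) onto the previous line, replacing the fold with a single space. The name unfold and the sample text are illustrative only.

```python
import re

def unfold(text):
    """Join whitespace-stuffed continuation lines onto the preceding line."""
    # A line break followed by spaces or tabs marks a folded line; collapse
    # the break and the leading whitespace into one space.
    return re.sub(r"\r?\n[ \t]+", " ", text)

folded = "Link: </a>;\n    rel=x\nX-Other: 1\n"
print(unfold(folded))
```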

> So, the right thing to do might be to explicitly disallow them, both  
> in BNF and prose. Eran, thoughts?

I'd just prefer to not have the BNF say "no empty lines", and then  
have prose that says the opposite, but with a SHOULD.

>>> 5. Minting New meta-fields
>>
>>> Applications that wish to mint new meta-fields for use in the
>>> host-meta format MUST register them in the host-meta
>>> field-registry, following the procedures in Section 7.2.
>>> Field-names MUST conform to the field-name ABNF Section 3, and
>>> field-value syntax MUST be well-defined (e.g., using ABNF, or a
>>> reference to the syntax of an existing header field-value).
>>> Field-values SHOULD use the ISO-859-1 character encoding. If a
>>> field-value applies to a scope other than the entire authority,
>>> that scope MUST be well-defined.
>>
>> Editorial nit: ISO-8859-1 is missing an 8 here.
>
> That one always gets me, thanks.
>
>> More substantially, is there any particular reason to not just go
>> with utf-8 here?  After all, the content type is
>> *application*/host-meta anyway.
>
> Same as above; allowing existing parsers and serialisation libraries  
> to be used. That said, there have been many arguments in HTTPbis  
> that existing libraries won't harm non-ASCII characters in transit,  
> but IIRC no one has actually gone out and surveyed what they do...

That suggests that it's a coin toss, unless the mythical "someone"  
does that work.  May I, in that event, suggest that we use a coin  
biased in favor of broader internationalization, i.e., UTF-8?
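
To show what is at stake if producer and consumer disagree on the encoding, here is a small sketch. The helper name and the sample string are made up, and it says nothing about what any particular library actually does in transit.

```python
def decode_as_latin1(value):
    """Encode a field-value as UTF-8, then misread the bytes as ISO-8859-1."""
    raw = value.encode("utf-8")
    # Every byte sequence is valid ISO-8859-1, so this never raises; it just
    # silently yields the wrong characters for anything outside ASCII.
    return raw.decode("iso-8859-1")

print(decode_as_latin1("plain ascii"))  # unchanged
print(decode_as_latin1("caf\u00e9"))    # the accented letter becomes two characters
```

ASCII-only values are unaffected either way, so the choice only matters for exactly the content that internationalization is about.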

Received on Wednesday, 11 February 2009 01:12:16 UTC