RE: IRIs, IDNAbis, and HTTP

At 08:08 08/03/14, Brian Smith wrote:
>
>Julian Reschke wrote:
>> Brian Smith wrote:
>> > Thank you. I read the BNF for TEXT many times but I've always 
>> > overlooked the accompanying text. The RFC 2047 mechanism is truly 
>> > horrible

Very well put.

>> > but I guess it satisfies the requirements.
>
>> If it works in practice.

And, if it is actually used in practice. Any pointers to some
actual usage (both iso-8859-1 and RFC 2047) would be appreciated.


>> BTW: "Words of *TEXT MAY contain characters from
>> character sets other than ISO-8859-1 [22] only when
>> encoded according to the rules of RFC 2047 [14]."
>> really is incorrect and should be rephrased.
>
>It is not clear whether or not the RFC 2047 mechanism can be used in
>quoted-string, because quoted-string is not defined in terms of "*TEXT",
>but rather a similar construct.

When I last read the spec (years back), my reading was that the
Warning header is about the only place where it is allowed.

It may be difficult to fix the truly horrible mess of RFC 2047 on top
of iso-8859-1 for existing headers. But in order to move in the
right direction, it would be a very good idea to allow newly defined
headers to specify that they just use UTF-8.
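
To make the contrast concrete, here is a minimal sketch in Python of what
a recipient has to do in each case. The function name decode_rfc2047_text
and the regular expression are illustrative assumptions, not taken from
any spec or implementation, and the choice to leave anything that looks
like "=?" but fails to decode untouched is just one possible policy.

```python
import base64
import binascii
import re

# encoded-word = "=?" charset "?" encoding "?" encoded-text "?="  (RFC 2047)
# Hypothetical, simplified pattern; it ignores the 75-character limit.
ENCODED_WORD = re.compile(r"=\?([^?\s]+)\?([bBqQ])\?([^?\s]*)\?=")


def decode_rfc2047_text(raw: bytes) -> str:
    """Decode header bytes as iso-8859-1, then expand RFC 2047 encoded-words."""
    # iso-8859-1 maps every byte to a code point, so this step cannot fail.
    text = raw.decode("iso-8859-1")

    def expand(match):
        charset, encoding, payload = match.groups()
        try:
            if encoding.lower() == "b":
                data = base64.b64decode(payload, validate=True)
            else:
                # "Q" encoding: "_" means space, "=XX" is a hex-escaped byte.
                data = binascii.a2b_qp(payload.encode("ascii"), header=True)
            return data.decode(charset)
        except (LookupError, ValueError):
            # Unknown charset or malformed payload: keep the literal text.
            return match.group(0)

    return ENCODED_WORD.sub(expand, text)


def decode_utf8_text(raw: bytes) -> str:
    """What a newly defined header could do if it simply declared UTF-8."""
    return raw.decode("utf-8")
```

With the first function, a value such as =?UTF-8?Q?D=C3=BCrst?= needs two
decoding passes and a charset lookup, and a value containing "=?" that is
not a valid encoded-word needs a fallback rule that the spec does not
give. With the second, the same text is a single decoding step.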

Regards,    Martin.

>Given all the places that quoted-string
>is used, should the RFC 2047 mechanism really be allowed in all these
>places?:
>
>       Accept
>       Cache-Control
>       Content-Encoding
>       Content-Type
>       ETag
>       Expect
>       If-Match
>       If-None-Match
>       If-Range
>       Pragma
>       TE
>       Transfer-Encoding       
>       Warning
>
>Also, the Reason-phrase of the status line is defined as:
>
>       *<TEXT, excluding CR, LF>
>
>But, is the RFC 2047 mechanism allowed in the Reason-phrase?
>
>If the RFC 2047 mechanism is not allowed in quoted-string or the
>Reason-phrase, then where *is* it allowed?
>
>> > What is an accurate BNF grammar for TEXT? It is not clear 
>> > to me how I am supposed to parse a quoted-string that
>> > contains "=?" but which is not a valid encoded-word.
>
>> Good point. Are recipients of TEXT-typed header contents 
>> supposed to always run the value through an RFC2047 parser?
>
>Are there any header fields that have *TEXT in their grammar?
>
>If the specification is read strictly, then the RFC 2047 mechanism has
>never been allowed everywhere. And, if it is read liberally, then it is
>allowed in way too many places. And, if it is allowed anywhere, there
>should be some advice as to what encodings should be supported.
>
>- Brian


#-#-#  Martin J. Du"rst, Assoc. Professor, Aoyama Gakuin University
#-#-#  http://www.sw.it.aoyama.ac.jp       mailto:duerst@it.aoyama.ac.jp     

Received on Friday, 14 March 2008 06:21:18 UTC