RE: IRIs, IDNAbis, and HTTP

Julian Reschke wrote:
> Brian Smith wrote:
> > Thank you. I read the BNF for TEXT many times but I've always 
> > overlooked the accompanying text. The RFC 2047 mechanism is truly 
> > horrible but I guess it satisfies the requirements.

> If it works in practice.

> BTW: "Words of *TEXT MAY contain characters from
> character sets other than ISO-8859-1 [22] only when
> encoded according to the rules of RFC 2047 [14]."
> really is incorrect and should be rephrased.

It is not clear whether the RFC 2047 mechanism can be used in
quoted-string, because quoted-string is not defined in terms of "*TEXT"
but in terms of a similar construct (qdtext). Given all the places where
quoted-string is used, should the RFC 2047 mechanism really be allowed
in all of these header fields?

	Accept
	Cache-Control
	Content-Encoding
	Content-Type
	ETag
	Expect
	If-Match
	If-None-Match
	If-Range
	Pragma
	TE
	Transfer-Encoding
	Warning

Also, the Reason-phrase of the status line is defined as:

	*<TEXT, excluding CR, LF>

But is the RFC 2047 mechanism allowed in the Reason-phrase?

If the RFC 2047 mechanism is not allowed in quoted-string or the
Reason-phrase, then where *is* it allowed?

> > What is an accurate BNF grammar for TEXT? It is not clear 
> > to me how I am supposed to parse a quoted-string that
> > contains "=?" but which is not a valid encoded-word.

> Good point. Are recipients of TEXT-typed header contents 
> supposed to always run the value through an RFC2047 parser?

Are there any header fields that have *TEXT in their grammar?

If the specification is read strictly, then the RFC 2047 mechanism has
never been allowed anywhere. If it is read liberally, then it is
allowed in far too many places. And if it is allowed anywhere, there
should be some advice as to which encodings should be supported.
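
To make the parsing question concrete, here is a minimal sketch (in
Python) of one lenient way a recipient could handle a TEXT value:
decode whatever matches the RFC 2047 encoded-word syntax and decodes
cleanly, and pass everything else through untouched, including a stray
"=?". The names ENCODED_WORD, decode_word and decode_text are made up
for illustration, and the fall-back-to-literal behaviour is an
assumption on my part, not something the specification states.

```python
import base64
import binascii
import quopri
import re

# encoded-word = "=?" charset "?" encoding "?" encoded-text "?="
# (syntax from RFC 2047; the pattern is a simplification that does not
# enforce the 75-character limit or the exact token character set)
ENCODED_WORD = re.compile(r"=\?([^?\s]+)\?([bBqQ])\?([^?\s]*)\?=")


def decode_word(match):
    """Decode one candidate encoded-word, or return it unchanged."""
    charset, encoding, payload = match.groups()
    try:
        if encoding.upper() == "B":
            raw = base64.b64decode(payload, validate=True)
        else:
            # header=True makes "_" decode to a space, as the Q encoding requires.
            raw = quopri.decodestring(payload.encode("ascii"), header=True)
        return raw.decode(charset)
    except (binascii.Error, LookupError, UnicodeError, ValueError):
        # Assumption: anything that fails to decode is treated as literal text.
        return match.group(0)


def decode_text(value):
    """Replace every decodable encoded-word in value; leave the rest as is."""
    return ENCODED_WORD.sub(decode_word, value)
```

With this approach a value such as "a =? b" or "=?x-unknown?Q?abc?="
comes back unchanged, which is exactly the ambiguity in question: the
recipient cannot tell whether the sender meant an encoded-word or the
literal characters.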

- Brian

Received on Thursday, 13 March 2008 23:09:08 UTC