Re: Unicode escape sequence | Re: draft-ietf-httpbis-header-structure-00, unicode range

On 14/12/2016 10:21, Julian Reschke wrote:

> On 2016-12-14 10:59, Alexey Melnikov wrote:
>> On 14/12/2016 09:21, Poul-Henning Kamp wrote:
>>
>>> --------
>>> In message
>>> <CACweHNDKgWQewZHb=Kz3_2=41M58sY5472Q5OwpqPLxorvkzHQ@mail.gmail.com>
>>> , Matthew Kerwin writes:
>>>
>>>> If we're looking for inspiration elsewhere, why not C99?
>>>>
>>>>       "\" %x75 1*4HEXDIG
>>>>     / "\" %x55 1*6HEXDIG  ; C99 accepts 1*8
>>>
>>> The variable length means these run into the mess of "what if the
>>> next character is a hex digit?".
>>>
>>> That's a much bigger issue when we're talking about code spitting
>>> out strings than when it is programmers typing them in.
>>>
>>> I would prefer to make them fixed length:
>>>
>>>     "\" "u" 4HEXDIG
>>>     "\" "U" 6HEXDIG
>>
>> IETF has published BCP 137, which should be followed, unless there is a
>> very good reason not to:
>>  https://www.rfc-editor.org/bcp/bcp137.txt
>>
>> See section 5.1.
>
> Which says:
>
>>    EmbeddedUnicodeChar =  %x5C.75.27 4*6HEXDIG %x27
>>       ; starting with lowercase "\u" and "'" and ending with "'".
>>       ; Note that the encodings are considered to be abstractions
>>       ; for the relevant characters, not designations of specific
>>       ; octets.
>>
>>    HEXDIG =  "0" / "1" / "2" / "3" / "4" / "5" / "6" / "7" / "8" / "9" /
>>       "A" / "B" / "C" / "D" / "E" / "F"
>>       ; effectively identical with definition in RFC 5234.
>
> Has this ever been used in a protocol?

Some; see the documents that reference RFC 5137 (which is BCP 137):
https://datatracker.ietf.org/doc/rfc5137/referencedby/

The same notation has also been used extensively in other RFCs that do
not reference the BCP.
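
To make the section 5.1 form concrete, here is a minimal Python sketch of
an encoder and decoder for it. The function names (decode_bcp137,
encode_bcp137) are mine and purely illustrative. It rests on a few
assumptions: lowercase hex digits are accepted (ABNF string literals are
case-insensitive), surrogates and values above U+10FFFF are rejected, and
escaping of a literal backslash is not handled at all. The point is that
the closing "'" delimits the escape, so a hex digit following it is never
absorbed, which is the problem raised above for the variable-length forms.

```python
import re

# BCP 137 section 5.1: backslash, lowercase "u", "'", 4 to 6 hex digits, "'".
# Assumption: both uppercase and lowercase hex digits are accepted.
_EMBEDDED = re.compile(r"\\u'([0-9A-Fa-f]{4,6})'")


def decode_bcp137(text: str) -> str:
    """Replace each \\u'XXXX' escape in text with the character it names."""

    def _substitute(match: re.Match) -> str:
        code_point = int(match.group(1), 16)
        # Assumption: surrogates and out-of-range values are errors.
        if code_point > 0x10FFFF or 0xD800 <= code_point <= 0xDFFF:
            raise ValueError(f"not a Unicode scalar value: {match.group(0)}")
        return chr(code_point)

    return _EMBEDDED.sub(_substitute, text)


def encode_bcp137(text: str) -> str:
    """Escape every non-ASCII character; ASCII passes through unchanged.

    Assumption: literal backslashes in the input are not escaped here.
    """
    return "".join(
        ch if ord(ch) < 0x80 else f"\\u'{ord(ch):04X}'" for ch in text
    )


if __name__ == "__main__":
    # The "1" after the closing quote stays a separate character.
    print(decode_bcp137("caf\\u'00E9'1"))
    print(encode_bcp137("caf\u00e91"))
```

With the fixed-length "\u" 4HEXDIG / "\U" 6HEXDIG proposal the decoder
would be just as unambiguous; the delimited form simply gets there with a
single escape syntax for the whole code point range.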

Received on Wednesday, 14 December 2016 10:39:17 UTC