Re: Unicode escape sequence | Re: draft-ietf-httpbis-header-structure-00, unicode range

On 2016-12-14 10:59, Alexey Melnikov wrote:
> On 14/12/2016 09:21, Poul-Henning Kamp wrote:
>
>> --------
>> In message <CACweHNDKgWQewZHb=Kz3_2=41M58sY5472Q5OwpqPLxorvkzHQ@mail.gmail.com>,
>> Matthew Kerwin writes:
>>
>>> If we're looking for inspiration elsewhere, why not C99?
>>>
>>>       "\" %x75 1*4HEXDIG
>>>     / "\" %x55 1*6HEXDIG  ; C99 accepts 1*8
>> The variable length means these run into the mess of "what if the
>> next character is a hex digit?".
>>
>> That's a much bigger issue when we're talking about code spitting
>> out strings than when it is programmers typing them in.
>>
>> I would prefer to make them fixed length:
>>
>>     "\" "u" 4HEXDIG
>>     "\" "U" 6HEXDIG
> The IETF has published BCP 137, which should be followed unless there is a
> very good reason not to:
>  https://www.rfc-editor.org/bcp/bcp137.txt
>
> See section 5.1.

Which says:

>    EmbeddedUnicodeChar =  %x5C.75.27 4*6HEXDIG %x27
>       ; starting with lowercase "\u" and "'" and ending with "'".
>       ; Note that the encodings are considered to be abstractions
>       ; for the relevant characters, not designations of specific
>       ; octets.
>
>    HEXDIG =  "0" / "1" / "2" / "3" / "4" / "5" / "6" / "7" / "8" / "9" /
>       "A" / "B" / "C" / "D" / "E" / "F"
>       ; effectively identical with definition in RFC 5234.
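For illustration, here is a rough Python sketch of a decoder for that production. The closing "'" delimits the escape, so a hex digit following it is not ambiguous, which addresses the variable-length concern raised above. The names EMBEDDED_UNICODE_CHAR and decode_embedded are made up for this example, and leaving surrogates and out-of-range values untouched is my assumption, not something taken from the quoted ABNF.

```python
import re

# Mirrors the quoted ABNF: backslash, lowercase "u", "'", 4 to 6 hex digits, "'".
# ABNF string literals are case-insensitive, so a-f are accepted as well.
EMBEDDED_UNICODE_CHAR = re.compile(r"\\u'([0-9A-Fa-f]{4,6})'")


def decode_embedded(text: str) -> str:
    """Replace each \\u'XXXX' escape in text with the character it names."""

    def replace(match: re.Match) -> str:
        code_point = int(match.group(1), 16)
        # Assumption: surrogates and values above U+10FFFF are left as-is
        # rather than rejected; a real protocol would have to pick a policy.
        if code_point > 0x10FFFF or 0xD800 <= code_point <= 0xDFFF:
            return match.group(0)
        return chr(code_point)

    return EMBEDDED_UNICODE_CHAR.sub(replace, text)
```

With this sketch, decode_embedded(r"\u'0041'BC") returns "ABC": the "B" and "C" after the closing quote are ordinary characters and are never read as part of the escape.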

Has this ever been used in a protocol?

Best regards, Julian

Received on Wednesday, 14 December 2016 10:23:13 UTC