Re: Unicode escape sequence | Re: draft-ietf-httpbis-header-structure-00, unicode range

Sorry for the late reply; I'm cleaning up during the holidays.

On 2016/12/15 10:57, Matthew Kerwin wrote:

> I should have noted here that Ruby uses this \u{N...} syntax, including
> the lower limit of one hexadecimal digit.  This is a valid string literal
> in Ruby:
>
> "\u{df}\u{9}\u{1f602}"

Not only that, but Ruby also allows \uABCD when there are exactly four
hex digits. You can also write the above as "\u{df 9 1f602}". Ruby puts
writers' and readers' convenience above other concerns, but this doesn't
mean that we can't use it.


> There is precedent, although I'm not sure if it's a good precedent: the
> "content" attribute in CSS uses:
>
>     %5c 1*6HEXDIGIT
>
> ...which is both undelimited (which I oppose) and without an explicit
> hexadecimal indicator (about which I'm mostly ambivalent.)

Yes. That led to some of the stuff in 
https://www.w3.org/TR/charmod/#sec-Escaping, in particular 
https://www.w3.org/TR/charmod/#C044.
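
To make the problem with undelimited escapes concrete, here is a rough
Ruby sketch. The two helper functions are hypothetical and simplified:
the first follows the quoted "%5c 1*6HEXDIGIT" pattern plus CSS's rule
that one space after the hex digits is swallowed, and the second handles
only the braced \u{N...} form with a single code point per group.

```ruby
# Hypothetical helper: decode undelimited escapes, i.e. a backslash
# followed by 1 to 6 hex digits and an optional terminating space.
# (Values above 10FFFF are not handled in this sketch.)
def decode_undelimited(text)
  text.gsub(/\\(\h{1,6}) ?/) { [$1.hex].pack("U") }
end

# Hypothetical helper: decode delimited escapes of the form \u{N...},
# one code point per brace group.
def decode_delimited(text)
  text.gsub(/\\u\{(\h{1,6})\}/) { [$1.hex].pack("U") }
end

# Intended: U+00DF followed by the digit 9.
p decode_undelimited('\df9')  == "\u{0df9}"  # => true (digit was absorbed)
p decode_undelimited('\df 9') == "\u{df}9"   # => true (space ends the escape)
p decode_delimited('\u{df}9') == "\u{df}9"   # => true (no ambiguity)
```

Without a delimiter, the writer has to remember the extra space whenever
the next character happens to be a hex digit.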


As for the \u'ABCD' recommendation in 
https://tools.ietf.org/html/rfc5137#section-5.1:

On 2016/12/14 19:38, Alexey Melnikov wrote:
 > On 14/12/2016 10:21, Julian Reschke wrote:

 >> Has this ever been used in a protocol?

I think this is a very good question. RFC 5137 doesn't even give a full 
example of its very own notation. Also, I don't think \u'ABCD' existed 
before RFC 5137. It smells quite a bit of https://xkcd.com/927/ (but I 
may be wrong, and of course, this area is prone to such phenomena).

 > Some:
 > https://datatracker.ietf.org/doc/rfc5137/referencedby/

That record is very sparse.

 > This was also extensively used in other RFCs without referencing the BCP.

Pointers, please.


Regards,   Martin.

Received on Wednesday, 4 January 2017 02:53:14 UTC