Re: Unicode escape sequence | Re: draft-ietf-httpbis-header-structure-00, unicode range

Matthew Kerwin <>: (Wed Dec 14 13:53:45 2016)
> It says that "forms that use explicit string delimiters are generally
> preferred over other alternatives. In many contexts, symmetric paired
> delimiters are easier to recognize and understand than visually unrelated
> ones." So brackets are good.
> And while it advises against using Perl's \x{NNNN...} syntax (because of
> potential ambiguities with two-digit hex codes), it doesn't say anything at
> all about \u{N...}
> Curly braces cost 14+15 bits in HPACK, parentheses 10+10 (incidentally
> cheaper than single quotes, which are 11+11). It's also convenient that
> little 'u' is one bit cheaper than little 'x'.
> I don't think parentheses are at too much risk of needing escaping, so it
> seems like the solution that goes with BCP 137, and compresses alright with
> HPACK, is:
>     %x5c.75.28 1*6HEXDIGIT %x29
> It's still a little bit clunky for things like "Stra\u(df)e", but not so
> bad for emoji "\u(1f602)" and somewhere in between for Hiragana "
> \u(3053)\u(3093)\u(306b)\u(3064)".

I think that this is the best suggestion so far.
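
To make the quoted rule concrete, here is a minimal sketch of a decoder for the "\u(N...)" form. The names ESCAPE and decode_escapes are my own, HEXDIGIT is read as ABNF HEXDIG with lowercase accepted too, and leaving out-of-range or surrogate values untouched is an assumption that nothing in this thread specifies.

```python
import re

# Backslash, 'u', '(', 1 to 6 hex digits, ')'  ==  %x5c.75.28 1*6HEXDIGIT %x29
ESCAPE = re.compile(r"\\u\(([0-9A-Fa-f]{1,6})\)")


def decode_escapes(text: str) -> str:
    """Replace each \\u(N...) escape with the code point it names."""

    def substitute(match: re.Match) -> str:
        code_point = int(match.group(1), 16)
        # Assumption: values beyond U+10FFFF and surrogates are left as written.
        if code_point > 0x10FFFF or 0xD800 <= code_point <= 0xDFFF:
            return match.group(0)
        return chr(code_point)

    return ESCAPE.sub(substitute, text)


# The examples from the quoted mail; raw strings keep the backslash literal.
strasse = decode_escapes(r"Stra\u(df)e")
emoji = decode_escapes(r"\u(1f602)")
hiragana = decode_escapes(r"\u(3053)\u(3093)\u(306b)\u(3064)")
```

Text without any escape passes through unchanged, so a receiver could apply this to every string value.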

But could this also be shorter?

     %x5c.28 1*6HEXDIGIT %x29
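
A rough tally of what each variant costs in HPACK, apart from the hex digits themselves. The delimiter lengths are the ones quoted above; the 19 bits for the backslash and the 6 and 7 bits for 'u' and 'x' are how I read the Huffman table in RFC 7541, Appendix B, so please check them before relying on the totals. HUFFMAN_BITS and overhead_bits are names made up for this sketch.

```python
# Huffman code lengths in bits (as read from RFC 7541, Appendix B).
HUFFMAN_BITS = {
    "\\": 19,
    "u": 6,
    "x": 7,
    "(": 10,
    ")": 10,
    "{": 15,
    "}": 14,
    "'": 11,
}


def overhead_bits(framing: str) -> int:
    """Bits spent on the fixed framing characters of one escape."""
    return sum(HUFFMAN_BITS[ch] for ch in framing)


# Framing characters only; the hex digits would sit between the delimiters.
variants = {
    "\\u(...)": "\\u()",
    "\\(...)": "\\()",
    "\\u'...'": "\\u''",
    "\\u{...}": "\\u{}",
    "\\x{...}": "\\x{}",
}
costs = {name: overhead_bits(framing) for name, framing in variants.items()}
```

By this count, dropping the 'u' saves 6 bits per escape (39 instead of 45).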



(Admittedly, it is then no longer visible that this is hexadecimal.)


> > EmbeddedUnicodeChar =  %x5C.75.27 4*6HEXDIG %x27
> >
> > works for me.
> Cheers
> > Best regards, Julian
> >
> > PS: and, as a nit, it's strange that the syntax uses delimiters but
> > doesn't allow sequences of 1 to 3 HEXDIGs...
> >
> >
> Having just written "\u(df)" I kind of understand; it really feels like
> I'm describing an octet rather than a codepoint. I don't think there's a
> *technical* reason, though.


>                              Is it alright to see "\u(9)" or an equivalent
> in text?

Or is "\(9)" alright, if the 'u' is also dropped?

If that is to be avoided, it means

 %x5c.75.28 3*6HEXDIGIT %x29

or, with my newest suggestion,

 %x5c.28 3*6HEXDIGIT %x29
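
To see what the minimum digit count changes in practice, here is a small sketch that builds a matcher for either form. The helper escape_pattern and its parameters are hypothetical, and HEXDIGIT is again read as ABNF HEXDIG with lowercase accepted.

```python
import re


def escape_pattern(min_digits: int, with_u: bool) -> re.Pattern:
    """Build a matcher for \\u(HEX) or \\(HEX) with min_digits to 6 hex digits."""
    prefix = r"\\u" if with_u else r"\\"
    return re.compile(prefix + r"\(([0-9A-Fa-f]{%d,6})\)" % min_digits)


strict_u = escape_pattern(3, with_u=True)    # %x5c.75.28 3*6HEXDIGIT %x29
strict = escape_pattern(3, with_u=False)     # %x5c.28 3*6HEXDIGIT %x29
loose_u = escape_pattern(1, with_u=True)     # %x5c.75.28 1*6HEXDIGIT %x29

rejects_tab = strict_u.fullmatch(r"\u(9)") is None          # True
rejects_short = strict_u.fullmatch(r"\u(df)") is None       # True
accepts_padded = strict_u.fullmatch(r"\u(0df)") is not None  # True
```

Note that with a minimum of three digits "\u(df)" is rejected too, so a sender would have to pad it to "\u(0df)".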

> -- 
>   Matthew Kerwin

/ Kari Hurtta

Received on Wednesday, 14 December 2016 17:45:01 UTC