- From: Alexey Melnikov <alexey.melnikov@isode.com>
- Date: Wed, 14 Dec 2016 10:38:28 +0000
- To: Julian Reschke <julian.reschke@gmx.de>, Poul-Henning Kamp <phk@phk.freebsd.dk>, Matthew Kerwin <matthew@kerwin.net.au>
- Cc: Kari Hurtta <hurtta-ietf@elmme-mailer.org>, Ilari Liusvaara <ilariliusvaara@welho.com>, HTTP working group mailing list <ietf-http-wg@w3.org>, Poul-Henning Kamp <phk@varnish-cache.org>
On 14/12/2016 10:21, Julian Reschke wrote:
> On 2016-12-14 10:59, Alexey Melnikov wrote:
>> On 14/12/2016 09:21, Poul-Henning Kamp wrote:
>>
>>> --------
>>> In message
>>> <CACweHNDKgWQewZHb=Kz3_2=41M58sY5472Q5OwpqPLxorvkzHQ@mail.gmail.com>
>>> , Matthew Kerwin writes:
>>>
>>>> If we're looking for inspiration elsewhere, why not C99?
>>>>
>>>> "\" %x75 1*4HEXDIG
>>>> / "\" %x55 1*6HEXDIG ; C99 accepts 1*8
>>> The variable length means these run into the mess of "what if the
>>> next character is a hex digit?".
>>>
>>> That's a much bigger issue when we're talking about code spitting
>>> out strings than when it is programmers typing them in.
>>>
>>> I would prefer to make them fixed length:
>>>
>>> "\" "u" 4HEXDIG
>>> "\" "U" 6HEXDIG
>> IETF has published BCP 137, which should be followed, unless there is a
>> very good reason not to:
>> https://www.rfc-editor.org/bcp/bcp137.txt
>>
>> See section 5.1.
>
> Which says:
>
>> EmbeddedUnicodeChar = %x5C.75.27 4*6HEXDIG %x27
>> ; starting with lowercase "\u" and "'" and ending with "'".
>> ; Note that the encodings are considered to be abstractions
>> ; for the relevant characters, not designations of specific
>> ; octets.
>>
>> HEXDIG = "0" / "1" / "2" / "3" / "4" / "5" / "6" / "7" / "8" / "9" /
>> "A" / "B" / "C" / "D" / "E" / "F"
>> ; effectively identical with definition in RFC 5234.
>
> Has this ever been used in a protocol?

Some: https://datatracker.ietf.org/doc/rfc5137/referencedby/

This was also extensively used in other RFCs without referencing the BCP.
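
To make the trade-off discussed above concrete, here is a minimal Python sketch, not taken from any draft or implementation. It contrasts a variable-length "\u" escape with the delimited BCP 137 form quoted above. All function names are hypothetical, the variable-length variant handles only BMP characters, and neither variant escapes a literal backslash in the input, which a real encoding would also have to address.

```python
import re

# BCP 137 (RFC 5137) section 5.1 form: \u'XXXX' with 4 to 6 uppercase hex digits.
EMBEDDED = re.compile(r"\\u'([0-9A-F]{4,6})'")
# Variable-length form from the thread: "\" "u" 1*4HEXDIG (BMP only in this sketch).
VARLEN = re.compile(r"\\u([0-9A-Fa-f]{1,4})")


def encode_bcp137(text: str) -> str:
    """Escape each non-ASCII character as \\u'XXXX' (hypothetical helper)."""
    return "".join(
        ch if ord(ch) < 0x80 else "\\u'%04X'" % ord(ch) for ch in text
    )


def decode_bcp137(text: str) -> str:
    """Replace each \\u'XXXX' escape with the character it names."""
    def repl(m: re.Match) -> str:
        cp = int(m.group(1), 16)
        # Six hex digits can exceed the Unicode range; leave such input as-is.
        return chr(cp) if cp <= 0x10FFFF else m.group(0)
    return EMBEDDED.sub(repl, text)


def encode_varlen(text: str) -> str:
    """Escape non-ASCII characters with as few hex digits as possible."""
    return "".join(ch if ord(ch) < 0x80 else "\\u%X" % ord(ch) for ch in text)


def decode_varlen(text: str) -> str:
    """Greedy decoder: consumes up to four hex digits after \\u."""
    return VARLEN.sub(lambda m: chr(int(m.group(1), 16)), text)


if __name__ == "__main__":
    sample = "\u00e91"  # U+00E9 followed by the digit "1"
    # The closing quote delimits the escape, so the round trip is exact.
    print(decode_bcp137(encode_bcp137(sample)) == sample)
    # Without a delimiter, the trailing "1" is absorbed into the escape.
    print(decode_varlen(encode_varlen(sample)) == sample)
```

The fixed-length proposal ("\" "u" 4HEXDIG) avoids the same ambiguity by always emitting exactly four digits; the BCP 137 form avoids it by delimiting the digits with quotes.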
Received on Wednesday, 14 December 2016 10:39:17 UTC