- From: Marcos Caceres <marcosc@opera.com>
- Date: Mon, 27 Jul 2009 18:22:10 +0200
- To: Marcin Hanclik <Marcin.Hanclik@access-company.com>
- Cc: "public-webapps@w3.org" <public-webapps@w3.org>
On Sun, Jul 26, 2009 at 11:29 PM, Marcin Hanclik <Marcin.Hanclik@access-company.com> wrote:
> Hi,
>
> Given the fact that
>
> "Rule names are case insensitive."
> http://tools.ietf.org/html/rfc5234#section-2.1
>
> it could potentially be better to rename the rule from "utf8-char" to something else, since it may get confused with "UTF8-char" rule from http://tools.ietf.org/html/rfc3629#section-4.
>
> Taken into account my comments in the mail below, we could have new rule replacing utf8-char:
> zip-UTF8-char = UTF8-2 / UTF8-3 / UTF8-4
> UTF8-2 = %xC2-DF UTF8-tail
> UTF8-3 = %xE0 %xA0-BF UTF8-tail / %xE1-EC 2( UTF8-tail ) /
>          %xED %x80-9F UTF8-tail / %xEE-EF 2( UTF8-tail )
> UTF8-4 = %xF0 %x90-BF 2( UTF8-tail ) / %xF1-F3 3( UTF8-tail ) /
>          %xF4 %x80-8F 2( UTF8-tail )
> UTF8-tail = %x80-BF
>
> The problem may be with the allowed ranges of the Unicode characters.
> The above grammar seems to allow 0080-10FFFF (the UTF-16 accessible range minus characters < 0080)
> http://tools.ietf.org/html/rfc3629#section-3
> whereas the current utf8-char rule is more selective.
>

Unless it's broken (?), I would prefer to leave it as is.

--
Marcos Caceres
http://datadriven.com.au
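
For readers who want to check which byte sequences the proposed zip-UTF8-char rule would accept, here is a minimal Python sketch that transcribes the quoted ABNF by hand. The function name `match_zip_utf8_char` and its return convention are assumptions made for illustration; they come from neither the thread nor any specification. On this reading the rule accepts U+0080 through U+10FFFF but leaves out the surrogate block (U+D800 to U+DFFF), because of the `%xED %x80-9F` branch.

```python
def match_zip_utf8_char(data: bytes) -> int:
    """Return the byte length of a zip-UTF8-char at the start of data, or 0.

    Hypothetical helper: a direct transcription of the ABNF quoted above.
    """
    def tails(start: int, count: int) -> bool:
        # UTF8-tail = %x80-BF, repeated `count` times starting at `start`
        chunk = data[start:start + count]
        return len(chunk) == count and all(0x80 <= b <= 0xBF for b in chunk)

    def second_in(lo: int, hi: int) -> bool:
        # Restricted second byte used by the %xE0, %xED, %xF0 and %xF4 branches
        return len(data) >= 2 and lo <= data[1] <= hi

    if not data:
        return 0
    b0 = data[0]

    # UTF8-2 = %xC2-DF UTF8-tail
    if 0xC2 <= b0 <= 0xDF:
        return 2 if tails(1, 1) else 0

    # UTF8-3
    if b0 == 0xE0:
        return 3 if second_in(0xA0, 0xBF) and tails(2, 1) else 0
    if 0xE1 <= b0 <= 0xEC or 0xEE <= b0 <= 0xEF:
        return 3 if tails(1, 2) else 0
    if b0 == 0xED:
        return 3 if second_in(0x80, 0x9F) and tails(2, 1) else 0

    # UTF8-4
    if b0 == 0xF0:
        return 4 if second_in(0x90, 0xBF) and tails(2, 2) else 0
    if 0xF1 <= b0 <= 0xF3:
        return 4 if tails(1, 3) else 0
    if b0 == 0xF4:
        return 4 if second_in(0x80, 0x8F) and tails(2, 2) else 0

    # ASCII (UTF8-1) and every invalid lead byte fall through here
    return 0
```

For example, `match_zip_utf8_char("é".encode("utf-8"))` matches two bytes, while a plain ASCII byte such as `b"A"` is rejected because the proposed rule drops UTF8-1.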
Received on Monday, 27 July 2009 16:23:10 UTC