On Sun, Jul 26, 2009 at 11:29 PM, Marcin Hanclik <Marcin.Hanclik@access-company.com> wrote:
> Hi,
>
> Given the fact that
>
> "Rule names are case insensitive."
> http://tools.ietf.org/html/rfc5234#section-2.1
>
> it could potentially be better to rename the rule from "utf8-char" to something else, since it may get confused with "UTF8-char" rule from http://tools.ietf.org/html/rfc3629#section-4.
>
> Taken into account my comments in the mail below, we could have new rule replacing utf8-char:
> zip-UTF8-char = UTF8-2 / UTF8-3 / UTF8-4
> UTF8-2        = %xC2-DF UTF8-tail
> UTF8-3        = %xE0 %xA0-BF UTF8-tail / %xE1-EC 2( UTF8-tail ) /
>                 %xED %x80-9F UTF8-tail / %xEE-EF 2( UTF8-tail )
> UTF8-4        = %xF0 %x90-BF 2( UTF8-tail ) / %xF1-F3 3( UTF8-tail ) /
>                 %xF4 %x80-8F 2( UTF8-tail )
> UTF8-tail     = %x80-BF
>
> The problem may be with the allowed ranges of the Unicode characters.
> The above grammar seems to allow 0080-10FFFF (the UTF-16 accessible range minus characters < 0080)
> http://tools.ietf.org/html/rfc3629#section-3
> whereas the current utf8-char rule is more selective.
>

Unless it's broken (?), I would prefer to leave it as is.

--
Marcos Caceres
http://datadriven.com.au

Received on Monday, 27 July 2009 16:23:10 GMT
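
For anyone wanting to check which byte sequences the proposed zip-UTF8-char rule accepts, here is a minimal Python sketch. It assumes the ABNF quoted above is transcribed faithfully into a byte-level regular expression; the names ZIP_UTF8_CHAR and is_zip_utf8_char are made up for illustration and appear in no spec. It is only a way to probe the grammar, not a proposed implementation.

```python
import re

# Byte-level pattern transcribed alternative by alternative from the
# quoted ABNF (assumption: the transcription is faithful).
ZIP_UTF8_CHAR = re.compile(
    rb"(?:"
    rb"[\xC2-\xDF][\x80-\xBF]"            # UTF8-2
    rb"|\xE0[\xA0-\xBF][\x80-\xBF]"       # UTF8-3, first alternative
    rb"|[\xE1-\xEC][\x80-\xBF]{2}"
    rb"|\xED[\x80-\x9F][\x80-\xBF]"       # stops before the surrogate block
    rb"|[\xEE-\xEF][\x80-\xBF]{2}"
    rb"|\xF0[\x90-\xBF][\x80-\xBF]{2}"    # UTF8-4, first alternative
    rb"|[\xF1-\xF3][\x80-\xBF]{3}"
    rb"|\xF4[\x80-\x8F][\x80-\xBF]{2}"    # caps the range at U+10FFFF
    rb")"
)


def is_zip_utf8_char(data: bytes) -> bool:
    """Return True if data is exactly one character under the quoted rule."""
    return ZIP_UTF8_CHAR.fullmatch(data) is not None


def accepted_code_points():
    """Yield every code point whose UTF-8 encoding the rule accepts."""
    for cp in range(0x110000):
        # "surrogatepass" lets us encode U+D800..U+DFFF so the rule can
        # be seen rejecting them; other code points encode as usual.
        encoded = chr(cp).encode("utf-8", "surrogatepass")
        if is_zip_utf8_char(encoded):
            yield cp
```

For example, is_zip_utf8_char("é".encode("utf-8")) is true, while is_zip_utf8_char(b"A") is false because the rule has no single-byte alternative. Iterating accepted_code_points() gives U+0080 through U+10FFFF with the surrogate block U+D800 to U+DFFF left out, since the %xED alternative only allows a second byte in %x80-9F.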