Re: [P&C] utf8-char in Zip-rel-path from Marcos Caceres on 2009-07-27 (public-webapps@w3.org from July to September 2009)

From: Marcos Caceres <marcosc@opera.com>
Date: Mon, 27 Jul 2009 18:22:10 +0200
To: Marcin Hanclik <Marcin.Hanclik@access-company.com>
Cc: "public-webapps@w3.org" <public-webapps@w3.org>
Message-ID: <b21a10670907270922k10dcf719u2643e7023f7e549b@mail.gmail.com>

On Sun, Jul 26, 2009 at 11:29 PM, Marcin Hanclik
<Marcin.Hanclik@access-company.com> wrote:
> Hi,
>
> Given the fact that
>
> "Rule names are case insensitive."
> http://tools.ietf.org/html/rfc5234#section-2.1
>
> it could potentially be better to rename the rule from "utf8-char" to something else, since it may get confused with the "UTF8-char" rule from http://tools.ietf.org/html/rfc3629#section-4.
>
> Taking into account my comments in the mail below, we could have a new rule replacing utf8-char:
> zip-UTF8-char   = UTF8-2 / UTF8-3 / UTF8-4
> UTF8-2      = %xC2-DF UTF8-tail
> UTF8-3      = %xE0 %xA0-BF UTF8-tail / %xE1-EC 2( UTF8-tail ) /
>                 %xED %x80-9F UTF8-tail / %xEE-EF 2( UTF8-tail )
> UTF8-4      = %xF0 %x90-BF 2( UTF8-tail ) / %xF1-F3 3( UTF8-tail ) /
>                 %xF4 %x80-8F 2( UTF8-tail )
> UTF8-tail   = %x80-BF
>
> The problem may be with the allowed ranges of the Unicode characters.
> The above grammar seems to allow 0080-10FFFF (the UTF-16 accessible range minus characters < 0080)
> http://tools.ietf.org/html/rfc3629#section-3
> whereas the current utf8-char rule is more selective.
>

Unless it's broken (?), I would prefer to leave it as is.
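
To make the comparison concrete, here is a rough sketch of what the quoted zip-UTF8-char grammar would accept, written as a byte-level check in Python. It is only an illustration of the ABNF above, not part of the P&C spec; the names _RULES and zip_utf8_char_length are made up for this example. Each table row is one alternative of UTF8-2, UTF8-3 or UTF8-4: the lead-byte range, the allowed range for the second byte, and the total length. Any further bytes must be UTF8-tail (%x80-BF).

```python
# Hypothetical helper illustrating the quoted zip-UTF8-char ABNF.
# Each row: (lead_lo, lead_hi, second_lo, second_hi, total_length)
_RULES = [
    (0xC2, 0xDF, 0x80, 0xBF, 2),  # UTF8-2
    (0xE0, 0xE0, 0xA0, 0xBF, 3),  # UTF8-3, first alternative
    (0xE1, 0xEC, 0x80, 0xBF, 3),
    (0xED, 0xED, 0x80, 0x9F, 3),  # stops at 9F, so surrogates are excluded
    (0xEE, 0xEF, 0x80, 0xBF, 3),
    (0xF0, 0xF0, 0x90, 0xBF, 4),  # UTF8-4, first alternative
    (0xF1, 0xF3, 0x80, 0xBF, 4),
    (0xF4, 0xF4, 0x80, 0x8F, 4),  # stops at 8F, so nothing above U+10FFFF
]


def zip_utf8_char_length(data: bytes, pos: int = 0) -> int:
    """Return the byte length (2, 3 or 4) of one zip-UTF8-char starting
    at data[pos], or 0 if the bytes there do not match the grammar."""
    if pos >= len(data):
        return 0
    lead = data[pos]
    for lead_lo, lead_hi, sec_lo, sec_hi, length in _RULES:
        if lead_lo <= lead <= lead_hi:
            chunk = data[pos:pos + length]
            if len(chunk) < length:
                return 0  # truncated sequence
            if not sec_lo <= chunk[1] <= sec_hi:
                return 0  # second byte outside the range for this lead byte
            # Remaining bytes, if any, must be UTF8-tail = %x80-BF
            if all(0x80 <= b <= 0xBF for b in chunk[2:]):
                return length
            return 0
    return 0  # ASCII (< 0x80) and invalid lead bytes are not matched
```

For example, zip_utf8_char_length("é".encode("utf-8")) gives 2, while a plain ASCII byte such as b"A" gives 0, since the proposed rule has no single-byte alternative.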

-- 
Marcos Caceres
http://datadriven.com.au

Received on Monday, 27 July 2009 16:23:10 UTC