RE: [P&C] utf8-char in Zip-rel-path from Marcin Hanclik on 2009-07-28 (public-webapps@w3.org from July to September 2009)

From: Marcin Hanclik <Marcin.Hanclik@access-company.com>
Date: Tue, 28 Jul 2009 12:04:36 +0200
To: "marcosc@opera.com" <marcosc@opera.com>
CC: "public-webapps@w3.org" <public-webapps@w3.org>
Message-ID: <FAA1D89C5BAF1142A74AF116630A9F2C2890AF8567@OBEEX01.obe.access-company.com>

Hi Marcos,

>>Unless it's broken (?), I would prefer to leave it as is.
I think that P&C should either change the grammar (as suggested in my email) or specify that zip-rel-path operates on characters rather than bytes.
The first option would be a natural match for the UTF-8 encoding of Zip filenames, whereas the second would be clearer about which characters can be used in the path.

Anyway, I think some clarification, e.g. one of the above, is required.

Thanks.

Kind regards,
Marcin

Marcin Hanclik
ACCESS Systems Germany GmbH
Tel: +49-208-8290-6452  |  Fax: +49-208-8290-6465
Mobile: +49-163-8290-646
E-Mail: marcin.hanclik@access-company.com

-----Original Message-----
From: marcosscaceres@gmail.com [mailto:marcosscaceres@gmail.com] On Behalf Of Marcos Caceres
Sent: Monday, July 27, 2009 6:22 PM
To: Marcin Hanclik
Cc: public-webapps@w3.org
Subject: Re: [P&C] utf8-char in Zip-rel-path

On Sun, Jul 26, 2009 at 11:29 PM, Marcin Hanclik <Marcin.Hanclik@access-company.com> wrote:
> Hi,
>
> Given the fact that
>
> "Rule names are case insensitive."
> http://tools.ietf.org/html/rfc5234#section-2.1
>
> it could potentially be better to rename the rule from "utf8-char" to something else, since it may get confused with "UTF8-char" rule from http://tools.ietf.org/html/rfc3629#section-4.

>
> Taking into account my comments in the mail below, we could have a new rule replacing utf8-char:
> zip-UTF8-char   = UTF8-2 / UTF8-3 / UTF8-4
> UTF8-2      = %xC2-DF UTF8-tail
> UTF8-3      = %xE0 %xA0-BF UTF8-tail / %xE1-EC 2( UTF8-tail ) /
>                 %xED %x80-9F UTF8-tail / %xEE-EF 2( UTF8-tail )
> UTF8-4      = %xF0 %x90-BF 2( UTF8-tail ) / %xF1-F3 3( UTF8-tail ) /
>                 %xF4 %x80-8F 2( UTF8-tail )
> UTF8-tail   = %x80-BF
>
> The problem may be with the allowed ranges of the Unicode characters.
> The above grammar seems to allow 0080-10FFFF (the UTF-16 accessible range minus characters < 0080)
> http://tools.ietf.org/html/rfc3629#section-3
> whereas the current utf8-char rule is more selective.
>
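
As a rough illustration of what the proposed byte-level grammar accepts, the rules quoted above can be transcribed into a Python regular expression over bytes. This is only a sketch: the names ZIP_UTF8_CHAR and is_zip_utf8_char are hypothetical and do not come from P&C or RFC 3629. It suggests that the rule accepts the UTF-8 encodings of U+0080 through U+10FFFF, except the surrogate code points U+D800 to U+DFFF, and rejects ASCII and overlong forms.

```python
import re

# Byte-level transcription of the quoted ABNF (hypothetical helper names).
TAIL = rb"[\x80-\xBF]"                      # UTF8-tail
UTF8_2 = rb"[\xC2-\xDF]" + TAIL             # UTF8-2
UTF8_3 = (                                  # UTF8-3
    rb"\xE0[\xA0-\xBF]" + TAIL
    + rb"|[\xE1-\xEC]" + TAIL + rb"{2}"
    + rb"|\xED[\x80-\x9F]" + TAIL
    + rb"|[\xEE-\xEF]" + TAIL + rb"{2}"
)
UTF8_4 = (                                  # UTF8-4
    rb"\xF0[\x90-\xBF]" + TAIL + rb"{2}"
    + rb"|[\xF1-\xF3]" + TAIL + rb"{3}"
    + rb"|\xF4[\x80-\x8F]" + TAIL + rb"{2}"
)
# zip-UTF8-char = UTF8-2 / UTF8-3 / UTF8-4
ZIP_UTF8_CHAR = re.compile(rb"(?:" + UTF8_2 + rb"|" + UTF8_3 + rb"|" + UTF8_4 + rb")")


def is_zip_utf8_char(data: bytes) -> bool:
    """Return True if data is exactly one byte sequence matching zip-UTF8-char."""
    return ZIP_UTF8_CHAR.fullmatch(data) is not None


if __name__ == "__main__":
    samples = {
        "U+00E9": "\u00e9".encode("utf-8"),
        "ASCII 'a'": b"a",
        "surrogate U+D800": b"\xED\xA0\x80",
        "overlong NUL": b"\xC0\x80",
        "U+10FFFF": "\U0010FFFF".encode("utf-8"),
    }
    for label, raw in samples.items():
        print(label, is_zip_utf8_char(raw))
```

A character-level definition of zip-rel-path would instead be checked after decoding the filename bytes, which is the distinction the thread is asking P&C to make explicit.
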

Unless it's broken (?), I would prefer to leave it as is.

--
Marcos Caceres
http://datadriven.com.au



Received on Tuesday, 28 July 2009 10:05:45 UTC