
Re: Normalization, was: RE: [Widget URI] Internationalization, widget IRI?

From: Marcos Caceres <marcosc@opera.com>
Date: Wed, 16 Sep 2009 16:17:09 +0200
Message-ID: <b21a10670909160717me2396feoc9d90a9ac67a3287@mail.gmail.com>
To: Marcin Hanclik <Marcin.Hanclik@access-company.com>
Cc: Robin Berjon <robin@berjon.com>, public-webapps WG <public-webapps@w3.org>
Let's try this another way. Can you make me a widget that explicitly
demos the problem? I will then run it against our implementation and
see what happens.

I will also add the widget to the test suite, to make sure we expose
any potential misunderstandings in the spec wrt URIs.

Kind regards,

On Wed, Sep 16, 2009 at 4:08 PM, Marcos Caceres <marcosc@opera.com> wrote:
> On Wed, Sep 16, 2009 at 12:32 PM, Marcin Hanclik
> <Marcin.Hanclik@access-company.com> wrote:
>> Hi Marcos,
>>>>So it turns out that %-encoded really just means "replace this '%xx'
>>>>with UTF-8 bytes".
>> Yes.
>>>>So we don't need to do anything.
>> P&C shall state the actual algorithm and equivalence.
>> http://www.w3.org/TR/2009/WD-widgets-apis-20090423/
>> had this issue:
>> "ISSUE: do we need to do some kind of URI normalization to check for equivalency?"
>> According to RFC3987, 5.1:
>> "  Applications using IRIs as identity tokens with no relationship to a
>>   protocol MUST use the Simple String Comparison (see section 5.3.1).
>>   All other applications MUST select one of the comparison practices
>>   from the Comparison Ladder (see section 5.3 or, after IRI-to-URI
>>   conversion, select one of the comparison practices from the URI
>>   comparison ladder in [RFC3986], section 6.2)"
>> @href may fall into Comparison Ladder case, id into namespaces.
>> The question (still the same) is whether in case of @name of <feature> the IRIs are used as identity tokens (id, simple string) or anything else/new.
> They are namespaces. I actually raised this issue a long time ago too
> because I had the same concerns as you. The WG decided that strings
> that name things (@id, @name) are treated as namespaces.
>> Once the answer is that IRIs are to be treated as identity tokens (as you propose and I agree), then we still have the issue of expressing the non-ASCII IRIs in ASCII documents (border case). Then we would need a guideline / example that in XML the author shall use character entities to encode the IRI (I marked this solution awkward, but I could live with it).
> I think Addison already said this was not a problem: if you know the
> encoding of the XML document, you know the encoding of the URI. URIs
> are always treated as UTF-8 internally. There is no problem here.
> --
> Marcos Caceres
> http://datadriven.com.au
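
The two comparison practices discussed in the quoted thread can be illustrated with a minimal sketch. It is not taken from the P&C spec or from any implementation; the function names and the example.org IRIs are hypothetical, and it assumes that %-escapes always stand for UTF-8 bytes, as stated above. The first function is the Simple String Comparison of RFC 3987, section 5.3.1; the second is only a rough stand-in for one rung higher on the comparison ladder.

```python
from urllib.parse import unquote


def simple_string_equal(a: str, b: str) -> bool:
    """Simple String Comparison: character-by-character, no normalization."""
    return a == b


def percent_decoded_equal(a: str, b: str) -> bool:
    """Compare after replacing each %xx escape with the UTF-8 bytes it encodes.

    Assumption for illustration only: decoding every escape, including
    reserved characters, is acceptable. Real normalization is more careful.
    """
    return (unquote(a, encoding="utf-8", errors="strict")
            == unquote(b, encoding="utf-8", errors="strict"))


# Hypothetical feature names: the same IRI written with a literal
# non-ASCII character and with its %-encoded UTF-8 bytes.
plain = "http://example.org/caf\u00e9"
encoded = "http://example.org/caf%C3%A9"

print(simple_string_equal(plain, encoded))    # False
print(percent_decoded_equal(plain, encoded))  # True
```

If @id and @name are identity tokens, only the first function applies, so the two spellings above name different things, and an author would have to write the IRI the same way everywhere.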

Marcos Caceres
Received on Wednesday, 16 September 2009 14:18:11 UTC
