- From: Marcos Caceres <marcosc@opera.com>
- Date: Wed, 16 Sep 2009 16:08:27 +0200
- To: Marcin Hanclik <Marcin.Hanclik@access-company.com>
- Cc: Robin Berjon <robin@berjon.com>, public-webapps WG <public-webapps@w3.org>
On Wed, Sep 16, 2009 at 12:32 PM, Marcin Hanclik <Marcin.Hanclik@access-company.com> wrote:
> Hi Marcos,
>
>>>So it turns out that %-encoded really just means "replace this '%xx'
>>>with UTF-8 bytes".
> Yes.
>
>>>So we don't need to do anything.
> P&C shall state the actual algorithm and equivalence.
>
> http://www.w3.org/TR/2009/WD-widgets-apis-20090423/
> had this issue:
> "ISSUE: do we need to do some kind of URI normalization to check for equivalency?"
>
> According to RFC3987, 5.1:
> " Applications using IRIs as identity tokens with no relationship to a
> protocol MUST use the Simple String Comparison (see section 5.3.1).
> All other applications MUST select one of the comparison practices
> from the Comparison Ladder (see section 5.3 or, after IRI-to-URI
> conversion, select one of the comparison practices from the URI
> comparison ladder in [RFC3986], section 6.2)"
>
> @href may fall into Comparison Ladder case, id into namespaces.
> The question (still the same) is whether in case of @name of <feature> the IRIs are used as identity tokens (id, simple string) or anything else/new.
>

They are namespaces. I actually raised this issue a long time ago too
because I had the same concerns as you. The WG decided that strings
that name things (@id, @name) are treated as namespaces.

> Once the answer is that IRIs are to be treated as identity tokens (as you propose and I agree), then we still have the issue of expressing the non-ASCII IRIs in ASCII documents (border case). Then we would need a guideline / example that in XML the author shall use character entities to encode the IRI (I marked this solution awkward, but I could live with it).
>

I think Addison already said this was not a problem: if you know the
encoding of the XML document, you know the encoding of the URI. URIs
are always treated as UTF-8 internally. There is no problem here.

--
Marcos Caceres
http://datadriven.com.au
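
A minimal sketch of the comparison question discussed in this message, offered as an illustration and not as the algorithm P&C specifies. It assumes Python's standard library, a made-up feature IRI under example.org, and hypothetical helper names (simple_string_equal, percent_decoded). It shows that Simple String Comparison treats a %-encoded IRI and its non-ASCII form as different strings, that %-decoding as UTF-8 yields the same characters, and that an XML character reference in an ASCII document is resolved by the parser before any comparison happens.

```python
from urllib.parse import unquote
import xml.etree.ElementTree as ET


def simple_string_equal(a: str, b: str) -> bool:
    """Character-by-character comparison with no normalization
    (the idea behind RFC 3987 section 5.3.1)."""
    return a == b


def percent_decoded(iri: str) -> str:
    """Replace each %xx escape with the UTF-8 bytes it denotes, then decode."""
    return unquote(iri, encoding="utf-8", errors="strict")


# Hypothetical feature name, written three ways.
plain = "http://example.org/caf\u00e9"    # non-ASCII character used directly
escaped = "http://example.org/caf%C3%A9"  # same character, %-encoded as UTF-8

# ASCII-only XML document: the non-ASCII character is a character reference.
ascii_xml = '<feature name="http://example.org/caf&#xE9;"/>'
from_xml = ET.fromstring(ascii_xml).get("name")

# Simple String Comparison does no decoding, so these two are distinct tokens.
print(simple_string_equal(plain, escaped))

# After %-decoding as UTF-8 the characters are the same.
print(simple_string_equal(plain, percent_decoded(escaped)))

# The XML parser has already resolved the character reference.
print(simple_string_equal(plain, from_xml))
```

Under Simple String Comparison an author who writes the %-encoded form and one who writes the literal character are naming two different features, which is why the choice of comparison rule has to be stated explicitly.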
Received on Wednesday, 16 September 2009 14:09:23 UTC