
RE: charmod-uri

From: Jeremy Carroll <jjc@hplb.hpl.hp.com>
Date: Tue, 16 Apr 2002 15:33:53 +0100
To: "Martin Duerst" <duerst@w3.org>, <w3c-rdfcore-wg@w3.org>
Cc: <w3c-i18n-ig@w3.org>
Message-ID: <JAEBJCLMIFLKLOJGMELDIEKLCDAA.jjc@hplb.hpl.hp.com>

For RDF Core, a significant part of Martin's comments is the last paragraph.
In RDF terms I read it as advocating that the labels of the nodes in the RDF
graph are US-ASCII URIs, not IRIs (although implementations should maintain
the original character sequence).

So far I've heard:
- Jeremy, maybe Dan, maybe Larry, in favour of using "IRIs" (at least
original character sequences) as the labels on the RDF graph
- Martin, and to some extent Aaron, in favour of using US-ASCII URIs as the
labels on the RDF graph.

This is characterised by test 003 in


Denying test003 is agreeing that labels are US ASCII URIs.

Equivalently that is that




are completely equivalent and interchangeable. The only observable
difference in behaviour that may be expected is the exact form used by a
viewer or output device.

I am happy to go either way, although I find it difficult to see how to
state the normal form constraint that I think is important.

Consider the US ASCII URI


where %CC%81 is the UTF-8 encoding of character #x301, the combining acute
accent.

This is, like any US ASCII URI, in Normal Form C, despite the UTF-8 original
character sequence not being in NFC. I don't think I can support or propose
that we prohibit those US ASCII URIs that when viewed as a UTF-8 encoded
original character sequence correspond to original character sequences that
are not in NFC. I think a "NOTE: " somewhere about this would be close to
incomprehensible. Going down this path, in my view, limits us to saying
non-normative things about RDF platforms not using OCSs (original character
sequences) which are not in NFC,
but using the corresponding US ASCII URI instead. I could write a
non-normative appendix to our syntax spec that said all this. We could
modify some of the fraud examples to show legal test cases that used the
US-ASCII form of the problematic IRIs.
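
To make the %CC%81 point concrete, here is a minimal Python sketch. The URI
in it is a hypothetical stand-in (the example.org host and path are
assumptions, not the URI from this message). It only illustrates that a
%-escaped US-ASCII string is trivially in NFC, while the character sequence
it decodes to need not be.

```python
import unicodedata
from urllib.parse import unquote

def is_nfc(text: str) -> bool:
    """True if text is already in Unicode Normal Form C."""
    return unicodedata.normalize("NFC", text) == text

# Hypothetical US-ASCII URI: "e" followed by the escaped UTF-8 bytes of U+0301.
ascii_uri = "http://example.org/cafe%CC%81"

# Undo the %-escapes, reading the bytes as UTF-8, to recover the
# original character sequence.
ocs = unquote(ascii_uri, encoding="utf-8")

print(is_nfc(ascii_uri))  # the escaped form contains only ASCII characters
print(is_nfc(ocs))        # the decoded form ends in "e" + U+0301
```

Any check that looks only at the US-ASCII form cannot see the problem; it
shows up only once the escapes are interpreted as UTF-8.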


> >[[[
> >2.3 Mapping of IRIs to URIs
> >
> >This section defines how to map an IRI to a URI. Everything in
> >this section applies also to IRI references and URI references, as
> >well as components thereof (e.g. fragment identifiers).
> >
> >This mapping has two purposes:
> >
> >   a) Syntactical: Many URI schemes and components define additional
> >      syntactical restrictions not captured in Section 2.2. Such
> >      restrictions can be applied to IRIs by noting that IRIs are only
> >      valid if they map to syntactically valid URIs. This means that
> >      such syntactical restrictions do not have to be defined again
> >      on the IRI level.
> >
> >   b) Interpretational: URIs identify resources in various ways. IRIs
> >      also identify resources. The resource that an IRI identifies is
> >      the same as the one identified by the URI obtained after
> >      converting the IRI according to the procedure defined here.
> >      This means that there is no need to define the association
> >      between identifier and resource again on the IRI level.
> >]]]
> >
> >This seems to suggest that we should do the mapping before the model theory;
> >which is in tension with the usual refusal to normalize URIs for scheme
> >case, hostname case, port number, missing default path, or anything else,
> >except as part of actually executing the protocol.

> Clearing up escape issues is one step before casing issues.
> Most escape issues (for a-zA-Z0-9, everything outside US-ASCII,
> plus a few specials) are completely independent of the scheme,
> they apply to all URIs. Case and the other stuff is very much
> scheme-dependent. This is a big difference.

> >It is potentially self-inconsistent with the phrase:
> >
> >[[[
> >However, this mapping SHOULD only be applied when necessary, as late
> >as possible.
> >]]]

> No, it is not. For RDF, it would just mean that when you compare,
> you may want to apply it, but you wouldn't convert and stay there;
> you would keep the original.
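
One possible reading of "apply it when you compare, but keep the original"
can be sketched in Python as below. This is an illustration under stated
assumptions, not the mapping as specified in the draft: the helper names
iri_to_uri and same_label are hypothetical, the example.org labels are made
up, and only characters outside US-ASCII are escaped (existing %-escapes and
all scheme-specific issues are left untouched).

```python
from urllib.parse import quote

def iri_to_uri(iri: str) -> str:
    """Escape each non-US-ASCII character as the %HH form of its UTF-8 bytes."""
    return "".join(ch if ord(ch) < 128 else quote(ch, safe="") for ch in iri)

def same_label(a: str, b: str) -> bool:
    """Compare two labels after mapping; the stored originals are not changed."""
    return iri_to_uri(a) == iri_to_uri(b)

# Hypothetical labels, each kept in its original form.
original = "http://example.org/caf\u00e9"      # precomposed e-acute
escaped = "http://example.org/caf%C3%A9"       # already a US-ASCII URI
decomposed = "http://example.org/cafe\u0301"   # "e" + combining acute accent

print(same_label(original, escaped))
print(same_label(original, decomposed))
```

Under this sketch the first pair compares equal and the second does not,
because the mapping escapes characters but performs no normalization.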
Received on Tuesday, 16 April 2002 10:35:12 UTC
