Re: RDF IDs, N-Triples URI encoding (was Re: URI terminology demystified)

Dave Beckett wrote:
> 
> >>>Jeremy Carroll said:
> 
> <snip/>
> > I also note that this is consistent with our test case:
> >
> > http://www.w3.org/2000/10/rdf-tests/rdfcore/rdfms-difference-between-ID-and-about/test2.nt
> >
> > http://www.w3.org/2000/10/rdf-tests/rdfcore/rdfms-difference-between-ID-and-about/test2.rdf
> >
> > which has not been approved, seems to suggest the following
> >
> > 1: ID's are subject to the same URI encoding rule.
> 
> Note that at present, RDF ID attributes are not XML IDs.

I don't understand how that's relevant. We decided
that
	rdf:ID="foo"
is short for
	rdf:about="#foo"
and hence rdf:ID *is* subject to the same rules for
turning Unicode strings into URI references.

>  RDF M&S
> uses a production from XML's original syntax, but does not say that
> the IDs have the same meaning as XML.  There are also some words I
> think about ID creating a resource in M&S - which I think has been
> discussed previously (can't find them just now).
> 
> I noted this as an issue near
>   http://www.w3.org/TR/2001/WD-rdf-syntax-grammar-20010906/#rdf-id
> 
> > 2: N-triple URIs are in US-ASCII and must be already encoded.
> 
> The encoding of URIs in N-Triples must change to match the Charmod
> requirements,

No... to do that would be to propagate a misunderstanding.

n-triples has the distinctive feature that URI references go right
in the file without any sort of quoting or other mangling,
except putting <>'s around them. (relative URI references do get
absolutized.) Recall: URI references consist of US-ASCII characters
only. No URI reference has an umlaut in it.

string literals in n-triples need quoting, but URI references
do not.

So n-triples shows the *result* of taking a unicode string
from an rdf:resource attribute and converting it to a URI reference.



> see the discussion on www-rdf-comments starting at
>   http://lists.w3.org/Archives/Public/www-rdf-comments/2001JulSep/0245.html
> and subsequent responses by me.

I see:

|Looking; CHARMOD says, for URIs:
|
|  A W3C specification that defines new syntax for URIs, such as a new
|  kind of fragment identifier, MUST specify that characters outside
|  the US-ASCII repertoire are encoded in URIs using UTF-8 and
|  %HH-escaping
|  -- http://www.w3.org/TR/charmod/#sec-URIs
	-- DaveB, Tue, 18 Sep 2001 10:38:59 +0100

That text from the CHARMOD spec is perhaps misleading: no W3C
specification can define new syntax for URIs. That's what
RFC2396 is for. W3C specs can be defined that *use* URIs.
n-triples is such a format. Luckily, as I say, URIs in
n-triples don't need any form of quoting.


-- 
Dan Connolly, W3C http://www.w3.org/People/Connolly/

Received on Thursday, 20 September 2001 10:00:09 UTC