Re: character encoding in RDF (including some new related issues)

From: Dave Beckett <dave.beckett@bristol.ac.uk>
Subject: Re: character encoding in RDF
Date: Thu, 6 Nov 2003 10:42:33 +0000

> It has been suggested off list that you might be satisfied with the editorial
> changes suggested by Jeremy Carroll in
>   http://lists.w3.org/Archives/Public/w3c-rdfcore-wg/2003Nov/0006.html

I view these changes as a variation of the changes I suggested in my
initial message on this topic.  These changes do indeed capture the intent
of the situation, as opposed to the wording in the current document.

> If this is the case, and these provide an acceptable disposition of your
> comment, can you reply in the affirmative copying your message to
> www-rdf-comments@w3.org

These changes would indeed provide an acceptable disposition, provided that
they are made in all the appropriate places.  I identified Sections 6.1.6,
6.1.7, 6.1.8, and 6.1.9 in my initial message; Jeremy proposes only three
changes, omitting the one for blank node identifiers.  This difference
indicates that there should be another effort to identify all the places
where this sort of change needs to be made.

Upon further analysis, I note that the URI and string-value for attribute
events as well as the URI for element events can be placed directly in a
triple (as in Section 7.2.11) and so need a similar treatment.  Any grammar
action that has a <...> in it probably suffers from this problem.

However, the string-value of attribute events is used in the sections
above, so just making a variation of Jeremy's proposed change is
insufficient, as it would end up specifying double escaping.  My proposed
change would be somewhat better at avoiding double escaping, but it still
could be read as requiring double escaping.
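
As a rough illustration of what double escaping means here, the Python
sketch below applies a simplified N-Triples-style string escape twice.
The function name ntriples_escape and the sample string are hypothetical,
and the escape rules are a simplification assumed for illustration, not
the wording of any of the sections discussed.

```python
def ntriples_escape(s: str) -> str:
    """Simplified N-Triples-style escape (an assumption for illustration)."""
    out = []
    for ch in s:
        cp = ord(ch)
        if ch == "\\":
            out.append("\\\\")          # backslash becomes two backslashes
        elif ch == '"':
            out.append('\\"')
        elif ch == "\n":
            out.append("\\n")
        elif ch == "\r":
            out.append("\\r")
        elif ch == "\t":
            out.append("\\t")
        elif 0x20 <= cp <= 0x7E:
            out.append(ch)              # printable ASCII passes through
        elif cp <= 0xFFFF:
            out.append("\\u%04X" % cp)  # 4-digit escape for the BMP
        else:
            out.append("\\U%08X" % cp)  # 8-digit escape beyond the BMP
    return "".join(out)


value = "caf\u00e9"                     # hypothetical attribute string-value
once = ntriples_escape(value)           # the intended, single escaping
twice = ntriples_escape(once)           # escaping already-escaped text
print(once)
print(twice)
```

The second pass escapes the backslash introduced by the first, so the
doubly escaped string no longer denotes the original value.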

Also, I believe that the treatment in the second actions of Section 7.2.11
and Section 7.2.21 is insufficient, as they neither check that the type
URI is in the form required of a URI in an RDF Graph nor do any escaping.
I expect that using a URI Reference Event as an intermediary would solve
all of these problems as well as part of the problem above.

Further, the wording in 7.2.32 is rather suspect.  What does it mean for a
string to represent an RDF URI reference?  

I also worry about the details of escaping in URI references in RDF/XML.
My understanding is that URI references are supposed to be in escaped form,
and that downstream applications are not supposed to perform escaping,
except perhaps for the escaping for non-ASCII Unicode in IRIs.  I think
that RDF/XML takes a different and inconsistent stance on this, sometimes
allowing the escaping of certain ASCII characters when they appear in
RDF/XML.

To illustrate this point,

	http://www.w3.org/foo{bar}

is not a legal URI (or IRI).  However, it is a legal RDF URI reference,
because it is a Unicode string that turns into a legal absolute URI with
optional fragment identifier when subject to the encoding in Section 6.4 of
RDF Concepts.
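
The following Python sketch shows one reading of that encoding: encode the
string as UTF-8, then %-escape every octet that is non-ASCII or is an
excluded character.  The function name escape_rdf_uri_ref is hypothetical,
and the excluded set (control characters, space, the delimiters < > ", and
the "unwise" characters { } | \ ^ `, with # and % and the square brackets
left alone) is an assumption based on RFC 2396 Section 2.4.3 as amended by
RFC 2732, not text taken from RDF Concepts.

```python
# Octets assumed to need %-escaping (illustrative reading, see above).
EXCLUDED = set(b'<>"{}|\\^`') | {0x20, 0x7F} | set(range(0x20))


def escape_rdf_uri_ref(ref: str) -> str:
    """Sketch: UTF-8 encode, then %-escape disallowed octets."""
    out = []
    for octet in ref.encode("utf-8"):
        if octet >= 0x80 or octet in EXCLUDED:
            out.append("%{:02X}".format(octet))
        else:
            out.append(chr(octet))
    return "".join(out)


escaped = escape_rdf_uri_ref("http://www.w3.org/foo{bar}")
print(escaped)  # http://www.w3.org/foo%7Bbar%7D
```

Under these assumptions the braces are the only characters changed, and
the result has the form of an absolute URI.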

I note that various ``3.3 URI References'' pointers are to another document
and thus should probably be in a different form.  Besides which, the
relevant section (in RDF Tests) is mostly a pointer to another place, which
should probably be referred to directly.

> Thanks
> 
> Dave

I await a revised, fully-worked-out proposal for the actual changes.

Peter F. Patel-Schneider
Bell Labs Research

Received on Thursday, 6 November 2003 08:59:16 UTC