- From: Jonathan Rees <jar@creativecommons.org>
- Date: Wed, 27 Oct 2010 12:49:30 -0400
- To: www-tag@w3.org
ACTION-487 Assessing the impact of proposed changes to IRIs on RDF and OWL.

Executive summary: The RDF specs are protected against changes to IRIs because they refer to "[IRI draft] or its successors". The OWL specs and RDFa are not protected since they refer normatively to RFC 3987. I can't say whether any applications will be affected.

(To non-TAG readers of the list, this action refers to TAG discussion on IRIs last week, in which Larry outlined changes that are being considered. You can consult the TAG F2F minutes when they come out, and Larry will keep us posted on developments and drafts. This email is completely independent of the details of the changes.)

------------------------------------------------------------

Details

RDF has an abstract syntax ('graphs') and a variety of serializations, including RDF/XML, Turtle, and RDFa.

The RDF abstract syntax treatment of IRIs is here:
http://www.w3.org/TR/rdf-concepts/#section-Graph-URIref (2004)

Nodes in a graph can be "RDF URI references".

"This section anticipates an RFC on Internationalized Resource Identifiers. Implementations may issue warnings concerning the use of RDF URI References that do not conform with [IRI draft] or its successors."

This is sort of OK, since "its successors" would include the future IRI specification.

However, contradicting this somewhat, it says that a Unicode string is an RDF URI Reference iff it has no control characters AND the ASCII string obtained by converting to UTF-8 and then percent-encoding is a valid URI. (I'm glossing; see the spec for details.)

There is a reference to 'XML Schema Part 2: Datatypes' (May 2001), which in turn defers to 'XML Linking Language', which repeats the definition of validity based on what would happen according to the UTF-8 / %-encoding process. There is also a reference to 'Namespaces in XML 1.1', which says pretty much the same thing.
As the references to Schema and Namespaces are from 'notes' one might think they are non-normative, but they're in a normative section of the document and the references are listed as normative.

Comparison of 'RDF URI References' is by string comparison, not URI equivalence. There is no specified conversion of RDF URI References to URIs, because none is needed. So all that matters (from the POV of the specs) is which strings are valid IRIs, not what happens when they get converted to URIs. Thus the only RDF-related failure induced by a change to IRIs would be in causing a formerly valid RDF graph to be invalid or vice versa. Of course there are applications that convert IRIs to URIs, and they would be affected, but this would have nothing to do with their RDF conformance.

'RDF Semantics' http://www.w3.org/TR/2004/REC-rdf-mt-20040210/#urisandlit talks about 'URI references' without saying what they are. There is a normative reference to RFC 2396 for other reasons, but I find it hard to imagine that any use of this recommendation would be affected by changes to the syntax of URIs (or IRIs), as it's not the job of the document to specify the syntax of anything.

RDF/XML just refers to 'RDF URI References' from the Concepts document.

Turtle ( http://www.w3.org/TeamSubmission/turtle/ ) is vague on the subject. It has a normative reference to RFC 3987 but this is not part of defining what its 'URI references' mean - indeed it doesn't say what 'URI references' are, syntactically. I would guess that any reasonable person would go to RDF Concepts to get the definition, although taking them to be RFC 3986 URI references would also be forgivable.

Similarly, 'RDFa in XHTML' does not define 'URI reference' and the reasonable assumption would be that these are inherited from RDF Concepts or RFC 3986. Unfortunately RDFa has a definition of CURIEs that normatively references RFC 3987.

SPARQL IRIs are defined by normative reference to RDF Concepts.
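
As an illustration of the RDF Concepts treatment discussed above (validity judged after UTF-8 / %-encoding, comparison by plain string equality), here is a minimal Python sketch. It is not taken from any spec or implementation: the function names and the example.org strings are hypothetical, and the encoder is deliberately simplified to escape only non-ASCII octets, whereas the spec also escapes certain excluded ASCII characters.

```python
def has_control_chars(s: str) -> bool:
    """True if s contains a control character (#x00-#x1F or #x7F-#x9F)."""
    return any(ord(c) <= 0x1F or 0x7F <= ord(c) <= 0x9F for c in s)


def percent_encode_non_ascii(s: str) -> str:
    """UTF-8 encode s, then %-escape every octet outside US-ASCII.

    Simplification (assumption of this sketch): excluded ASCII characters
    such as space, '<' and '>' are left untouched here.
    """
    return "".join(
        chr(b) if b < 0x80 else "%{:02X}".format(b)
        for b in s.encode("utf-8")
    )


# Two hypothetical node labels: one with a raw non-ASCII character,
# one with the same character already percent-encoded.
a = "http://example.org/caf\u00e9"
b = "http://example.org/caf%C3%A9"

# Both yield the same ASCII string after the encoding step...
same_after_encoding = percent_encode_non_ascii(a) == percent_encode_non_ascii(b)

# ...but as RDF URI References they are compared as strings, so they differ.
same_as_rdf_nodes = a == b
```

In this sketch `same_after_encoding` is true while `same_as_rdf_nodes` is false, which is the point made above: for RDF conformance only the set of valid strings matters, not what they become when converted to URIs.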

OWL 2 cites RFC 3987 normatively in defining what an IRI is. See
http://www.w3.org/TR/2009/REC-owl2-syntax-20091027/#IRIs
That will obviously be a problem if RFC 3987 gets replaced.

OWL speaks very abstractly of accessing ontology documents:
http://www.w3.org/TR/owl2-syntax/#Ontology_Documents
"Each ontology document can be accessed via an IRI by means of an appropriate protocol." This leaves the means of access up to each application involved in implementing OWL. The 'appropriate protocol' might not even involve URIs at all.

The OWL documents refer to 'XML Base' (2008) http://www.w3.org/TR/xmlbase/ a few times, which references LEIRI, not RFC 3987, but not in a way that would cause LEIRI to apply to OWL.

------------------

So what does this all mean?

1. There are the obvious annoyances around normative references to documents that are really part of a time series. Henry has figured out one way this might be addressed, and that technique should be applied to the OWL and RDF recommendations at the next opportunity. (Someone will provide the reference to Henry's policy I'm sure...)

2. An "old" IRI might be rejected or misinterpreted by an upgraded application.

3. A "new" IRI might be rejected or misinterpreted by an "old" application.

2 and 3 seem highly unlikely, but the question is difficult to answer. One reason for this is that the new draft hasn't been written, so we don't know exactly what the changes will be. (I'm confident that the authors of the new IRI draft will explain to us exactly what they are, when the time comes.) The other is that as a member of the ASCII-speaking world, I (like most members of this list) would not be aware of how non-ASCII IRIs are being deployed in RDF and OWL.

Jonathan
Received on Wednesday, 27 October 2010 16:50:08 UTC