- From: Graham Klyne <GK@ninebynine.org>
- Date: Thu, 19 Jul 2012 11:57:01 +0100
- To: public-rdf-comments@w3.org
With reference to http://www.w3.org/TR/2012/WD-rdf11-concepts-20120605/#section-IRIs

And in particular to the note:

[[
Previous versions of RDF used the term RDF URI Reference instead of IRI and allowed additional characters: <, >, {, }, |, \, ^, `, (double quote), and (space). In IRIs, these characters must be percent-encoded as described in section 2.1 of [URI].
]]

I have a concern that this change may lead to incompatibility with deployed software, and consequent failure of interoperability.

Currently, the W3C RDF validator, Python rdflib and Jena libraries all allow and/or generate RDF with URIs that contain unescaped spaces (and presumably other characters). This note suggests that spaces (and other characters) must be %-escaped before being serialized into RDF 1.1, where current practice, as far as I can discern, is to assume that RDF carries character-encoded URIs that are not %-escaped.

For the following, I assume that RDF 1.1 intends to say that spaces MUST be %-escaped in URIs used as RDF node identifiers.

Suppose I implement a service that accepts RDF from its clients. What is it to do about URIs containing unescaped spaces?

If it rejects them as ill-formed, then it fails compatibility with existing clients that provide RDF 1.0 compatible data.

If it applies %-escaping to non-URI-valid characters, this will result in double-escaping of RDF data from RDF 1.1 clients, something that RFC3986 says must be guarded against (http://tools.ietf.org/html/rfc3986#section-2.4). It may also fail to recognize as equal those URIs that should be equal when one is provided by an RDF 1.0 client and the other by an RDF 1.1 client.

****

And a nit: the [IRI] reference in this document actually links to the URI spec.

#g
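
The double-escaping concern can be illustrated with a minimal sketch in Python. The function names and example URIs below are hypothetical and are not taken from the RDF validator, rdflib or Jena; the sketch only assumes the standard library's `urllib.parse.quote`.

```python
from urllib.parse import quote

# Reserved characters from RFC 3986 section 2.2, left unescaped by both helpers.
RESERVED = ":/?#[]@!$&'()*+,;="

def escape_always(uri: str) -> str:
    """Hypothetical service policy: %-escape everything that is not a
    reserved or unreserved URI character, including '%' itself."""
    return quote(uri, safe=RESERVED)

def escape_guarded(uri: str) -> str:
    """Hypothetical alternative: also leave '%' alone, so escapes that are
    already present are not escaped a second time."""
    return quote(uri, safe=RESERVED + "%")

# The same resource as an RDF 1.0 style client and an RDF 1.1 style client
# might send it (assumed example URIs).
from_rdf10_client = "http://example.org/my file"
from_rdf11_client = "http://example.org/my%20file"

print(escape_always(from_rdf10_client))   # http://example.org/my%20file
print(escape_always(from_rdf11_client))   # http://example.org/my%2520file
print(escape_guarded(from_rdf10_client))  # http://example.org/my%20file
print(escape_guarded(from_rdf11_client))  # http://example.org/my%20file
```

With the first policy the two clients' URIs no longer compare equal. The second policy avoids that, but only by assuming every "%" in the input already begins an escape, which the service cannot actually know for data from an RDF 1.0 client.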
Received on Thursday, 19 July 2012 10:59:47 UTC