- From: David Booth <david@dbooth.org>
- Date: Thu, 19 Jul 2012 09:52:30 -0400
- To: Graham Klyne <GK@ninebynine.org>
- Cc: public-rdf-comments@w3.org
Graham,

I'm confused about this, because to my knowledge, a URI has *never* been
allowed to contain an unescaped space. Unless I'm misreading the RFC 3986
grammar (http://www.ietf.org/rfc/rfc3986.txt) and the grammar of the older
URI spec, RFC 2396 (http://www.ietf.org/rfc/rfc2396.txt), spaces in a URI
*must* be percent-encoded. In fact, I am dismayed to see that some recent
browsers are now incorrectly displaying URIs as containing spaces, instead
of %20's, thus misleading people into thinking that URIs can contain spaces.

Can you clarify?

David

On Thu, 2012-07-19 at 11:57 +0100, Graham Klyne wrote:
> With reference to http://www.w3.org/TR/2012/WD-rdf11-concepts-20120605/#section-IRIs
>
> And in particular to the note:
> [[
> Previous versions of RDF used the term “RDF URI Reference” instead of “IRI” and
> allowed additional characters: “<”, “>”, “{”, “}”, “|”, “\”, “^”, “`”, ‘“’
> (double quote), and “ ” (space). In IRIs, these characters must be
> percent-encoded as described in section 2.1 of [URI].
> ]]
>
> I have a concern that this change may lead to incompatibility with deployed
> software, and consequent failure of interoperability.
>
> Currently, the W3C RDF validator, Python rdflib and Jena libraries all allow
> and/or generate RDF with URIs that contain unescaped spaces (and presumably
> other characters).
>
> This note suggests that spaces (and other characters) must be %-escaped before
> being serialized into RDF 1.1, where current practice, as far as I can discern,
> is to assume that RDF carries character-encoded URIs that are not %-escaped.
>
> For the following, I assume that RDF 1.1 is intending to say that spaces MUST be
> %-escaped in URIs used as RDF node identifiers.
>
> Suppose I implement a service that accepts RDF from its clients. What is it to
> do about URIs containing unescaped spaces?
>
> If it rejects them as ill-formed, then it fails compatibility with existing
> clients that provide RDF 1.0 compatible data.
>
> If it applies %-escaping to non-URI-valid characters, this will result in
> double-escaping of RDF data from RDF 1.1 clients, something that RFC3986 says
> must be guarded against (http://tools.ietf.org/html/rfc3986#section-2.4), and
> may fail to recognize as equal URIs that should be equal provided by RDF 1.0 and
> RDF 1.1 clients.
>
> ****
>
> And a nit: the [IRI] reference in this document actually links to the URI spec.
>
> #g

--
David Booth, Ph.D.
http://dbooth.org/

Opinions expressed herein are those of the author and do not necessarily
reflect those of his employer.
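
The double-escaping concern in the quoted message can be illustrated with a minimal Python sketch. The names `naive_escape` and `escape_disallowed` and the example.org URI are hypothetical, chosen only for illustration; the set of characters is the one listed in the RDF 1.1 Concepts note quoted above. The sketch assumes a service that escapes only those characters and leaves any existing "%" untouched, so that already-escaped input is not escaped a second time.

```python
from urllib.parse import quote

# Characters that the quoted RDF 1.1 note says were allowed in
# "RDF URI References" but must be percent-encoded in IRIs.
DISALLOWED = '<>{}|\\^`" '


def naive_escape(uri_ref: str) -> str:
    """Escape everything outside a small safe set, including '%'.

    Applying this to an already-escaped URI turns '%20' into '%2520'.
    """
    return quote(uri_ref, safe=":/")


def escape_disallowed(uri_ref: str) -> str:
    """Escape only the characters in DISALLOWED (hypothetical helper).

    '%' is deliberately left alone, so the function is idempotent.
    """
    return "".join(
        "%{:02X}".format(ord(ch)) if ch in DISALLOWED else ch
        for ch in uri_ref
    )


if __name__ == "__main__":
    rdf10_style = "http://example.org/a b"    # unescaped space
    rdf11_style = "http://example.org/a%20b"  # already escaped

    print(naive_escape(rdf11_style))       # double-escaped form
    print(escape_disallowed(rdf10_style))  # space becomes %20
    print(escape_disallowed(rdf11_style))  # unchanged
```

This only avoids double-escaping; it cannot tell whether a "%20" in older data was meant as an escape sequence or as three literal characters, so it does not by itself settle the equality question raised in the quoted message.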
Received on Thursday, 19 July 2012 13:53:04 UTC