Re: [JSON] URI vs IRI from Eric Prud'hommeaux on 2011-03-30 (public-rdf-wg@w3.org from March 2011)

From: Eric Prud'hommeaux <eric@w3.org>
Date: Wed, 30 Mar 2011 06:20:15 -0400
To: Dan Brickley <danbri@danbri.org>
Cc: RDF WG <public-rdf-wg@w3.org>
Message-ID: <20110330102014.GB10805@w3.org>

* Dan Brickley <danbri@danbri.org> [2011-03-30 11:59+0200]
> As I understand / dimly remember, the RDFCore round of specs tried
> their best to anticipate the IRI specs, but could only make normative
> reference to the URI spec.
> 
> http://www.w3.org/TR/rdf-concepts/
> 
> "Note: this section anticipates an RFC on Internationalized Resource
> Identifiers. Implementations may issue warnings concerning the use of
> RDF URI References that do not conform with [IRI draft] or its
> successors."
> 
> ...whereas http://www.w3.org/TR/2004/REC-rdf-syntax-grammar-20040210/
> does not mention IRIs.
> 
> Meanwhile http://www.w3.org/TeamSubmission/turtle/ "Turtle uses IRIs
> as term identifiers."
> 
> For JSON my assumption has been that we would use IRI. Can this be confirmed?
> 
> At the POI WG F2F we are looking at an example link to the page for
> Amsterdam in the Korean Wikipedia. I hope these come through the list
> OK.
> 
> 1. the pretty link appears in Korean script (to me at least).
> 
> {
>     "url": "http://ko.wikipedia.org/wiki/암스테르담"
> }
> 
> 2. if this is escaped so as to be a pre-IRI URI, we get instead an
> ugly string, twice as many chars:
> 
> {
>     "url": "http://ko.wikipedia.org/wiki/%EC%95%94%EC%8A%A4%ED%85%8C%EB%A5%B4%EB%8B%B4"
> }
> 
> I'm agnostic for now, on question of where-or-whether this stuff gets
> canonicalised. But I would like to express a preference that verbose
> URI escape sequences are not imposed on eg. Korean URLs like the one
> given here.

SPARQL's position is that RDF nodes look like
<http://ko.wikipedia.org/wiki/암스테르담>, and if you want to GET
them, you follow the %ification rules in RFC 3987. That was a fairly
bold and necessary step, IMO, with a downside if folks have RDF data
with nodes already in the %-y form. If so, your IRI assertions about
<wiki/암스테르담> and their URI assertions about <wiki/%EC%95%94%EC…>
will describe different resources.
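
To make the mismatch concrete, here is a rough Python sketch of the
%ification step. The helper name iri_to_uri is made up for
illustration, and it only handles the path component (no IDN host
names, no query or fragment), so treat it as an approximation of the
IRI-to-URI mapping rather than a full implementation.

```python
from urllib.parse import quote, urlsplit, urlunsplit

def iri_to_uri(iri: str) -> str:
    """Percent-encode non-ASCII path characters as UTF-8 octets.

    Hypothetical helper: only the path is converted; the host, query
    and fragment are passed through unchanged.
    """
    parts = urlsplit(iri)
    # Keep '%' safe so already-escaped sequences are not escaped twice.
    path = quote(parts.path, safe="/%:@!$&'()*+,;=~-._")
    return urlunsplit(
        (parts.scheme, parts.netloc, path, parts.query, parts.fragment)
    )

iri = "http://ko.wikipedia.org/wiki/암스테르담"
uri = iri_to_uri(iri)

# The two forms dereference to the same page, but compared as plain
# strings (which is how RDF node identity works) they are different.
same_node = (iri == uri)
```

Here uri comes out as the long %EC%95%94... form from Dan's second
example, and same_node is False, which is the downside described
above: data keyed on one form will not join with data keyed on the
other unless somebody normalises first.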

A still rarer problem occurs when we try to GET their URIs and
those URIs are for resources which actually include %XX in the (file)
name. This is a pretty contrived screw case, and couldn't happen with
modern web servers which %-decode the incoming URIs before resolving
them against the filesystem or rewrite rules, etc.
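
A small Python sketch of that screw case, assuming a server that
%-decodes the request path exactly once. The file name report%41.txt
is invented for the example.

```python
from urllib.parse import quote, unquote

# A (contrived) file whose name literally contains a %XX sequence.
literal_name = "report%41.txt"

# To address it, the '%' itself has to be escaped as %25.
request_path = "/" + quote(literal_name, safe="")

# What a decoding server would look up for each request path.
decoded_escaped = unquote(request_path)      # the literal file name
decoded_naive = unquote("/" + literal_name)  # %41 is decoded to 'A'
```

With a decoding server the naive request ends up at /reportA.txt, so
the only way to reach the literal file is the %25 form.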


> cheers,
> 
> Dan
> 

-- 
-ericP

Received on Wednesday, 30 March 2011 10:20:50 UTC