
Re: [TF-LIB] Encoding for IRIs

From: Mischa Tuffield <mischa.tuffield@garlik.com>
Date: Thu, 29 Jul 2010 20:31:54 +0100
Cc: SPARQL Working Group <public-rdf-dawg@w3.org>
Message-Id: <434BE20C-E7DF-42D8-A122-ED08A05512BB@garlik.com>
To: Andy Seaborne <andy.seaborne@epimorphics.com>
<snip/>

On 29 Jul 2010, at 20:22, Andy Seaborne wrote:

> fn:encode-for-uri encodes a string for use as a path segment in a URI:
> 
> "http://example/00/Weather/CA/Los%20Angeles#ocean"
> ==>
> "http%3A%2F%2Fexample%2F00%2FWeather%2FCA%2FLos%2520Angeles%23ocean"
> 
> which as F&O puts it: "This is probably not what the user intended because all of the delimiters have been encoded."
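
For illustration, the encoding quoted above can be reproduced with a minimal Python sketch. It assumes that urllib.parse.quote with an empty "safe" set is a reasonable stand-in for fn:encode-for-uri (both leave only letters, digits, "-", "_", "." and "~" unescaped); the function name encode_for_uri is hypothetical and not part of any SPARQL implementation.

```python
from urllib.parse import quote


def encode_for_uri(text: str) -> str:
    """Percent-encode everything except unreserved characters.

    Assumption: quote(..., safe="") approximates fn:encode-for-uri.
    Delimiters such as ':', '/', '#' and '%' are all escaped, which is
    why the result is only suitable as a single path segment.
    """
    return quote(text, safe="", encoding="utf-8")


if __name__ == "__main__":
    print(encode_for_uri("http://example/00/Weather/CA/Los%20Angeles#ocean"))
```

Applied to a whole IRI, every delimiter is escaped and the existing "%20" becomes "%2520", which matches the F&O remark that this is probably not what the user intended.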
> 
> Do we want to have a similar function that encodes for an IRI?
> 
> If so, we have introduced a new function - to date, the TF-LIB list is the functions from F&O that make sense.  We have no new functions except the term constructors IRI, BNODE, STRDT, STRLANG, which are fundamental.  Adding new functions that are practically motivated gets slightly tricky as to where to stop.
> 
> We could make IRI() do that but I don't think that is a good idea. Currently, it captures the concept of turning strings into IRIs. Mangling the characters may be incorrect as the app might expect to use the IRI it gets back in a later query to get a match, and it won't.
> 
> 	Andy
> 
> This was triggered by the thread from
> 
> http://lists.w3.org/Archives/Public/semantic-web/2010Jul/0425.html
> 
> discussing practicalities of working with some bad-IRI data.
> 
> The original question was how to ask a SPARQL query when the data contains:
> 
> <http://example.com/mylamefoafdocument`uri> a foaf:Document .
> <http://example.com/mylamefoafdocument`uri> foaf:primaryTopic <http://example.com/mylamefoafdocument`uri#me> .
> 
> which is legal Turtle in the sense that it passes the basic syntax grammar rules.  The Turtle grammar allows anything in <> except '>' (that is, #x3E). The Turtle test suite has positive tests for Turtle documents having bad IRIs e.g. \n.

Given that Turtle is not a Rec, it should be noted that, by my reading, the RDF/XML Rec also allows the following bad IRI to be used as a URI: http://example.com/mylamefoafdocument`uri

My reading (excuse me if I am wrong here) is that RDF/XML talks about the use of 'RDF URI references' in "Section 5.2 Identifiers" of the Rec [1], which in turn points to section 6.4 of the RDF Abstract Syntax [2], which seems to allow backticks to be used in URIs in RDF/XML.
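
To make that reading concrete, here is a rough Python sketch of the escaping step as I understand section 6.4: the string is UTF-8 encoded and any octet that is not a permitted US-ASCII URI character is %-escaped. The function name and the exact "safe" set are my assumptions, not text from the Rec, and existing '%' escapes are deliberately left alone.

```python
from urllib.parse import quote

# Assumption: RFC 3986 reserved delimiters plus '%' stay unescaped.
# quote() always keeps the unreserved set (letters, digits, "-_.~").
URI_SAFE = ":/?#[]@!$&'()*+,;=%"


def rdf_uri_ref_to_uri(ref: str) -> str:
    """Escape characters that are not allowed in a URI (hypothetical helper).

    A backtick is not a permitted URI character, so it becomes %60,
    while the delimiters and any existing escapes are kept as they are.
    """
    return quote(ref, safe=URI_SAFE, encoding="utf-8")


if __name__ == "__main__":
    print(rdf_uri_ref_to_uri("http://example.com/mylamefoafdocument`uri#me"))
```

Under these assumptions the backtick form escapes to a valid URI, which is why it seems acceptable as an RDF URI reference. As far as I can tell, RDF URI references are still compared character by character, so the escaped and unescaped forms would not match each other in a query.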

Cheers, 

Mischa


[1] http://www.w3.org/TR/REC-rdf-syntax/#section-Identifiers
[2] http://www.w3.org/TR/2004/REC-rdf-concepts-20040210/#dfn-URI-reference


> 
> The issue for TF-LIB does not address the original question from Mischa, but it does bring up this nearby issue.

___________________________________
Mischa Tuffield PhD
Email: mischa.tuffield@garlik.com
Homepage - http://mmt.me.uk/
Garlik Limited, 1-3 Halford Road, Richmond, TW10 6AW
+44(0)845 645 2824  http://www.garlik.com/
Registered in England and Wales 535 7233 VAT # 849 0517 11
Registered office: Thames House, Portsmouth Road, Esher, Surrey, KT10 9AD
Received on Thursday, 29 July 2010 19:32:32 GMT
