Re: RDF 2.0 Wishlist - Legal RDF which I can't SPARQL from Jeremy Carroll on 2010-08-06 (semantic-web@w3.org from August 2010)

From: Jeremy Carroll <jeremy@topquadrant.com>
Date: Fri, 06 Aug 2010 14:07:30 -0700
To: Mischa Tuffield <mischa.tuffield@garlik.com>
CC: Semantic Web <semantic-web@w3.org>
Message-ID: <4C5C7992.2090603@topquadrant.com>
  On 8/6/2010 2:45 AM, Mischa Tuffield wrote:
>> Late to this party, I have very little sympathy with Mischa's issue.
>>
>> First I would draw attention to the small print in RDF Concepts ...
>>
>> [[
>> *Note:* this section anticipates an RFC on Internationalized Resource 
>> Identifiers. Implementations may issue warnings concerning the use of 
>> RDF URI References that do not conform with [IRI draft 
>> <http://www.w3.org/TR/rdf-concepts/#ref-iri>] or its successors.
>> ]]
>> we knew there may be changes - like the space issue, and this small 
>> print was intended to (somewhat naughtily) include changes made 
>> elsewhere in the future in the 2004 document.
>
> I must have missed the subtle message by not reading in between the 
> lines there.
>

Yes - as I said naughty of us.
The IRI people hadn't finished their work, and we were not going to wait 
for them, but logically IRI is foundational and RDF is the next layer up.


>>
>> If I have understood Mischa correctly, the problem is that it is 
>> possible to enter illegal IRIs into a triple store in some fashion 
>> (e.g. turtle), and then stuff doesn't work. Surprise, surprise: 
>> garbage in, garbage out.
>
> I think blaming turtle is harsh, as far as my reading of the spec 
> goes, I can make use of URIs (for example, with a ` inside) in a 
> valid RDF/XML document (as well as in turtle),

There is a long tradition, which I do not like, of not validating URLs 
but doing the best one can. When, as in SemWeb, the IRIs are the key 
identifiers, validating them as much as possible is my comfort zone. I 
believe I am in a minority position here.
The RDF/XML spec is clearly in error in that it depends on this half-way 
house concept RDF URI Reference.

The turtle draft
http://www.w3.org/TeamSubmission/turtle/#relativeURI
in my view needs polishing in this area. It normatively refers to URI 
and IRI specs, but doesn't make use of them in the text, except for a 
reference to the base URI mechanism of the URI spec.
The grammar given for URIs (ucharacter*) is way too liberal, probably 
resulting in impossible-to-resolve issues with ill-formed relative URIs.
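
To make the mismatch concrete, here is a rough Python sketch comparing the two 
grammars. The SPARQL pattern follows the IRI_REF terminal of the SPARQL 1.0 
grammar. The Turtle pattern is only a simplified model of ucharacter* (it 
leaves out the backslash escapes), and the names TURTLE_URIREF, SPARQL_IRI_REF 
and accepted_by are invented for this illustration.

```python
import re

# Simplified model of the Turtle draft's  uriref ::= '<' ucharacter* '>'.
# Assumption: escape sequences (\u, \U, \\, \>) are left out, so this only
# covers the unescaped characters: anything from #x20 up except '>' and '\'.
TURTLE_URIREF = re.compile(r"<[^>\\\x00-\x1f]*>")

# SPARQL 1.0 terminal: IRI_REF ::= '<' ([^<>"{}|^`\]-[#x00-#x20])* '>'
SPARQL_IRI_REF = re.compile(r'<[^<>"{}|^`\\\x00-\x20]*>')


def accepted_by(pattern, token):
    """True if the whole token matches the given grammar pattern."""
    return pattern.fullmatch(token) is not None


samples = [
    "<http://example.org/ok>",
    "<http://example.org/with`backtick>",
    "<http://example.org/with space>",
]

for token in samples:
    print(token,
          "turtle:", accepted_by(TURTLE_URIREF, token),
          "sparql:", accepted_by(SPARQL_IRI_REF, token))
```

Under this model the first sample passes both patterns, while the backtick and 
space samples pass only the Turtle one, which is the situation described in 
this thread: data that loads from Turtle but cannot be written in a SPARQL 
query.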

> which I can then import into a triplestore. If I tried to import the 
> same triples into my triplestore using an "INSERT DATA" SPARQL Update 
> call, I would get an error back. I take it this is due to the fact that 
> SPARQL and RDF/XML (as this is the only RDF rec I am familiar with par 
> - as I am not that well versed in RDFa) have different notions of what 
> their URIs can be.
>> Solution: use a triple store that validates its input and rejects 
>> garbage; tackle the problem at source.
>
> The triplestore I use correctly validates legal RDF (as per turtle 
> and RDF/XML specs by allowing URIRefs), and also correctly validates 
> SPARQL as per spec (by only allowing IRIs).

My view is that a triple store should validate IRIs against the IRI spec.
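
As a minimal sketch of what validating at load time could look like, assuming 
every term in a triple is an absolute IRI (no literals or blank nodes): the 
check below rejects only the characters that RFC 3987 never allows unescaped 
and requires a scheme. It is not a full implementation of the RFC 3987 
grammar, and check_iri, load_triples and IllegalIRI are hypothetical names, 
not any particular triple store's API.

```python
import re

# Characters RFC 3987 never permits unescaped in an IRI:
# C0/C1 controls, space, DEL, and  < > " { } | \ ^ `
FORBIDDEN = re.compile(r'[\x00-\x20\x7f-\x9f<>"{}|\\^`]')

# scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." ), followed by ":"
SCHEME = re.compile(r"[A-Za-z][A-Za-z0-9+.-]*:")


class IllegalIRI(ValueError):
    """Raised when a term fails the partial IRI check (hypothetical)."""


def check_iri(iri):
    """Return iri unchanged, or raise IllegalIRI if it is clearly ill-formed."""
    bad = FORBIDDEN.search(iri)
    if bad:
        raise IllegalIRI(
            f"character {bad.group()!r} at offset {bad.start()} in {iri!r}")
    if not SCHEME.match(iri):
        raise IllegalIRI(f"no scheme, so not an absolute IRI: {iri!r}")
    return iri


def load_triples(triples):
    """Validate every term before it would reach the store.

    Assumption: each triple is a 3-tuple of IRI strings.
    """
    return [tuple(check_iri(term) for term in triple) for triple in triples]


good = [("http://example.org/s", "http://example.org/p", "http://example.org/o")]
load_triples(good)  # passes through unchanged

try:
    load_triples([("http://example.org/s", "http://example.org/p",
                   "http://example.org/o`")])
except IllegalIRI as err:
    print("rejected:", err)
```

Rejecting at the input boundary means the same rule applies whichever syntax 
the data arrived in, so nothing gets into the store that a later SPARQL query 
cannot name.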


>
>> If the turtle spec permits illegal IRIs then that is a bug with the spec.
>
> Well, this logic makes me think that there is a bug in the RDF/XML 
> spec too then?
>
>> If a turtle implementation allows illegal IRIs then that may be a 
>> feature, but one that needs to be used with care.
>
> And yes agreed, my whole point is that given the current use of IRIs 
> and URIRefs in the SPARQL and RDF/XML specs respectively, one needs to 
> be very careful when developing software.

Forward-looking advice: conform with the IRI spec.


>
>> Dogmatically
>>
>> Jeremy
>
> Mischa
>
>>
>>
>>
>
> ___________________________________
> Mischa Tuffield PhD
> Email: mischa.tuffield@garlik.com <mailto:mischa.tuffield@garlik.com>
> Homepage - http://mmt.me.uk/
> Garlik Limited, 1-3 Halford Road, Richmond, TW10 6AW
> +44(0)845 645 2824 http://www.garlik.com/
> Registered in England and Wales 535 7233 VAT # 849 0517 11
> Registered office: Thames House, Portsmouth Road, Esher, Surrey, KT10 9AD
>
Received on Friday, 6 August 2010 21:07:59 UTC