
Re: Other issues - RDFa Core 1.1 IRIs vs URIRefs

From: Mischa Tuffield <mischa.tuffield@garlik.com>
Date: Fri, 11 Mar 2011 00:05:54 +0000
Cc: W3C RDFa WG <public-rdfa-wg@w3.org>, Ivan Herman <ivan@w3.org>, nathan@webr3.org
Message-Id: <21190084-2266-4851-B021-4240429E9A04@garlik.com>
To: Shane McCarron <shane@aptest.com>
Shane,

Comments inline. 

On 10 Mar 2011, at 20:55, Shane McCarron wrote:

> Mischa,
> 
> At the risk of confusing myself...  a request for clarification.

My apologies, I will try my best this time. 

>   In RDFa we are concerned with both lexical space and value space of various things.  In particular, CURIEs require that the expansion of the lexical space 'foo:bar' into the value space 'http://whatever...bar' be a valid URI.  There are reasons for this that have to do with resource retrieval, follow-your-nose processing, etc. 

OK. My impression is that you can write either something like:

<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title>My home-page</title>
    <meta property="http://purl.org/dc/terms/creator" content="Mark Birbeck" />
    <link rel="http://xmlns.com/foaf/0.1/topic" href="http://www.example.com/#us" />
  </head>
  <body>...</body>
</html>
or 

<html
  xmlns="http://www.w3.org/1999/xhtml"
  prefix="foaf: http://xmlns.com/foaf/0.1/
          dcterms: http://purl.org/dc/terms/"
  >
  <head>
    <title>My home-page</title>
    <meta property="dcterms:creator" content="Mark Birbeck" />
    <link rel="foaf:topic" href="http://www.example.com/#us" />
  </head>
  <body>...</body>
</html>
That is, you can have either a CURIE, which expands to a valid IRI, or a full "URI reference"?
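
To make the expansion step concrete, here is a minimal sketch in Python. The helper name `expand_curie` and the `PREFIXES` table are assumptions for illustration (the table simply mirrors the prefix attribute in the second example); this is plain concatenation, not the full RDFa processing rules.

```python
# Hypothetical prefix table, mirroring the prefix attribute in the markup above.
PREFIXES = {
    "foaf": "http://xmlns.com/foaf/0.1/",
    "dcterms": "http://purl.org/dc/terms/",
}


def expand_curie(value, prefixes=PREFIXES):
    """Expand 'prefix:reference' by concatenation; return other values unchanged."""
    prefix, sep, reference = value.partition(":")
    if sep and prefix in prefixes:
        # Lexical space 'dcterms:creator' -> value space 'http://purl.org/dc/terms/creator'
        return prefixes[prefix] + reference
    # No known prefix: treat the value as an already-full URI/IRI.
    return value


print(expand_curie("dcterms:creator"))
print(expand_curie("http://xmlns.com/foaf/0.1/topic"))
```

Under these assumptions both markup examples yield the same predicate strings, which is the point of the question above.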

> Are you suggesting that the value space should be a valid IRI, that the value space should be a valid URI (potentially transformed from an IRI as defined in RFC 3987), or something else?  

In the RDFa Core 1.1 document [1], CURIEs are indeed described as: "When expanded ... a syntactically valid URI [RFC3987]", which is to say a valid IRI. I don't think there is any issue there, other than the use of the term "URI"; my personal preference would be for it to say "IRI".
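
To make the URI/IRI distinction concrete: RFC 3987 (section 3.1) describes a mapping from an IRI to a URI in which non-ASCII characters are percent-encoded as UTF-8 octets. Below is a minimal sketch of that one step, assuming a hypothetical helper `iri_to_uri`. It skips the normalisation and host-name handling that the RFC also covers, so treat it as an illustration rather than a conforming implementation.

```python
def iri_to_uri(iri):
    """Percent-encode every non-ASCII character of `iri` as its UTF-8 octets.

    Simplification (assumption): ASCII characters are passed through untouched,
    and no Unicode normalisation or IDNA conversion of the host is attempted.
    """
    out = []
    for ch in iri:
        if ord(ch) < 128:
            out.append(ch)
        else:
            out.extend("%{:02X}".format(octet) for octet in ch.encode("utf-8"))
    return "".join(out)


print(iri_to_uri("http://example.org/caf\u00e9"))
```

An ASCII-only IRI comes back unchanged, which is why the choice of term only starts to matter once non-ASCII characters appear.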

The issue here is that, when scanning through the document [1], triples are defined in section 3.2, and this is followed by section 3.3, "URI references", which is not defined in terms of RFC3987. When I read something about RDF and see the term "URI reference", my immediate thought is of the definition in the RDF Concepts doc [2], as used in RDF/XML [5].

Section 3.8 then adds to my confusion:

"In order to allow for the compact expression of RDF statements, RDFa allows the contraction of most URI references into a form called a 'compact URI expression', or CURIE. A detailed discussion of this mechanism is in the section CURIE and URI Processing.
Note that CURIEs are only used in the markup and Turtle examples, and will never appear in the generated triples, which are defined by RDF to use URI references."

In this section the first paragraph seems to be talking about URI references as defined in the RDFa Core 1.1 document, while the second paragraph states "defined by RDF to use URI references". Is the latter talking about the RDF Abstract Concepts document and its definition of URI references?

And finally in section 3.10 :

"The subject node is always either a URI reference or a blank node (or bnode), the predicate is always a URI reference, and the object of a statement can be a URI reference, a literal, or a bnode."

This is again confusing, but it could just be me. 

I do see that the expansion of CURIEs has been defined correctly, but it would be good to see all of the RDF docs currently coming out talk about the same thing: IRIs.

As with the SPARQL Query Language [6], the SPARQL WG has chosen to define SPARQL 1.1 [4] and its RDF Term Syntax [3] in terms of IRIs and not URI references. And *my personal opinion/hope* is that Turtle is going to go that way too, i.e. be defined in terms of IRIs.

"SPARQL is defined in terms of IRIs [RFC3987]. IRIs are a subset of RDF URI References that omits spaces."

"The set of RDF terms defined in RDF Concepts and Abstract Syntax includes RDF URI references while SPARQL terms include IRIs. RDF URI references containing "<", ">", '"' (double quote), space, "{", "}", "|", "\", "^", and "`" are not IRIs. The behavior of a SPARQL query against RDF statements composed of such RDF URI references is not defined."

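As a rough illustration of the quoted passage, the sketch below flags RDF URI references that contain any of the characters listed there. The names `NOT_ALLOWED_IN_IRI`, `offending_characters` and `is_safe_for_sparql` are hypothetical, and the check covers only the characters named in the quote, not the full RFC 3987 grammar.

```python
# The ten characters named in the quoted SPARQL text:
# < > " space { } | \ ^ `
NOT_ALLOWED_IN_IRI = set('<>" {}|\\^`')


def offending_characters(uri_ref):
    """Return the listed characters that occur in `uri_ref`, sorted."""
    return sorted(set(uri_ref) & NOT_ALLOWED_IN_IRI)


def is_safe_for_sparql(uri_ref):
    """True if `uri_ref` contains none of the listed characters."""
    return not offending_characters(uri_ref)


print(offending_characters("http://example.org/my file"))
print(is_safe_for_sparql("http://example.org/#us"))
```

A triplestore or serialiser could run a check of this kind on import, so that nothing is stored which cannot later be queried or exported.
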
Mischa

[1] http://www.w3.org/2010/02/rdfa/sources/rdfa-core/Overview-src.html 
[2] http://www.w3.org/TR/rdf-concepts/#section-Graph-URIref
[3] http://www.w3.org/TR/sparql11-query/#syntaxTerms 
[4] http://www.w3.org/TR/sparql11-query/#sparqlBasicTerms 
[5] http://www.w3.org/TR/REC-rdf-syntax/ 
[6] http://www.w3.org/TR/rdf-sparql-query/ 

> 
> Just so we know where to start the discussion...
> 
> On 3/10/2011 2:38 PM, Mischa Tuffield wrote:
>> 
>> Hello RDFa'ers, 
>> 
>> I was asked by Manu to summarise the email I sent to Ivan and him below, for your consideration. Note that this is a "Last Call comment". 
>> 
>> Quoting Manu : 
>> 
>>> Mischa, could you please summarize and send this feedback to the RDFa
>>> mailing list? You can specify that it is a Last Call comment if you
>>> think that it's imperative that we get this right before going into our
>>> 2nd Last Call.
>>> 
>>> RDFa WG <public-rdfa-wg@w3.org>
>>> 
>>> It's important that the RDFa community and Working Group is aware of
>>> your input and has the information it needs to make a reasonable
>>> decision on the usage of IRI vs. URI vs. URI Reference.
>> 
>> So, I am very new to working group stuff, albeit I have been playing with RDF since '04 when I was a postgrad at Southampton Uni, so please excuse if I get formalities wrong. I should also add that out of all of the RDF serialisations I am least familiar with RDFa. 
>> 
>> I went through the RDFa Core 1.1 doc [1] and noticed that there are a number of different definitions for what a URI is in the context of RDFa (see the previous email in this thread with Ivan, forwarded to this list). The document uses the term "URI reference", which in RDF Abstract Concepts terms is defined as [2], but also points to RFC 3987 [3] and RFC 3986 [4] in the same RDFa Core 1.1 document, which is confusing! The question is: which one is the correct definition for a URI in an RDFa document?
>> 
>> From my POV it seems that the RDFa document should be using the IRI definition as per the current SPARQL work; it also seems that the RDF WG is going to update the Turtle spec to talk about IRIs too. Below is my motivation for saying this (cut and pasted from an email to public-rdfwg mailing list at w3). 
>> 
>> Ideally, SPARQL and the various RDF serialisations should all use the same definition for what a URI is. As far as I am aware, URI Refs were defined in an attempt to guess what the IRI definition was going to look like, and should probably be replaced by the newer IRI definition. 
>> 
>>>>> Personally I don't think that the burden of normalising URIs should be on applications. What is key here from my POV is the ability to roundtrip RDF; I will explain what I mean by this. I would like to be certain that if I generate new triples in my triplestore using a SPARQL Update query, I can then generate valid RDF including those triples using the CONSTRUCT verb. Otherwise things just get too confusing.
>>>>> 
>>>>> Given that SPARQL is currently in last call, it would be good to be able to unify what URI definitions are used in both the standard serialisations and in the query language. As a developer I would like to use only one library for generating URIs in my application, regardless of whether I am writing SPARQL or RDF. 
>> 
>>>> 
>> 
>> What I would like to avoid is a situation whereby data can be imported into a triplestore via a SPARQL Update query but cannot subsequently be exported in, let's say, RDFa in this case. 
>> 
>> I hope this makes sense/helps, 
>> 
>> Regards, 
>> 
>> Mischa
>> 
>> [1] http://www.w3.org/2010/02/rdfa/sources/rdfa-core/Overview-src.html 
>> [2] http://www.w3.org/TR/2004/REC-rdf-concepts-20040210/#dfn-URI-reference 
>> [3] http://www.ietf.org/rfc/rfc3987.txt 
>> [4] http://www.ietf.org/rfc/rfc3986.txt
>> 
>> 
>> On 10 Mar 2011, at 18:25, Nathan wrote:
>> 
>>> Ivan Herman wrote:
>>>> Mischa is on the RDF Working Group where we had a discussion on the URI vs IRI issue. He reviewed the Core spec; here is his review vis-à-vis this stuff.
>>>> Opinions?
>>> 
>>> "URI reference" is the thrower really, because (afaict) we don't mean URI reference ( '../foo' ), we mean an "IRI compatible URI", or just "URI" or just "IRI".
>>> 
>>> This time last week we also had URLs in the mix, it would be very good to reference either "URI" exclusively (not "URI reference") or "IRI" exclusively.
>>> 
>>> Which one do we use? if IRI, we should say IRI everywhere.
>>> 
>>> Best,
>>> 
>>> Nathan
>>> 
>>>>> Hi Ivan/Manu, 
>>>>> Sorry for top-posting. The relevant bit of the below thread is when Ivan said to me : 
>>>>>> Actually... there is a revision coming on RDFa. What you should look at, if you can, is 
>>>>>> http://www.w3.org/2010/02/rdfa/sources/rdfa-core/Overview-src.html
>>>>>> 
>>>>>> which is the editor's draft of what will soon be a 2nd last call document for RDFa 1.1. It would be great if you could look at it with a fresh eye with this issue in your mind...
>>>>> 
>>>>> I have had a look at the RDFa 1.1 [1] as asked and have made an observation wrt to how URIs are defined in the document. If you feel like I should be sending this to the public-rdfa-wg mailing list do let me know, and/or do feel free to forward accordingly. 
>>>>> In short, it seems like RDFa Core 1.1 [1] uses IRIs as defined in RFC3987 [2], URIs as per RFC3986 [3], and mentions "URI references" (which in the RDF world are defined as an extension to RFC2396 [4] in the abstract syntax document [5]), which is slightly confusing and maybe even a bug (from my POV anyway).
>>>>> 
>>>>> 
>>>>> **Section 2 and Section 7.4 of the document describe URIs in terms of RFC3986. 
>>>>> **Section 3.3 - URI references - states: 
>>>>> "RDF solves this problem by replacing our vague terms with URI references."
>>>>> 
>>>>> Note that "URI references" is not defined in this section. 
>>>>> and subsequently in Section 3.10 - A description of RDFa - states: 
>>>>> "The subject node is always either a URI reference or a blank node (or bnode), the predicate is always a URI reference, and the object of a statement can be a URI reference, a literal, or a bnode." which points back to Section 3.3 (as far as I can tell). 
>>>>> ** Section 3.8 - Compact URI Expression - states : 
>>>>> "RDFa allows the contraction of most URI references into a form called a 'compact URI expression'" <-- I am not sure which URI reference is meant here. 
>>>>> ** Section 6 - CURIE Syntax
>>>>> 
>>>>> Defines URIs as per RFC3987 (which are IRIs) and states : 
>>>>> "When expanded, the resulting URI must be a syntactically valid URI [RFC3987]. " 
>>>>> **And finally, it seems that in section 7.4, CURIE and URI Processing, there are pointers to the IRI spec, RFC3987, which states how relative URIs are resolved with respect to the document's base URI. 
>>>>> From my POV this is confusing, and given that SPARQL is using IRIs (RFC3987), that Turtle will probably be defined using IRIs, and that *hopefully* so will RDF/XML via an update to the RDF Abstract Syntax document, I do feel strongly that RDFa should use the newer IRI definition in all places in the RDFa spec. (Again, please do let me know if you think I am wrong here.) 
>>>>> Warmest Regards, 
>>>>> Mischa 
>>>>> [1] http://www.w3.org/2010/02/rdfa/sources/rdfa-core/Overview-src.html
>>>>> [2] http://www.ietf.org/rfc/rfc3987.txt
>>>>> [3] http://www.ietf.org/rfc/rfc3986.txt
>>>>> [4] http://www.ietf.org/rfc/rfc2396.txt
>>>>> [5] http://www.w3.org/TR/2004/REC-rdf-concepts-20040210/#dfn-URI-reference
>>>>> 
>>>>> 
>>>>> On 10 Mar 2011, at 09:42, Ivan Herman wrote:
>>>>> 
>>>>>> Mischa,
>>>>>> 
>>>>>> On Mar 9, 2011, at 19:54 , Mischa Tuffield wrote:
>>>>>> <snip/>
>>>>>>>> 
>>>>>>>>> 2) And whether or not the RDFa spec [1] is in or out of scope of this working group, as it is not listed in the charter as one of the documents which the group will be looking to update [2]? The reason I mention this is, again, that if we end up in a world where both SPARQL and RDF (let's say the Turtle serialisation) are using IRIs, developers would have to use a different URI encoding library for SPARQL & Turtle from the one they would be using if they were serialising to RDFa. 
>>>>>>>> RDFa is certainly not in the scope of this group, there is a separate group for that one. That being said, afaik RDFa already uses IRIs, just like SPARQL. I explicitly copy this mail to Manu, who is the chair of that group.
>>>>>>> Thanks, and yes I am aware that Manu is the chair of that group. I need to read the entirety of the RDFa rec [1], but it seems like the only place that IRIs are mentioned is in the CURIE section [2], and the rest of the document, including [3], talks about URI References and not IRIs. 
>>>>>> Actually... there is a revision coming on RDFa. What you should look at, if you can, is 
>>>>>> http://www.w3.org/2010/02/rdfa/sources/rdfa-core/Overview-src.html
>>>>>> 
>>>>>> which is the editor's draft of what will soon be a 2nd last call document for RDFa 1.1. It would be great if you could look at it with a fresh eye with this issue in your mind...
>>>>>> 
>>>>>> Thanks
>>>>>> 
>>>>>> Ivan
>>>>>> 
>>>>>>> But, ok, I now understand that RDFa is not in the scope of this group, thanks for the clarification. 
>>>>>>> [1] http://www.w3.org/TR/rdfa-syntax/
>>>>>>> [2] http://www.w3.org/TR/rdfa-syntax/#s_curies
>>>>>>> [3] http://www.w3.org/TR/rdfa-syntax/#sec_3.10. 
>>>>>>>> Note, however, that RDFa is a bit special in the sense that it "lives" in another environment, namely HTML, which it cannot fully control...
>>>>>>> Understood. 
>>>>>>> Regards, 
>>>>>>> Mischa
>>>>>>> 
>>>>>>>> Cheers, 
>>>>>>>> Ivan
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>>> Regards, 
>>>>>>>>> Mischa *goes off to look into the back-compatibility of URIRefs to IRIs (any pointers to existing work comparing the definitions would be much appreciated)
>>>>>>>>> 
>>>>>>>>> [1] http://www.w3.org/TR/rdfa-syntax/
>>>>>>>>> [2] http://www.w3.org/2011/01/rdf-wg-charter#deliverables
>>>>>>>>> [3] http://www.w3.org/TR/rdfa-syntax/#T_URI_reference 
>>>>>>>>> ___________________________________
>>>>>>>>> Mischa Tuffield PhD
>>>>>>>>> Email: mischa.tuffield@garlik.com
>>>>>>>>> Homepage - http://mmt.me.uk/
>>>>>>>>> Garlik Limited, 1-3 Halford Road, Richmond, TW10 6AW
>>>>>>>>> +44(0)845 652 2824  http://www.garlik.com/
>>>>>>>>> Registered in England and Wales 535 7233 VAT # 849 0517 11
>>>>>>>>> Registered office: Thames House, Portsmouth Road, Esher, Surrey, KT10 9AD
>>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>>> ----
>>>>>> Ivan Herman, W3C Semantic Web Activity Lead
>>>>>> Home: http://www.w3.org/People/Ivan/
>>>>>> mobile: +31-641044153
>>>>>> PGP Key: http://www.ivan-herman.net/pgpkey.html
>>>>>> FOAF: http://www.ivan-herman.net/foaf.rdf
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>> ___________________________________
>>>>> Mischa Tuffield PhD
>>>>> Email: mischa.tuffield@garlik.com
>>>>> Homepage - http://mmt.me.uk/
>>>>> 
>>> 
>> 
>> 
> 
> -- 
> Shane P. McCarron                          Phone: +1 763 786-8160 x120
> Managing Director                            Fax: +1 763 786-8180
> ApTest Minnesota                            Inet: shane@aptest.com
> 

___________________________________
Mischa Tuffield PhD
Email: mischa.tuffield@garlik.com
Homepage - http://mmt.me.uk/
Garlik Limited, 1-3 Halford Road, Richmond, TW10 6AW
+44(0)845 652 2824  http://www.garlik.com/
Registered in England and Wales 535 7233 VAT # 849 0517 11
Registered office: Thames House, Portsmouth Road, Esher, Surrey, KT10 9AD



Received on Friday, 11 March 2011 00:06:41 GMT
