- From: Alex Hall <alexhall@revelytix.com>
- Date: Mon, 7 Mar 2011 15:35:38 -0500
- To: RDF Working Group WG <public-rdf-wg@w3.org>
- Message-ID: <AANLkTimdDFMWMB7g3vC2+D5Kp1JbkHpKFVrLsqGg9FRK@mail.gmail.com>
On Mon, Mar 7, 2011 at 11:12 AM, Mischa Tuffield <mischa.tuffield@garlik.com> wrote:

> Hello,
>
> <snip/>
>
> On 5 Mar 2011, at 15:26, Pat Hayes wrote:
>
> > On Mar 5, 2011, at 5:19 AM, RDF Working Group Issue Tracker wrote:
> >
> > > RDF-ISSUE-8 (IRI vs URI): Incorporate IRI-s into the RDF documents [Cleanup tasks]
> > >
> > > http://www.w3.org/2011/rdf-wg/track/issues/8
> > >
> > > Raised by: Ivan Herman
> > > On product: Cleanup tasks
> > >
> > > The IRI Spec[1] is from 2005, and it may be necessary to retrofit it to RDF. Eg, what is the relationship between "http://résumé.example.org" and "http://xn--rsum-bpad.example.org"? Are they the same resource or not? Note that SPARQL has something on that[2]...

Context matters here. "http://xn--rsum-bpad.example.org" is the URI mapped from the IRI "http://résumé.example.org" but it is also a valid IRI in its own right (I think -- correct me if I'm wrong). If you're dereferencing the resource to fetch its representation then I think you can safely conclude that those represent the same resource, but that decision is up to your application. However, from the perspective of RDF semantics I think it would be wrong to put the burden on the implementer to consider normalization when computing term equality, graph equivalence, etc.

This is already an issue to some extent; see the note in RDF Concepts [1] that says:

"Because of the risk of confusion between RDF URI references that would be equivalent if derefenced, the use of %-escaped characters in RDF URI references is strongly discouraged. See also the URI equivalence issue <http://www.w3.org/2001/tag/issues.html#URIEquivalence-15> of the Technical Architecture Group."

Nowhere in either the RDF or SPARQL specs do I see anything that implies applications should normalize URIRefs when comparing them; they all seem to specify a simple string comparison of the URIRefs.
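As a minimal sketch of the distinction above (Python, standard library only): term equality is a plain string comparison, while mapping the host to its ASCII form is something an application might choose to do before dereferencing. The helper name `host_to_ascii` is hypothetical and not taken from any spec; it uses Python's built-in `idna` codec and, for brevity, ignores any userinfo in the authority.

```python
from urllib.parse import urlsplit, urlunsplit


def host_to_ascii(iri: str) -> str:
    """Hypothetical helper: rewrite only the host of an IRI into its
    ASCII (Punycode) form, leaving every other component untouched."""
    parts = urlsplit(iri)
    # Python's built-in "idna" codec converts each label to its xn-- form.
    ascii_host = parts.hostname.encode("idna").decode("ascii")
    # Re-attach the port if there was one (userinfo is ignored in this sketch).
    netloc = ascii_host if parts.port is None else f"{ascii_host}:{parts.port}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


iri = "http://r\u00e9sum\u00e9.example.org"   # http://résumé.example.org
mapped = "http://xn--rsum-bpad.example.org"

# Term equality as a simple string comparison: two different terms.
same_term = iri == mapped

# Application-level decision: compare after mapping the host to ASCII.
same_after_mapping = host_to_ascii(iri) == mapped

print(same_term, same_after_mapping)
```

Under these assumptions the two strings are distinct as terms, and only coincide once the application opts in to the host mapping.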
Likewise, I think that "http://xn--rsum-bpad.example.org" and "http://résumé.example.org" when taken as IRIs should be considered different terms/nodes/resources/whatever you want to call them.

> > SPARQL says "IRI (corresponds to the Concepts and Abstract Syntax term "RDF URI reference")"
>
> As far as I am aware, URI Ref definition came out before the RFC defining IRI. They are "pretty similar" insofar as the URIRef work was second guessing what IRIs would be, but they didn't managed to get it 100% correct.
>
> > Is this strictly correct? That is, are IRIs in fact just URI references by another name? If not (as I suspect) can anyone briefly outline the points of difference?
>
> No, they are not the same thing, the differences lie in terms of what characters get encoded and which don't. One example is the backtick character `, which doesn't need to be % encoded when creating an IRI but it does need to be when generating a URI Ref. I sent an email to the SWIG mailing list about this a while back [1], whereby people pointed out the history, and some of the subtle differences between the two.

In addition to the encoding differences, note that the RFC defining IRIs (RFC 3987) is based on a more recent URI definition (RFC 3986). However, RDF Concepts calls out an old definition of URI (RFC 2396) when defining URIRefs. Among other differences, this old definition does not allow percent-encoded characters in the host component, while IRIs and new-style URIs do allow internationalized domain names. So there seems to be a whole class of IRIs that, strictly speaking, are not representable as RDF URIRefs under the current definition.

(My apologies if this has been re-hashed elsewhere, I'm somewhat new to this discussion.)
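To make the host-component difference concrete, here is a rough sketch (Python) that transcribes the RFC 2396 `hostname` rule and the RFC 3986 `reg-name` rule into regular expressions. The function names are hypothetical, and IPv4/IPv6 literals are deliberately left out, so this checks only the named-host productions and is not a full URI validator.

```python
import re

# RFC 2396: hostname = *( domainlabel "." ) toplabel [ "." ]
_DOMAINLABEL = r"[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?"
_TOPLABEL = r"[A-Za-z](?:[A-Za-z0-9-]*[A-Za-z0-9])?"
RFC2396_HOSTNAME = re.compile(rf"(?:{_DOMAINLABEL}\.)*{_TOPLABEL}\.?")

# RFC 3986: reg-name = *( unreserved / pct-encoded / sub-delims )
RFC3986_REG_NAME = re.compile(r"(?:[A-Za-z0-9\-._~!$&'()*+,;=]|%[0-9A-Fa-f]{2})*")


def is_rfc2396_hostname(host: str) -> bool:
    """Hypothetical check against the old (RFC 2396) hostname grammar."""
    return RFC2396_HOSTNAME.fullmatch(host) is not None


def is_rfc3986_reg_name(host: str) -> bool:
    """Hypothetical check against the newer (RFC 3986) reg-name grammar."""
    return RFC3986_REG_NAME.fullmatch(host) is not None


# "résumé.example.org" with the non-ASCII characters percent-encoded as UTF-8.
pct_host = "r%C3%A9sum%C3%A9.example.org"
ascii_host = "xn--rsum-bpad.example.org"

for host in (pct_host, ascii_host):
    print(host, is_rfc2396_hostname(host), is_rfc3986_reg_name(host))
```

The percent-encoded host is accepted only by the newer grammar, while the Punycode form passes both, which is one way to see the class of names that the RFC 2396-based URIRef definition cannot express directly.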
-Alex

[1] http://www.w3.org/TR/rdf-concepts/#section-Graph-URIref

> Mischa
>
> [1] http://lists.w3.org/Archives/Public/semantic-web/2010Jul/0426.html
>
> > Pat
> >
> > [1] http://www.ietf.org/rfc/rfc3987.txt
> > [2] http://www.w3.org/TR/rdf-sparql-query/#docTerminology
> >
> > ------------------------------------------------------------
> > IHMC (850)434 8903 or (650)494 3973
> > 40 South Alcaniz St. (850)202 4416 office
> > Pensacola (850)202 4440 fax
> > FL 32502 (850)291 0667 mobile
> > phayesAT-SIGNihmc.us http://www.ihmc.us/users/phayes
>
> ___________________________________
> Mischa Tuffield PhD
> Email: mischa.tuffield@garlik.com
> Homepage - http://mmt.me.uk/
> Garlik Limited, 1-3 Halford Road, Richmond, TW10 6AW
> +44(0)845 652 2824 http://www.garlik.com/
> Registered in England and Wales 535 7233 VAT # 849 0517 11
> Registered office: Thames House, Portsmouth Road, Esher, Surrey, KT10 9AD
Received on Monday, 7 March 2011 20:36:12 UTC