IRI vs. URI Reference in RDFa

BCC-cross-posted to: SWCG, RDFa WG

There is an RDF/RDF Web Apps/RDFa coordination issue that the RDF Web
Apps group needs to have resolved in order to take RDFa Core 1.1 and
XHTML+RDFa 1.1 into Candidate Recommendation. We are requesting input
from the RDF WG and coordination help from the SW CG.

The basic question is what an RDFa processor should do when it comes
across a value in an HTML document that looks like this:

<a rel="foaf:homepage"
   href="http://www.schweizer-küche.de/">Schweizer Küche</a>

The issue is being tracked here (raised by Mischa):

http://www.w3.org/2010/02/rdfa/track/issues/87

We had a very long conversation about it on the telecon last week:

http://www.w3.org/2010/02/rdfa/meetings/2011-05-12#ISSUE__2d_87__3a__IRI_vs__2e__URI_References

So the question is whether or not the markup above should generate this:

<> foaf:homepage <http://www.schweizer-küche.de/> .

or should generate this:

<> foaf:homepage <http://www.xn--schweizer-kche-qsb.de/> .
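For anyone who wants to reproduce the second form, here is a minimal
sketch of how the punycoded host can be derived from the href. It uses
Python's built-in "idna" codec purely for illustration; the helper name
punycode_host is made up for this example, and no RDFa processor is
claimed to work this way.

```python
from urllib.parse import urlsplit, urlunsplit


def punycode_host(iri: str) -> str:
    """Return the IRI with its host converted to ASCII (punycode) form.

    Illustrative only: port and userinfo are ignored, and Python's
    built-in "idna" codec (IDNA 2003) does the conversion.
    """
    parts = urlsplit(iri)
    ascii_host = parts.hostname.encode("idna").decode("ascii")
    return urlunsplit(parts._replace(netloc=ascii_host))


original = "http://www.schweizer-küche.de/"
converted = punycode_host(original)

print(original)   # the value as the author wrote it
print(converted)  # the value after punycoding the host
```

The two strings differ character by character, which is why the choice
matters to anything that matches on the IRI as written.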

There were good arguments both ways, but I believe that the RDFa WG
settled on the RDFa processor not modifying the URL value when
generating the triples, for the following reasons:

1) Punycoding URLs could change the meaning of the triple such that
   matching rules written by the author would no longer match.
2) Punycoding URLs is culturally imperialistic: most of the world's
   primary languages cannot be expressed in ASCII, and we shouldn't
   force punycoding on all languages "other than English".
3) Modifying URLs away from the author's intent, other than through
   well-known transforms like relative-to-absolute IRI resolution or
   IRI normalization, is bad. We shouldn't attempt to guess what the
   author meant.
4) IDN is a hack and should be dragged into the street and shot
   (ok, so this is just my opinion :P)

So the general assertion is that RDFa Processors should only perform the
following transformation on IRIs:

1. Relative to Absolute IRI transformation.

That is, they shouldn't punycode and they shouldn't attempt to do any
other processing on the IRI output by the processor. In other words,
RDFa Processors shouldn't second-guess the document author. Thoughts?
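As a rough sketch of the "relative to absolute only" behaviour, assuming
Python's urllib.parse.urljoin as the resolver: the base IRI
http://example.org/people/manu and the relative value "küche/" are
hypothetical, and the function name resolve is made up for this example.

```python
from urllib.parse import urljoin


def resolve(base: str, value: str) -> str:
    """Resolve an attribute value against the document base.

    This is the only transformation applied: no punycoding, no
    percent-encoding, no case folding of the host or path.
    """
    return urljoin(base, value)


base = "http://example.org/people/manu"  # hypothetical document IRI

# An absolute IRI comes out exactly as the author wrote it.
absolute = resolve(base, "http://www.schweizer-küche.de/")

# A relative IRI is made absolute; its non-ASCII characters are kept.
relative = resolve(base, "küche/")

print(absolute)
print(relative)
```

The absolute value passes through untouched, and the relative one only
gains the base, so author-written matching rules still see the same
characters.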

-- manu

PS: This also, tangentially, re-opens the can of worms on equivalence
testing for IRIs in RDF. Is http://example.com/ros&#xE9; the same as
HTTP://example.com/ros%C3%A9 for equivalence testing in RDF?
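To make the PS concrete, here is a small sketch comparing the two IRIs
under a plain string comparison and under a loose normalization. The
first IRI is written with the character reference already resolved to
"é", as an HTML parser would hand it over. The loosely_normalize helper
is made up for this example and is deliberately crude (it percent-decodes
the whole path); it is not a claim about what RDF requires.

```python
from urllib.parse import unquote, urlsplit, urlunsplit


def loosely_normalize(iri: str) -> str:
    """Lowercase scheme and host, and percent-decode the path.

    Crude on purpose: a real normalizer would not decode reserved
    characters. urlsplit already lowercases the scheme.
    """
    parts = urlsplit(iri)
    return urlunsplit(
        parts._replace(netloc=parts.netloc.lower(), path=unquote(parts.path))
    )


a = "http://example.com/ros\u00e9"   # character reference resolved to é
b = "HTTP://example.com/ros%C3%A9"   # uppercase scheme, percent-encoded é

simple_equal = a == b
normalized_equal = loosely_normalize(a) == loosely_normalize(b)

print(simple_equal)      # character-by-character comparison
print(normalized_equal)  # comparison after loose normalization
```

The two answers disagree, which is the can of worms: the result depends
entirely on which comparison the RDF specs are taken to mean.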

-- 
Manu Sporny (skype: msporny, twitter: manusporny)
President/CEO - Digital Bazaar, Inc.
blog: PaySwarm Developer Tools and Demo Released
http://digitalbazaar.com/2011/05/05/payswarm-sandbox/

Received on Thursday, 19 May 2011 05:33:59 UTC