- From: Hausenblas, Michael <michael.hausenblas@joanneum.at>
- Date: Wed, 30 Jul 2008 13:04:57 +0200
- To: "Mark Birbeck" <mark.birbeck@webbackplane.com>, "Manu Sporny" <msporny@digitalbazaar.com>
- Cc: "RDFa mailing list" <public-rdf-in-xhtml-tf@w3.org>
I do by and large agree with Mark (i.e. his excellent description,
below) BUT at the same time I'd like to point out the 'Opacity Axiom'
[1], [2]. Please note as well that in RDF we talk about URIrefs [3] and
*relative URIs are not used in an RDF graph* - FWIW, I'm happy to take
an action to evaluate how other RDF serialisations (e.g. RDF/XML, or
upcoming ones such as Turtle [4]) are dealing with this situation.

Cheers,
Michael

[1] http://www.w3.org/DesignIssues/Axioms.html#opaque
[2] http://www.w3.org/TR/webarch/#uri-opacity
[3] http://www.w3.org/TR/rdf-concepts/#section-Graph-URIref
[4] http://www.w3.org/TeamSubmission/turtle/

----------------------------------------------------------
Michael Hausenblas, MSc.
Institute of Information Systems & Information Management
JOANNEUM RESEARCH Forschungsgesellschaft mbH
http://www.joanneum.at/iis/
----------------------------------------------------------

>-----Original Message-----
>From: public-rdf-in-xhtml-tf-request@w3.org
>[mailto:public-rdf-in-xhtml-tf-request@w3.org] On Behalf Of Mark Birbeck
>Sent: Tuesday, July 29, 2008 11:13 PM
>To: Manu Sporny
>Cc: RDFa mailing list
>Subject: Re: RDFa test suite addition
>
>
>Hi Manu,
>
>> Right, I didn't mean to imply that 'appending' will work in all cases
>> (even though I'm not convinced that the statement is not true).
>
>You began this thread by saying that the bug in librdfa was that in
>some circumstances the relative part was incorrectly being appended to
>the document, rather than the host name. So haven't you proved it
>yourself, that appending doesn't work in all cases? :)
>
>What's happened is that you've now found two possible ways to append
>(to the document part or to the hostname), but I'm afraid that the
>algorithm for converting a relative URI to an absolute one involves
>yet further possibilities.
>
>For example, the relative path:
>
>  sneaking_sally.mp3
>
>should be appended to the end of the *path* part, replacing the
>document. And so on.
>
>So the point is to use the 'proper' algorithm for turning a relative
>path into an absolute one, and you will always be ok, no matter what
>the URI is that you are dealing with (relative or not).
>
>The big question then, is whether the spec actually says to do this.
>
>
>> What you have said has got me wondering about what is correct,
>> acceptable and incorrect, however.
>
>You also had me wondering, too. I recalled investigating this quite a
>long time ago, and was starting to panic that I hadn't actually
>incorporated what I learned from my analysis into the spec.
>
>But thankfully I did:
>
>  5.4. CURIE and URI Processing
>
>  Since RDFa is ultimately a means for transporting RDF, then a key
>  concept is the resource and its manifestation as a URI. Since RDF
>  deals with complete URIs (not relative paths), then when converting
>  RDFa to triples, any relative URIs will need to be resolved relative
>  to the base URI, using the algorithm defined in section 5 of RFC
>  3986 [URI], Reference Resolution.
>
>It certainly sounds like this point could do with being made more
>prominent, but hopefully you'll agree that such changes would merely
>be editorial, and that the spec itself is correct.
>
>(See below for a further mention in the spec of this issue, but in the
>context of CURIEs.)
>
>
>>> <http://rdfa.digitalbazaar.com/fuzzbot/demo/../../live/sneaking_sally.mp3>
>>
>> I realize that the URL above is not optimal, but is it "wrong"?
>> RFC-1738 says that the URL is valid (if I'm reading the RFC
>> correctly):
>>
>> ftp://ftp.isi.edu/in-notes/rfc1738.txt
>
>First, note that [1] updates RFC 1738.
>
>Second, you're right that the URI is not 'wrong'. But the only way to
>obtain such a URI would be to enter it exactly as you have shown it.
>I.e., you can't create such a URI by beginning with a relative path
>and making it absolute, since the only way to do that is according
>to section 5 of [1], and that algorithm clearly shows how the dot
>segments would be removed.
>
>But also, if you query your triple store for everything the store
>knows about this:
>
>  <http://rdfa.digitalbazaar.com/live/sneaking_sally.mp3>
>
>will you also get back information about:
>
>  <http://rdfa.digitalbazaar.com/fuzzbot/demo/../../live/sneaking_sally.mp3>
>
>If you do, then that's great...but I'd also be really surprised; I
>would imagine that once the URI is in the store, it's treated pretty
>much like a string.
>
>
>> Is it the RDFa parser's job to normalize URLs? I can certainly see
>> the argument for why it should tidy up URLs, but I don't think this
>> is a MUST.
>
>I think it should, for two reasons, one concerning RDFa in general,
>and the other relating to its particular manifestation as XHTML+RDFa.
>
>The first reason is that RDF deals with absolute URIs. So any relative
>paths have to be made absolute somehow, when creating triples. RFC
>3986 [1] has a simple algorithm for doing this, which also has the
>effect of removing dot segments.
>
>So if we were not to use that algorithm to make relative paths
>absolute, which algorithm would we use? As you've discovered, simple
>concatenation doesn't work, since you keep finding another relative
>path that messes you up.
>
>The second reason is that XHTML+RDFa is a layer on top of XHTML. So
>what we're doing is giving a semantic *interpretation* of the
>underlying XHTML. To make this useful, we should really be generating
>the same triples for the same semantics. And if I say that the
>resource:
>
>  <http://rdfa.digitalbazaar.com/live/sneaking_sally.mp3>
>
>is 5 minutes long, then the manner I use to express that at the XHTML
>level shouldn't affect the semantics that are generated.
>
>(As an aside, when parsing in HTML browsers, if you request the value
>of @href using getAttribute(), some browsers will give you the full,
>absolutised path, relative to the 'base' of the document and others
>will give you the original value put in there by the author, which
>could contain dot segments. So in those parsers you have to normalise,
>otherwise you won't achieve browser consistency.)
>
>
>> If it's not a MUST, then we find ourselves in a position where the
>> application/inference engine MUST normalize the URLs coming in from
>> the RDFa parser.
>
>It's not really 'normalising', it's using the proper algorithm to turn
>a relative path into an absolute one. That algorithm takes care of
>'.', '..', and all sorts of other things.
>
>Anyway, we have it in the spec, but you are right that we should
>perhaps consider making the wording both clearer and stronger.
>
>
>> Take this CURIE as an example:
>>
>> <span xmlns:ex="http://example.org/2008-10-24/docs/api/"
>>       about="[ex:../ref/a.html]">...</span>
>>
>> a bit contrived, but would you say that the parser should output
>> this URI:
>>
>> http://example.org/2008-10-24/docs/api/../ref/a.html
>>
>> or this one:
>>
>> http://example.org/2008-10-24/docs/ref/a.html
>
>The latter.
>
>Section 5.4.2, "Converting a CURIE to a URI" describes the
>following algorithm:
>
>  Since a CURIE is merely a means for abbreviating a URI, its value is
>  a URI, rather than the abbreviated form. Obtaining a URI from a CURIE
>  involves the following steps:
>
>  1. Split the CURIE at the colon to obtain the prefix and the
>     resource.
>  2. Using the prefix and the current in-scope mappings, obtain the
>     URI that the prefix maps to.
>  3. Concatenate the mapped URI with the resource value, to obtain an
>     absolute URI.
>
>After that description you'll see that there is a blue box that refers
>back to the earlier point about what it means to create absolute URIs
>from relative ones:
>
>  Note that it is generally considered a good idea not to use relative
>  paths in namespace declarations, but since it is possible that an
>  author may ignore this guidance, it is further possible that the URI
>  obtained from a CURIE is relative. However, since all URIs must be
>  resolved relative to [base] before being used to create triples, the
>  use of relative paths should not have any effect on processing.
>
>Now this doesn't quite deal with the example you gave; I was more
>dealing with this:
>
>  <span xmlns:ex="/2008-10-24/docs/api/"
>        about="[ex:../ref/a.html]">...</span>
>
>which when concatenated still only gives a relative path:
>
>  /2008-10-24/docs/api/../ref/a.html
>
>The point that I was trying to stress when I wrote this was that this
>would still be ok, provided that you always use the algorithm in [1],
>and that algorithm would also take care of your example.
>
>However, I agree again that it wouldn't hurt to make this point more
>forcefully, but again, I think this is just about stress in the prose,
>rather than a fundamental issue.
>
>
>> If our argument is that CURIEs are simple concatenations, at what
>> point in the process is the "strange URL" converted into the
>> "normalized URL"?
>
>I do my normalisation in the parser, before passing the
>results to the store.
>
>
>> If we do think it should be the parser that normalizes URLs, we don't
>> have such a statement in the RDFa Syntax document, do we?
>
>I think we do, as described above, re the note in 5.4.2
>
>Regards,
>
>Mark
>
>[1] <http://gbiv.com/protocols/uri/rfc/rfc3986.html>
>
>--
>Mark Birbeck, webBackplane
>
>mark.birbeck@webBackplane.com
>
>http://webBackplane.com/mark-birbeck
>
>webBackplane is a trading name of Backplane Ltd.
>(company number 05972288, registered office: 2nd Floor, 69/85
>Tabernacle Street, London, EC2A 4RR)
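
For anyone who wants to try the cases discussed in the quoted message,
here is a rough Python sketch of "concatenate the CURIE, then resolve
against the base as in RFC 3986 section 5". It is only an illustration
under stated assumptions, not the librdfa code or wording from the RDFa
spec: the BASE document URL and the helper names (remove_dot_segments,
resolve, curie_to_uri) are made up for this example. urljoin() from the
standard library does the relative-reference resolution; the dot-segment
step is applied again by hand because, as far as I can tell, urljoin()
returns a reference that already has its own host unchanged.

```python
from urllib.parse import urljoin, urlsplit, urlunsplit

# Hypothetical document URL, assumed only for this illustration.
BASE = "http://rdfa.digitalbazaar.com/fuzzbot/demo/index.html"


def remove_dot_segments(path):
    """Remove '.' and '..' segments, following RFC 3986 section 5.2.4."""
    remaining, output = path, []
    while remaining:
        if remaining.startswith("../"):
            remaining = remaining[3:]
        elif remaining.startswith("./"):
            remaining = remaining[2:]
        elif remaining.startswith("/./"):
            remaining = remaining[2:]
        elif remaining == "/.":
            remaining = "/"
        elif remaining.startswith("/../"):
            remaining = remaining[3:]
            if output:
                output.pop()
        elif remaining == "/..":
            remaining = "/"
            if output:
                output.pop()
        elif remaining in (".", ".."):
            remaining = ""
        else:
            # Move the first segment (with its leading "/") to the output.
            cut = remaining.find("/", 1)
            if cut == -1:
                cut = len(remaining)
            output.append(remaining[:cut])
            remaining = remaining[cut:]
    return "".join(output)


def resolve(base, reference):
    """Make `reference` absolute against `base`, then drop dot segments."""
    parts = urlsplit(urljoin(base, reference))
    return urlunsplit(parts._replace(path=remove_dot_segments(parts.path)))


def curie_to_uri(curie, mappings, base):
    """Split at the first colon, map the prefix, concatenate, then resolve."""
    prefix, _, resource = curie.strip("[]").partition(":")
    return resolve(base, mappings[prefix] + resource)


print(resolve(BASE, "sneaking_sally.mp3"))
print(resolve(BASE, "../../live/sneaking_sally.mp3"))
# -> http://rdfa.digitalbazaar.com/live/sneaking_sally.mp3
print(curie_to_uri("[ex:../ref/a.html]",
                   {"ex": "http://example.org/2008-10-24/docs/api/"}, BASE))
# -> http://example.org/2008-10-24/docs/ref/a.html
```

With these assumptions, both the absolute and the relative namespace
declaration from the thread end up without dot segments, which is the
"latter" URI in Mark's answer.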
Received on Wednesday, 30 July 2008 11:09:48 UTC