- From: Mark Birbeck <mark.birbeck@webbackplane.com>
- Date: Wed, 1 Apr 2009 10:15:37 +0100
- To: RDFa <public-rdf-in-xhtml-tf@w3.org>, "public-rdfa@w3.org" <public-rdfa@w3.org>
Hello all,

I have an (old) action item to explain why we have the rule for setting @about on head/body. I'll explain it, and then also flag up some problems with it.

The background is this; say you navigate to the following URL in your browser:

  <http://a.b/c/d.e#f>

You don't want the RDFa parser to use that full URL for generating triples, because it means you'll get a different set of triples depending on how you navigate to that page:

  <http://a.b/c/d.e#g>
  <http://a.b/c/d.e#h>
  <http://a.b/c/d.e#i>

etc.

So instead, we want to say that any fragment identifiers are removed when using the URL as a subject. However, creating such a rule is a little protocol-specific -- the document could in principle come from anywhere -- so instead I added rules that effectively coerce the first subject to being an absolute URL, created from the relative URL of "".

The reason this works is this. Say you have an algorithm for creating an absolute URI, which takes a base path and the path to convert:

  makeAbsolute(base, uri)

If you now feed this function the base URI from our example, and the relative path of "", then the following *must* be true, according to RFC 3986:

  makeAbsolute("http://a.b/c/d.e#g", "") == "http://a.b/c/d.e"

In other words, saying that there is an implied @about="" becomes a protocol independent way of tidying up the URL.

(As it happens, a parser is more likely to do this:

  baseURI = makeAbsolute("http://a.b/c/d.e#g", "")
  subjectURI = makeAbsolute(baseURI, "")

so the 'tidying up' was done on the base URL, but the effect is the same.)

Now, although the principle seems sound, the rules defined in order to achieve it might be causing some problems.

The first is one that I think was flagged up by Ivan a while ago, but I'll list it here to jog your memories; if you put a subject onto the root (most likely the HTML element), then your subject gets overridden when parsing hits the <head> and <body>:

  <html about="http://somewhereelse.com/">
    ...
  </html>

Perhaps we could just live with that, but advise people that if they want to do this they should really be using <base>. But either way, it's still a quirk.

The second issue is the use of @typeof on <body> or <head>. I have a vague recollection that this also came up in Ivan's example, though I might be wrong; either way, it came up for me today when I was asked to check someone's RDFa documents. They have this in their document:

  <body typeof="foaf:Document">
    ...
  </body>

My parser gave this triple:

  <> a <http://xmlns.com/foaf/0.1/Document> .

and since I was expecting the subject to be a bnode, I assumed there was a bug in my parser. However, looking at the spec I see that we do indeed place the 'implied @about' at a higher level than the bnode:

  4. If the [current element] contains no @rel or @rev attribute, then the next step is to establish a value for [new subject]. Any of the attributes that can carry a resource can set [new subject]; [new subject] is set to the URI obtained from the first match from the following rules:

     @about...@src...@resource...@href, etc.;

     If no URI is provided by a resource attribute, then the first match from the following rules will apply:

     * if the element is the head or body element then act as if there is an empty @about present, and process it according to the rule for @about, above;
     * if @typeof is present, obtained according to the section on CURIE and URI Processing, then [new subject] is set to be a newly created [bnode].
     * otherwise, if [parent object] is present, [new subject] is set to the value of [parent object]. Additionally, if @property is not present then the [skip element] flag is set to 'true';

It's these last three rules that we're focusing on.

I would argue that since the intention of setting @about="" on head/body was simply to 'tidy up' the initial subject so that it didn't have any fragment identifiers, then the rule that achieves this should only be applied if the subject wasn't set in any other way.
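To make the effect of the rule order concrete, here is a minimal Python sketch of the three fallback rules quoted above. It is only an illustration, not a conforming parser: the names make_absolute, new_subject and the implied_about_last flag are hypothetical, CURIE handling and the [skip element] flag are left out, and bnode labels are simply generated from a counter. With the flag off the rules run in the order the spec gives; with it on, the head/body rule is applied only when nothing else has set the subject.

```python
from itertools import count
from urllib.parse import urldefrag, urljoin

_bnode_ids = count(1)  # hypothetical source of bnode labels


def make_absolute(base, ref):
    """Resolve ref against base; an empty ref gives base minus its fragment."""
    if ref == "":
        # The RFC 3986 result for an empty reference, handled explicitly
        # rather than relying on urljoin() for this edge case.
        return urldefrag(base).url
    return urljoin(base, ref)


def new_subject(element, attrs, base, parent_object, implied_about_last=False):
    """Pick [new subject] for an element that has no @rel or @rev."""
    # A resource attribute always wins, whichever order the fallbacks use.
    for name in ("about", "src", "resource", "href"):
        if name in attrs:
            return make_absolute(base, attrs[name])

    def implied_about():
        # head/body: act as if an empty @about were present.
        return make_absolute(base, "") if element in ("head", "body") else None

    def typeof_bnode():
        return f"_:b{next(_bnode_ids)}" if "typeof" in attrs else None

    def inherit():
        return parent_object

    rules = [implied_about, typeof_bnode, inherit]      # order in the spec
    if implied_about_last:
        rules = [typeof_bnode, inherit, implied_about]  # last-resort order
    for rule in rules:
        subject = rule()
        if subject is not None:
            return subject
    return None


base = "http://a.b/c/d.e#g"
typed_body = {"typeof": "foaf:Document"}
print(new_subject("body", typed_body, base, None))
print(new_subject("body", typed_body, base, None, implied_about_last=True))
print(new_subject("body", {}, base, "http://somewhereelse.com/"))
print(new_subject("body", {}, base, "http://somewhereelse.com/",
                  implied_about_last=True))
```

The first and third calls show the two quirks (the document URL wins over both the bnode and the subject inherited from <html>); the second and fourth show what happens when the implied @about is tried last.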
Applying the rule only as a last resort would be easily achieved if we moved it to the end of the group of three quoted above. That would solve both problems mentioned at the top, because:

* if @typeof is used on <head> or <body> then a bnode is created, making it consistent with processing in other situations;
* if there is a parent subject (i.e., on <html>) then that is used, and no 'implied @about' is needed.

You could argue that this then removes an easy way to indicate the type of the *document*, but I think the answer to that is that the reordered rules would simply force you to be explicit; if you want to set the type of the document rather than generating a bnode, then you would simply do this:

  <body about="" typeof="foaf:Document">
    ...
  </body>

I realise changing the spec or issuing an erratum is not something that can be taken lightly, so this email is primarily about completing my action item to explain what the implied @about was all about. Then I suppose the next step would be to see whether we just live with the quirks that we have, or whether we want to tidy them up. And if so, we should probably try to find out if anyone is actually producing documents with @typeof on <body> or <head>, and if they are, what effect they are trying to achieve.

Regards,

Mark

--
Mark Birbeck, webBackplane

mark.birbeck@webBackplane.com

http://webBackplane.com/mark-birbeck

webBackplane is a trading name of Backplane Ltd. (company number 05972288, registered office: 2nd Floor, 69/85 Tabernacle Street, London, EC2A 4RR)
Received on Wednesday, 1 April 2009 09:16:22 UTC