Re: RDFa RFE: No Mandated DOCTYPE from Sean B. Palmer on 2007-11-22 (public-rdf-in-xhtml-tf@w3.org from November 2007)

From: Sean B. Palmer <sean@miscoranda.com>
Date: Thu, 22 Nov 2007 11:53:39 +0000
To: "Mark Birbeck" <mark.birbeck@formsplayer.com>
Cc: public-rdf-in-xhtml-tf@w3.org
Message-ID: <b6bb4d890711220353k30dcdeb3gffed2007985de081@mail.gmail.com>

On Nov 22, 2007 10:19 AM, Mark Birbeck wrote:

> On your point about the profile attribute, it is intended that it
> will work in exactly the way you describe, and it should have
> appeared in the spec. One point to clarify though, is that setting
> @profile to include an RDFa identifier is not mandatory.

That is reasonable, but I think that extra orthogonal provisions will
be required as well: using @profile being a SHOULD, and user-agents
being absolved from having to parse RDFa documents that don't specify
it.

The problem that I have as an implementor is that if you look at, for
example, the GRDDL user-agents conformance section, it's pretty heavy:

http://www.w3.org/TR/grddl/#sec_agt
- 7. GRDDL-Aware Agents

It says that user-agents *should* apply all four of the processing
steps defined earlier in the specification. This is a burden that I
don't want to be repeated in RDFa.

To give you a concrete example, in an API I'm working on I have a
Graph class whose constructor takes a URI and, if passed no other
arguments, dereferences the URI and tries to get triples from it in
any way possible. This means that first you have to do encoding
detection (not easy in itself), and then you can normally dispatch
on media type...

* application/rdf+xml -> parse as RDF/XML
* application/x-turtle -> parse as Turtle
* text/rdf+n3 -> parse as Notation3

But when you get one of the generic xml types, e.g. application/xml,
or worse still text/html, the rules change. With application/xml if
the root namespace is the RDF namespace, then you parse it as RDF/XML.
But an application/xml document may also be a GRDDL document; and in
fact Bijan Parsia this morning just commented that GRDDLing RDF/XML
documents may be a way to do datatype coercion.

The burden is where you have a URI, and you know it gives triples, but
you can't be sure that the author isn't going to use eRDF one day,
change their mind and use some GRDDL hDialect the next, and then RDFa
the next. So you have to try *all* of the available mechanisms to be
safe.
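
As a rough illustration of the dispatch problem described above, here
is a minimal Python sketch. The parser labels ("rdfxml", "turtle",
"n3", "grddl", "erdf", "rdfa") and the function names are hypothetical
and only stand in for real parsers; the point is that the specific
media types yield one candidate, while the generic ones yield several.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Media types that name their syntax unambiguously (from the list above).
# The values are hypothetical labels, not real parser objects.
DIRECT = {
    "application/rdf+xml": "rdfxml",
    "application/x-turtle": "turtle",
    "text/rdf+n3": "n3",
}

def root_namespace(body):
    """Return the namespace URI of the root element, or None."""
    try:
        root = ET.fromstring(body)
    except ET.ParseError:
        return None
    if root.tag.startswith("{"):
        return root.tag[1:].split("}", 1)[0]
    return None

def candidate_parsers(media_type, body):
    """List the mechanisms an agent would have to try for this response."""
    media_type = media_type.split(";", 1)[0].strip().lower()
    if media_type in DIRECT:
        return [DIRECT[media_type]]
    if media_type == "application/xml":
        candidates = []
        if root_namespace(body) == RDF_NS:
            candidates.append("rdfxml")
        # A generic XML document may also be a GRDDL source document.
        candidates.append("grddl")
        return candidates
    if media_type == "text/html":
        # Without a reliable hint, everything has to be attempted.
        return ["erdf", "grddl", "rdfa"]
    return []
```

For text/html the candidate list never shrinks, which is exactly the
burden on implementors.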

Now, if there were some well defined heuristics for telling which
possible transformations might apply, that would greatly reduce the
burden on the suite of parsers required to handle all this.

> We feel that having an identifier available, but not making it
> compulsory, gives the best of both worlds

Yup, I understand why it's not always possible to use @profile.
Recently I proposed [1] to semantic-web and the GRDDL WG that you can
in fact shoehorn the GRDDL mechanism quite well into existing HTML
idioms. So for example, instead of doing this:

<head profile="http://www.w3.org/2003/g/data-view">
<link rel="transformation"
   href="http://danja.talis.com/glink/groklinks.xsl" />
<link rel="transformation"
   href="http://www.w3.org/2006/vcard/hcard2rdf.xsl" />

You could do this:

<head>
<link rel="stylesheet" type="application/rdf+xml" href="style.rdf" />

Where style.rdf is something like the following:

<Stylesheet xmlns="http://example.org/style#"
   xmlns:gs="http://example.org/grddl-style#"
   xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
<styles rdf:parseType="Resource">
 <gs:transform rdf:resource="http://danja.talis.com/glink/groklinks.xsl"/>
 <gs:transform rdf:resource="http://www.w3.org/2006/vcard/hcard2rdf.xsl"/>
</styles>
</Stylesheet>
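
A consumer could pull the transformation URIs out of such a file with
very little code. The following Python sketch is only an assumption
about how that might look: it reads style.rdf structurally as XML
rather than as an RDF graph, and the function name is hypothetical.
The gs: namespace is the example.org placeholder used above.

```python
import xml.etree.ElementTree as ET

GS_NS = "http://example.org/grddl-style#"  # placeholder namespace from the example
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

def transform_uris(style_rdf):
    """Return the rdf:resource of every gs:transform element, in document order."""
    root = ET.fromstring(style_rdf)
    return [
        element.get("{%s}resource" % RDF_NS)
        for element in root.iter("{%s}transform" % GS_NS)
    ]
```

A real implementation would more likely parse the file as RDF/XML and
query the resulting graph.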

But interestingly I haven't had a single response about this so far!
Either I didn't express myself very well, or people aren't interested
in discussing ways of reducing the burden on GRDDL document producers.

At any rate, I am interested in discussing such things and often think
about it, so don't get me wrong--I do understand why @profile
shouldn't necessarily be mandatory, but I would like the core
user-agent conformance class to exclude documents that don't use
@profile.

To go back to my API, how it would manifest there is that when you do...

Graph('http://example.org/')

It loads in a "reasonable" number of algorithms for trying to get RDF
blood out of the HTTP representation stone, which would correspond, I
hope, with what the RDFa default user-agent conformance section says
(rather than leaving it wide open and hazy, like GRDDL; something that
probably can't be fixed now). And if you really want to go the whole
hog and throw everything at it...

Graph('http://example.org/', deep=True)

Or something like that.
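
To sketch how the default and deep modes might differ, here is a
minimal Python example. Everything in it is an assumption for
illustration: the function names, the mechanism labels, and above all
RDFA_PROFILE, which is a placeholder because no RDFa profile URI has
been fixed in this thread. Only the GRDDL profile URI is real.

```python
from html.parser import HTMLParser

# Placeholder only: this thread does not settle an actual RDFa profile URI.
RDFA_PROFILE = "http://example.org/rdfa-profile"
# The GRDDL profile URI used in the example earlier in this message.
GRDDL_PROFILE = "http://www.w3.org/2003/g/data-view"

class _HeadProfile(HTMLParser):
    """Collect the space-separated URIs in the first head/@profile."""

    def __init__(self):
        super().__init__()
        self.seen_head = False
        self.profiles = []

    def handle_starttag(self, tag, attrs):
        if tag == "head" and not self.seen_head:
            self.seen_head = True
            self.profiles = (dict(attrs).get("profile") or "").split()

def head_profiles(html):
    parser = _HeadProfile()
    parser.feed(html)
    return parser.profiles

def mechanisms(html, deep=False):
    """Pick which extraction mechanisms to run on an HTML representation."""
    if deep:
        # Throw everything at it, whatever the document declares.
        return ["erdf", "grddl", "rdfa"]
    profiles = head_profiles(html)
    chosen = []
    if GRDDL_PROFILE in profiles:
        chosen.append("grddl")
    if RDFA_PROFILE in profiles:
        chosen.append("rdfa")
    return chosen
```

In default mode a document with no @profile costs the agent nothing
beyond reading the head element.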

> Does that meet your requirements?

In summary it's necessary but not sufficient. What I think I need, on
which of course I invite and welcome your insight, is, as I stated at
the top of this email, for RDFa to SHOULD a specific h:head/@profile
value, and for the default RDFa user-agent conformance class to say
that user-agents MUST parse documents with the @profile, but only MAY
process other ones, according to some possibly future-defined metric
for what Semantic Web agents should do in general.

As a consequence, the following section:

"A conforming RDFa Processor MUST make available to a consuming
application a single RDF [graph] containing all possible triples
generated by using the rules in the Processing Model section."
- http://www.w3.org/TR/rdfa-syntax/#uaconf

is not good, I believe, for implementors.

> (Note that applying this logic would rule out the requirement for
> a specific DOCTYPE, which is why we need to double-check why
> the clause you refer to is in there.)

Thanks. The weird thing, of course, is that you're saying that
@profile being mandatory would be an unacceptable burden on producers
of RDFa documents (which, as I say, I somewhat agree with); but then
you're mandating the DOCTYPE! Clearly if the former holds, then the
DOCTYPE *must* be at least optional too.

Thanks for the quick and thoughtful response,

[1] http://lists.w3.org/Archives/Public/semantic-web/2007Nov/0104

-- 
Sean B. Palmer, http://inamidst.com/sbp/
Received on Thursday, 22 November 2007 11:53:50 UTC