Re: [RDFa] ISSUE-28: following your nose to the RDFa specification from Ben Adida on 2007-07-10 (public-swd-wg@w3.org from July 2007)

From: Ben Adida <ben@adida.net>
Date: Tue, 10 Jul 2007 16:07:28 -0700
To: Keith Alexander <k.j.w.alexander@gmail.com>
CC: mark.birbeck@x-port.net, Dan Connolly <connolly@w3.org>, RDFa <public-rdf-in-xhtml-tf@w3.org>, SWD WG <public-swd-wg@w3.org>
Message-ID: <46941130.2060205@adida.net>

Hi Keith,

You've dropped from the thread the most important point I made in my
last email, regarding the potential for a profile to *negate* the RDFa.
From where I stand, I suspect this even invalidates the way GRDDL works.

Consider a world where XHTML is validated using XML Schema. GRDDL says
that I can have a schema-specified transformation. Now, whatever
@profile says in the instance document, it doesn't matter: the
schema-specified transformation is a valid transformation to run. Your
instance-based @profile cannot *negate* the schema-specified transformation.
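
To make that concrete, here is a minimal sketch of the argument as a set
union. The function and transformation names are hypothetical and are not
part of GRDDL or any real library; the only point is that the instance can
add transformations but has no way to remove a schema-specified one.

```python
# Hypothetical model of the argument; names are illustrative, not GRDDL's API.
def applicable_transformations(schema_transforms, profile_transforms):
    """Return every transformation a consumer may run on a document.

    Profile-declared transformations can only add to the schema-specified
    ones; nothing in the instance document removes a schema-specified entry.
    """
    return set(schema_transforms) | set(profile_transforms)


# Assumed example names, chosen only for illustration.
schema = {"rdfa-extract"}  # specified by the language itself
with_profile = applicable_transformations(schema, {"hcard-extract"})
without_profile = applicable_transformations(schema, set())
```

Whatever the second argument is, "rdfa-extract" stays in the result.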

RDFa is effectively a schema-specified transformation. You can choose to
perform additional transforms if you want (and I'd rather they be hGRDDL
to keep the locality principle alive), but you cannot *undo* the RDFa
transformation specified in the schema.

Now, if HTML is validated using DTDs... what difference should that
make? The instance document shouldn't be able to negate what the
specification says about the syntax and its implied semantics.

So that's the key thing: if you consider that @profile can negate
schema-expressed triples, I think you're contradicting the way GRDDL
works, regardless of RDFa. I don't think @profile can negate the meaning
of @about.

> I'd quibble here that the document isn't expressing this *triple*. It's
> expressing information which you might want to choose to express in any
> number of formats.

Ah, this is an argument DanC and I have all the time. RDF is an abstract
representation of data. If the author intends to say that the title of
the document is "Foo", then I can certainly express that intent as an
RDF triple, even if the author has no idea what RDF stands for. An
author doesn't have to be RDF-aware. Otherwise, the entire
microformat-GRDDL approach is dead in the water, as the majority of
microformat folks have no love (or likely knowledge) of triples.

> I don't think it is what you want to do with RDFa because you only want
> to express some of the semantics of the html spec. RDFa doesn't express
> the whole document as triples. You pick and choose. Your parser, for
> instance, produces triples from @rel=next, but not from
> @rel=stylesheet.

On the general principle: we pick and choose things that are semantic.
On @rel=stylesheet, that's actually a mistake in my parser, since that's
a reserved word and thus does denote some semantics.

> I don't see that a @profile instead of a DOCTYPE worsens the situation
> of copy-and-paste and self-containment. Neither can be placed within the
> <body> of the document. In fact, if anything, the DOCTYPE is far more
> fundamentally rooted to the top of the document. And as Dan said, far
> less accessible for scripting and XSLT.

Again, this is where you've conveniently stashed away your previous
argument, where you said you wanted to use RDFa attributes for other
purposes. *That* is what @profile-centricity would do: it would make
these attributes potentially mean something *different*, which is truly
awful.

There are plenty of reasons to take stuff that was never meant to be
semantic and provide many ways to transform that into RDF, which is what
GRDDL intends. But when you specifically introduce attributes for
semantics, leaving their meaning up in the air is a disaster.

> I don't think you've really addressed my earlier point. <link rel="next"
> href="/next.html" /> is the DOM.

No, it's *not* just the DOM. The HTML specification says that this has a
special meaning, and that browsers can take that meaning and do
something with it if they choose, like offer a Table of Contents UI.

> Neither GRDDL nor RDFa is, or should
> be, about expressing the entire DOM as triples. There's lots of
> information within the document structure that RDFa chooses not to
> express as triples - this doesn't negate their meaning or presence in
> the DOM.

You're confusing things. rel="next" is more than the DOM. foo="bar",
that's DOM, rel="next", that's DOM + HTML-specified semantics. In RDFa,
we're trying to expand on the HTML-specified semantics in a consistent,
backwards-compatible way, because we think that's very valuable. We're
not just adding attributes that can then be interpreted willy-nilly by
any profile the author chooses.
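
A rough sketch of that distinction, using only the Python standard library.
The set of reserved values below is a subset of the HTML 4 link types, and
the predicate namespace is an assumption made for illustration; this is not
the code of any actual RDFa parser. hrefs are kept as written rather than
resolved against the base.

```python
from html.parser import HTMLParser

# Assumption: a subset of the HTML 4 link types, with the XHTML vocabulary
# namespace as the predicate base. Both are illustrative choices.
RESERVED_REL = {"next", "prev", "contents", "index", "stylesheet"}
VOCAB = "http://www.w3.org/1999/xhtml/vocab#"


class RelTripleExtractor(HTMLParser):
    """Collect (subject, predicate, object) tuples from <link rel=... href=...>."""

    def __init__(self, base):
        super().__init__()
        self.base = base
        self.triples = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag != "link" or "href" not in attrs:
            return
        # Only HTML-specified rel values carry semantics; an arbitrary
        # attribute such as foo="bar" is just DOM and yields nothing.
        for value in (attrs.get("rel") or "").lower().split():
            if value in RESERVED_REL:
                self.triples.append((self.base, VOCAB + value, attrs["href"]))


def extract(html, base=""):
    parser = RelTripleExtractor(base)
    parser.feed(html)
    parser.close()
    return parser.triples
```

Feeding it one link with rel="next" and another carrying only foo="bar"
yields a single triple, for the first link.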

> On the contrary as far as I can see.
> If you make RDFa consistent with GRDDL
> 
> * every GRDDL agent understands RDFa
> * all fully-compliant RDFa documents will be understood by all GRDDL
> agents (as with DOCTYPE, it is likely that some documents will contain
> RDFa, but lack the @profile - individual software and people can make
> their own decisions on what to do in these cases, but cannot hold the
> publisher to the triples they generate as a result).
> * Authors can check and test their documents by running them through any
> compliant GRDDL agent
> * Consumers can be sure what triples were intended by the author of any
> document with a @profile
> * You have a clear, consistent mechanism in place for adding RDFa to
> other types of XML document
> * It is easy to check for RDFa with XPath and javascript, both in HTML,
> and in arbitrary XML.
> * Properly compliant cut'n'pasting requires an RDFa profile in source
> and target documents.
> 
> If you make RDFa inconsistent with GRDDL:
> 
> * some RDFa-compliant documents will be GRDDL-compliant, some won't.
> * some fully-compliant RDFa documents won't be understood by GRDDL agents
> * HTML authors using GRDDL have to run their documents through two
> parsers and combine the results to check which triples they are publishing.
> * There is no clear idea of what authors in other XML varieties will
> have to do yet.
> * Consumers of HTML have to check both the DOCTYPE and the @profile to
> see how to get all the intended triples.
> * Some tool developers will find it hard/impossible to check for the
> DOCTYPE, so will be forced to either assume all documents are RDFa, or
> none are, regardless of the author's intentions.
> * Properly compliant cut'n'pasting requires an RDFa profile in source
> and target documents.
> 
>> I definitely don't want to encourage folks to use the RDFa
>> attributes for non-RDFa purposes, since that hurts the
>> self-containment/copy-and-paste goal significantly.
>>
> It doesn't. You only have real fully-compliant, follow-your-nose cut and
> paste if the cutting and pasting is done from one RDFa document to
> another. This applies whatever official signifier you use to denote RDFa
> triples rules.
> 
> 
> Firstly, the most likely scenario for reusing RDF attributes is authors
> needing to extend RDFa, rather than pervert it. How often people need to
> extend RDFa depends on how good the spec is I suppose; but wouldn't you
> rather they were able to do so and remain standards-compliant? I think
> you said yourself, Ben (though it could have been someone else), with
> regard to HTML5, that people will do what they need/want to do anyway, so
> it is better for the spec to plan for extensibility than force people
> into defiance of it.
> 
> Secondly, RDFa also infers triples from existing elements and
> attributes. Your js RDFa parser, Ben, generates some odd triples from my
> eRDF documents.
> 
>>> If my page does have the RDFa profile on the other hand, your tool can
>>> extract triples according to the rules of RDFa, do clever things with
>>> the html context of the triples, etc.
>>
>> Right, so again you're asking for a PROFILE centric approach to this.
>> For reasons mentioned above, I don't think that's good. It takes us down
>> the same broken path of microformats: no consistent syntax, ever.
> 
> That's not true at all. The most important thing is to be able to get
> triples. RDF doesn't need to have only one syntax for this. RDFa is an
> official syntax for providing both triples, and the DOM context of those
> triples. That may be a good option for authors to have, but it doesn't
> need to be forced on authors whether they like it or not - that would
> be a bad thing.
> 
> Yours,
> 
> Keith
> 
>
Received on Tuesday, 10 July 2007 23:07:34 UTC