Re: Reviewing Last Call RDFa

At 08:54 AM 6/18/2008 -0400, Tim Berners-Lee wrote:
>>>** Deployment path and architecture:
>>>In general, is the deployment path for this spec that (a) it  
>>>introduce  new attributes into the HTML namespace, and that  
>>>conforming RDF-aware  HTML clients be expected in future to  
>>>understand RDFa, or is it that  the GRDDL transform link from (b)  
>>>the document or from (c) the HTML  schema?
>>
>>We are adding attributes to the XHTML1 namespace, and we expect RDFa-aware clients to understand these as RDF.
>
>Ah. You chose path (a). In that case you should not use (b) as it is a  
>burden on the writer.  It also gives the reader the mistaken  
>impression that RDFa can be understood just by implementing GRDDL.

(b) gives the reader the clear instruction that GRDDL _can_ be used
to implement RDFa.  As you know, we are also (proposing to) add
the GRDDL transform link from the XHTML namespace document.
That makes it unnecessary to add @profile to the document instance;
however, we are not confident that all GRDDL processors currently
follow the namespace document instructions, so we wish to make
it clear to document authors that the GRDDL @profile is also proper.

I personally believe that it would be better to change SHOULD to
MAY for both the @version attribute and the @profile attribute in
4.1 Document Conformance.  We left those as SHOULD because
of the long thread with certain participants in the TAG who seemed
concerned that the intent of document authors could still be
interpreted vaguely.

>>We will also be adding a GRDDL link in the XHTML namespace. We've  
>>left the "SHOULD" on the @profile, because we've been told that a  
>>number of GRDDL clients don't do namespace-based transformations (we  
>>haven't confirmed this, we're just trying to be accommodating for  
>>folks who want to choose this route.)
>
>I think you are missing the point of the specification.  It is to  
>ensure communication. It is to ensure that when both parties  
>understand a given set of specs, then a precise set of triples is  
>communicated between them.

If saying SHOULD for *any* of the 3 document conformance
options 4, 5, or 6 leaves the reader with the impression that NOT
doing any of the three means the author has NOT intended to
state these triples then I would agree with your concern.
However, these SHOULDs are there for 3 different reasons.  (The
@version reason being IMHO the weakest of the 3 justifications.)

You might be telling us that the spec makes it unclear to the
developer of a tool that consumes XHTML documents whether
or not to rely on the presence of DOCTYPE, @version, or @profile
when determining if the document has triples.   Perhaps this
would be cleared up if the first 3 conformance points used
uppercase MUST rather than lowercase must.  That is,
conformance criteria 2 and 3 in particular declare that the
document is an XHTML1 document and all XHTML1 documents
have these triples.
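
To illustrate that reading, here is a minimal Python sketch (not
anything the spec defines).  It assumes the @version value
"XHTML+RDFa 1.0" and the @profile URI
http://www.w3.org/1999/xhtml/vocab, neither of which is quoted in
this message, and the function names are hypothetical.  DOCTYPE is
left out of the check for brevity.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
# Assumed hint values; they are not quoted in this message.
RDFA_VERSION = "XHTML+RDFa 1.0"
RDFA_PROFILE = "http://www.w3.org/1999/xhtml/vocab"


def describe(source):
    """Report which optional author hints an XHTML document carries."""
    root = ET.fromstring(source)
    head = root.find("{%s}head" % XHTML_NS)
    # @profile is a space-separated list of URIs on the head element.
    profiles = head.get("profile", "").split() if head is not None else []
    return {
        "is_xhtml": root.tag == "{%s}html" % XHTML_NS,
        "has_version": root.get("version") == RDFA_VERSION,
        "has_profile": RDFA_PROFILE in profiles,
    }


def carries_triples(source):
    """Under the reading above, being XHTML1 is what matters.

    The @version and @profile hints are advisory: their absence does
    not mean the author declined to state the triples.
    """
    return describe(source)["is_xhtml"]
```

A consumer written this way treats a document with no hints at all
the same as one that carries both, which is the behaviour the
uppercase MUST on the first conformance points would be meant to
make unambiguous.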

>Here is your argument rephrased:   It is true that some readers will  
>not understand GRDDL, but that is OK as GRDDL is only a SHOULD.  Eh?

No, that is not what the WG is saying at all.  The WG is saying that
*some* current GRDDL implementations are believed to not derive
the transform from the namespace document and therefore authors
SHOULD (for now) include @profile if they want to be certain that
more deployed GRDDL tools will recognize these documents.
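
A rough Python sketch of why the @profile hint matters for such
tools follows.  It is a simplified model, not the GRDDL algorithm:
the lookup tables stand in for dereferencing profile and namespace
documents, and the function name, the `follows_namespace` flag, and
the example.org transform URI are all hypothetical.

```python
def discover_transforms(doc_profiles, root_namespace,
                        profile_index, namespace_index,
                        follows_namespace):
    """Return the transforms a GRDDL-style tool would find.

    profile_index and namespace_index map a profile or namespace URI
    to the transforms its (hypothetical) document advertises.
    """
    found = []
    # Transforms reachable through head/@profile in the document itself.
    for profile in doc_profiles:
        found.extend(profile_index.get(profile, []))
    # Transforms reachable through the namespace document, only for
    # tools that actually follow it.
    if follows_namespace:
        found.extend(namespace_index.get(root_namespace, []))
    # Drop duplicates while keeping discovery order.
    return list(dict.fromkeys(found))


XHTML_NS = "http://www.w3.org/1999/xhtml"
PROFILE = "http://www.w3.org/1999/xhtml/vocab"   # assumed profile URI
TRANSFORM = "http://example.org/rdfa2rdfxml.xsl"  # hypothetical

profile_index = {PROFILE: [TRANSFORM]}
namespace_index = {XHTML_NS: [TRANSFORM]}

# A tool that ignores namespace documents finds nothing without @profile.
without_hint = discover_transforms([], XHTML_NS, profile_index,
                                   namespace_index, follows_namespace=False)
# The same tool finds the transform once the author adds @profile.
with_hint = discover_transforms([PROFILE], XHTML_NS, profile_index,
                                namespace_index, follows_namespace=False)
```

In this model only the profile-only tool depends on the author's
choice; a tool that follows the namespace document finds the
transform either way.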

I would support making SHOULD ... @profile be a "feature at risk"
where the WG reverts to "MAY" at PR based on further implementation
reports.

>The requirement is that ALL conforming clients understand ALL  
>conforming servers.  If some clients understand GRDDL and some don't  
>and some servers put the GRDDL profile in and some don't, then some  
>client-server pairs will fail.

I am uncertain what you mean here by "client" and "server".  A tool
that uses the triples contained in an XHTML+RDFa document
is not required to be a GRDDL processor nor to understand GRDDL.
GRDDL processing is OPTIONAL for RDFa conformance.  The spec
intends to make the triples accessible to those processors who
do understand GRDDL.

>You could say, (1) "All servers MUST put the document GRDDL, and  
>clients MAY use document GRDDL, or may use inherent knowledge of the  
>spec." That would work in all cases.  But you don't want to as it is a  
>pain for the server.

It is likewise a pain to have to put in DOCTYPE, but some use cases
have DOCTYPE as a requirement, so we tell document authors that
there are cases where they SHOULD put in the DOCTYPE (without
specifying what those cases are, as they are implementation-specific).

>You could say, (2) "All servers MUST put the namespace GRDDL, and  
>clients MAY use namespace GRDDL, or may use inherent knowledge of the  
>spec." That would work in all cases.

Not all servers need to touch the namespace.  Only the
authoritative XHTML1 namespace document server(s) need(s)
to do that.  That is us (w3.org).  We're proposing (via this CR) to do it.

>You could say, (3) "All clients must have an inherent knowledge of the  
>spec" and make GRDDL nothing to do with conformance, and that would  
>work too.  But it would mean that a whole set of possible low-hanging  
>implementations which are existing namespace-driven GRDDL  
>implementations would not work, which would be a shame.

a shame, exactly.

>So I suspect you want to go with (2) to define RDFa conformance.

That's what we've done.  Perhaps we need to clarify further those
cases where DOCTYPE and @profile MAY be useful to a document
author.  Unfortunately (from my POV) those are implementation
issues rather than specification issues and  I wish that specs could
be more pure.

>Obviously, people might want to make documents in the short term which  
>work equally well by conforming to the GRDDL spec (document profile  
>method) and by RDFa but that is a distraction.

It's a very real distraction for now.  My interpretation is that the
Working Groups believe this version of the specification should give
document authors practical advice.

>>The Follow-Your-Nose architecture principle is fulfilled by the  
>>XHTML namespace document, which we are updating via the XHTML2 WG,  
>>and the definition of XHTML1.1+RDFa. The GRDDL pointers are there  
>>for convenience, but may be considered redundant.
>
>Eh?  The GRDDL spec should define where a nose-following client looks  
>for GRDDL pointers and where it doesn't.  Again, putting pointers in  
>places where they are not actually read is just confusing.  If the  
>GRDDL spec is not clear on the algorithm, then it must be cleaned up.

I believe we were all -- including you -- clear that the GRDDL spec
says the namespace document is sufficient to establish a transform.

>Assuming (2) above is used, then the namespace document must have a  
>pointer to the transform in a way that is followed by a conforming  
>GRDDL client which knows nothing a priori about RDFa.

The Working Groups agree and it is our/their intention to add this
to the XHTML1 namespace document.  You and I had discussed this
at considerable length and I believed there was no longer any confusion
on this matter.  At least, that is how I communicated it to the WGs.

>>>** Purpose of the profile
...
>>how do I know that an RDFa  
>reader will not extract triples from a pre-RDFa HTML document that  
>were not intended by the author?

This spec does not address HTML documents at all.

This spec says that XHTML documents can now be understood
(by GRDDL processors too) to contain RDF triples that are
semantically what the XHTML2 Working Group says all XHTML1
documents already contain.  *In addition*, authors who put
@about, @property, @resource, and @typeof in their documents
are including RDF triples that extend the previously available
expressivity of XHTML1.  Authors who may have decided to use
these previously unspecified (by XHTML1) @attributes and
claimed that those old documents were still XHTML1 documents
will be surprised. We have not provided any such authors any
additional hook (other than those previously provided by
XHTML1) by which to  say 'no, wait, I lied; those really are not
XHTML1 documents'.

>>>** DTDs
>>>4.1   "There SHOULD be a DOCTYPE declaration in the document prior  
>>>to  the root element."  DTDs are an obsolete technology. Suggest  
>>>the  spec not refer to them in any way.
>>
>>You mentioned that the W3C is working on a DTD-free method of  
>>validation. Hopefully that will be ready in time for RDFa 1.1. For  
>>now, though, we have no choice but to refer to them in order to help  
>>our users put a "valid according to W3C" logo on their pages.
>
>This is completely backwards.  It is the tail wagging the dog.  The  
>validator is a *service* provided to *support* the recommendations,  
>not the other way around.   It is ridiculous to refuse to introduce  
>new technology because it won't validate as old technology.  The  
>validator should be fixed to follow new standards, and until it has  
>been its output should be disregarded.

Tim, with all due respect, this is a matter of internal W3C policy
right now.  As you know, our internal policies put certain constraints
on the W3C Webmaster to accept or refuse a Technical Report based
on a go/no-go from certain tools.

The RDFa specification makes available a DTD for any and all
such customers who -- for their own reasons -- feel a requirement
for DTD-based validation.  The use of SHOULD rather than MAY
is a source of concern.  The W3C Webmaster has made it
abundantly clear that the Working Group cannot publish without
including a DOCTYPE containing this DTD.  That -- in the WG's
opinion -- makes this a SHOULD.  Other XHTML+RDFa document
authors MAY have less concern for what the W3C Webmaster
imposes and NEED NOT include a DOCTYPE.

>If we put together a new team to build a new validator, it would be  
>useful if working groups involved could find volunteers to contribute  
>to the team.  I know this has been a long-standing resource issue, but  
>it is important.

And out of scope for the RDFa effort.

>>>4.3  "A conforming RDFa Processor MAY make available additional   
>>>triples that have been generated using rules not described here,  
>>>but  these triples MUST NOT be made available in the [default  
>>>graph]."
>>>I feel this is an inherently weak method of specifying the meaning  
>>>of  an RDFa document.
>>
>>We only mean this to enable RDFa processors to also process  
>>microformats, if they so choose.
>
>Ah I see.  "Default graph" -- the meaning of the document.  If someone  
>makes some RDFb spec, can it not add more triples still?

Other specs (or some future RDFa spec) are free to declare
semantics for document(s) beyond what RDFa 1.0 specifies,
certainly.  They may choose to put those triples into the "default
graph" or not, as they feel appropriate.

I see that this MUST NOT language could be interpreted as
creating a conflict preventing a single piece of code from
conforming to both RDFa and some other specification.
That was not the Working Groups' intent.  Perhaps the last
part of that sentence might be clearer if rephrased as
"these triples are not authorized by this specification to be
included in the [default graph]."

>>We felt that not leaving this door open might lead some folks to  
>>interpret RDFa as ruling out an RDF interpretation for microformats,  
>>which is not our intention.
>
>
>good
>
>>We could not find a cleaner way to phrase it without making the spec  
>>much more complicated.
>
>well, just writing that explanation helped me -- maybe it could go in  
>the spec informationally.

That's the sort of 'complicatedness' that has cost the Working Groups
considerable debate time.

>>At the same time, we feel that the specification is quite strict and  
>>thus strong about the [default graph], which is the whole enchilada  
>>as far as RDFa is concerned.
>
>Why "default"?  Are there options to change it?  Does it relate to  
>SPARQL in some way (which has a concept of default graph)?  I think the  
>important thing is that the RDFa-derived graph is seen as being  
>asserted by the document, but other things can also be. I think we  
>agree on that.

I believe the Working Groups agree that we do not wish this
specification to rule out the possibility of other specifications
adding to the assertions made by a document.

>  I don't think the text in the spec conveyed it.

Does the "... not authorized by this specification ..." rephrasing
help enough?

Received on Wednesday, 18 June 2008 16:12:07 UTC