Re: [ALL] RDF/A Primer for review - Response to Gary Ng's Comments from Ben Adida on 2006-01-19 (public-swbp-wg@w3.org from January 2006)

From: Ben Adida <ben@mit.edu>
Date: Thu, 19 Jan 2006 11:50:56 -0500
To: SWBPD list <public-swbp-wg@w3.org>
Cc: public-rdf-in-xhtml task force <public-rdf-in-xhtml-tf@w3.org>
Message-Id: <22495544-C439-4BBA-B757-B3A8E7A1FAFC@mit.edu>
Gary,

Thanks very much for your comments.

Please find the task force's responses below. Note that the responses
to Section 1 (Overall Organization) are from me alone, while the
responses to Section 2 (Design of RDF/A itself) are the result of
task force discussions from this week's telecon. We did not have time  
to cover Section 1 in our telecon, so, as the primary author of the  
Primer, I take it upon myself to answer those organizational  
questions. The conceptual questions were discussed with everyone.


> It is a nice piece of work with clear intentions and examples. The
> principle of not duplicating content and embedding RDF content in a way
> browsers can extract is clearly articulated. The proposal with
> individual examples surrounding the photos and camera use case, plus
> showing their RDF equivalent is very informative.

Thanks!

> 1) Overall Organization
> =======================
>
> Would it be beneficial for the reader to have some brief introduction on
> basic constructs, before diving into how they are used in the use case?
> I found it difficult to follow the examples without first having an
> overview understanding of (or knowing the boundary surrounding...) the
> number of ways in which RDF can be specified, and using which
> constructs.
>
> For example, mid way through the doc, I found myself asking the
> question:
>
>     - "How about annotating tables, frames, forms and dynamic content
> from scripts producing menus and flash?"
>
>     - "How do I create chains of triples?" For example, an address of a
> person (Mark in the example), represented by an anonymous node, which in
> turn has statements specifying triples making up the address.
>
> These were answered after checking the RDF/A Syntax [1]. In fact, the
> primer could effortlessly include these concepts only with a little
> introduction to the constructs.


We wrestled with this a bit, and we chose to keep the RDF/A Primer  
short and example-focused, leaving the syntax description to the
RDF/A Syntax Document. The RDF/A Primer is definitely not meant to be
complete, but rather to give a taste of what RDF/A can do for you. If  
the Primer raised questions that led you to the syntax document, then  
that is a successful Primer, in my opinion.

Mark Birbeck is working on a set of even simpler examples to target  
the blogging community. These would help introduce simple metadata  
for HTML authors, before we even bring in RDF triples. I'll talk more  
about this at Monday's telecon.

> 1.1) In the preliminaries, the following sentence may provide some
> initial context to the reader.
>
>      "An XHTML document marked up with RDF/A constructs is a valid XHTML
> Document. RDF/A is about using XHTML compatible constructs and
> extensions to specify RDF 'content'. It is not about embedding RDF
> syntax into XHTML documents."

Good suggestion.  We'll work this into the Primer.

> 1.2) With regard to the above questions I had while reading, I suggest
> a small section right up front to introduce the basics, possibly with
> some simple examples from Section 3:
>
>      "id" and "about" - These are equivalent to rdf:id and rdf:about.
> They can appear as XML attributes in any XHTML construct, including UL,
> LI, DIV, BLOCKQUOTE, P, etc. They essentially declare an RDF subject
> for constructing RDF/A statements, either locally within one document,
> or made reference-able from other documents in the case of "id".
>
>      "link" and "meta" - These are the main constructs to create rdf/a
> statements. Link is used to create a relationship to another URI
> resource, whereas meta is used to attach literal properties. These
> constructs can specify their own subject using "about", or they take the
> immediate parent XHTML element's "about" as subject. In the case where
> the immediate parent does not have qualifying URI, the subject is an
> anonymous rdf node. In the case where the immediate parent is a
> link/meta element without an "about" URI, this statement reifies the
> parent statement.
>
>      "anchor" and "span" - These are alternative constructs to create
> rdf/a statements. While anchor can be used instead of link, span can be
> used instead of meta. Their difference from link and meta is that anchor
> and span apply to an 'inherited' rdf subject. The nesting inheritance
> is identical to how the xmlns attribute is inherited within an XML
> document. If the nesting chain does not contain a qualified subject, the
> document itself is the subject. These constructs allow the RDF content to
> somewhat follow the presentation of the content and thus avoid
> duplication.
>
>      Both meta and span have two ways of specifying the associated
> literal value. One is reusing what would also be displayed (the CDData
> of the element):
>
>      <[span|meta] property="dc:date" type="xsd:date">2006-01-02</span>
>
>      The alternative is to use the 'content' attribute, where the value
> is not the CDData: it is not displayed, and it may differ from the
> CDData.
>
>      <[span|meta] property="dc:date" type="xsd:date"
> content="2006-01-02">XYZ</span>
>
>      In the latter case, if there is no CDData to display, this
> effectively attaches a piece of RDF that does not have any presentation
> consequence. This symmetry is also observed with link and anchor.

This is very useful text, but it seems much more appropriate for the  
RDF/A Syntax document. The Primer's role is really to introduce RDF/A  
to an HTML audience that isn't expected to know much about RDF in the  
first place. Jumping into a description of all the RDF concepts up  
front seems a bit much for a Primer.

Again, I do think this is useful for the Syntax document, though.

> 1.3 Perhaps the primer should be arranged with a target reader in mind.
> For example, to arrange from the point of view of an HTML author wanting
> to find out how to add annotations to his/her docs, in the quickest time
> possible.
>
> Primer How-to:
>
> A) say something about the Doc itself -
>
>   => essentially already in the examples within Section 3.
>   . examples on link and meta,
>   . examples on span and anchor,
>
> B) declaring individual elements contained in a doc, and say something
> about them:
>
>   . Adding an id, currently embedded within section 4.3
>   . The use of about, currently embedded within section 4.2
>   . Then the usual way like above (A) to add metadata.
>   . Referring back to an id within the same doc.
>   . Referring to an id in a different doc.
>
> C) say something about external content that the author has no control
> over
>
>   => Currently 4.1
>   . Annotating href links,
>   . Annotating opaque objects: images, scripts, objects
>
> D) Advanced Metadata
>
>   . using "link" and "meta" with unqualified XHTML elements, creating
> chains of triples.

Yes, this is exactly what we're trying to do with the added examples  
that Mark is developing. The only difference is that we're going to  
stay away from talking too much about RDF graphs, and rather gently  
guide the HTML author from adding simple properties to adding more  
complex RDF statements.

> 1.4 Section 4.3 Qualifying chunks of document.
>
>      The title doesn't quite match the content here. The content is
> about how to declare elements and metadata (of individual cameras on one
> page) for other documents (photo album pages) to reference using ids. It
> is still talking about annotating individual items (Cameras) in the
> document, and not chunks of document as a whole.

A good point. Again, I wonder how much HTML authors will really  
differentiate here, but the language should be clear nevertheless.  
I'll work on this.


> 2) RDF/A itself.
> =============================
>
> I must say at first glance I found the approach extremely confusing. RDF
> markup mixed with presentation markup such as <H1 property="dc:title">.
> But I appreciate that there aren't that many choices to avoid
> duplication of content, and to allow RDF markup within an orthogonal
> presentation structure.

Yes, there is bound to be some confusion at first. We're certainly  
trying to minimize it - thus the limited scope of this primer.  
Hopefully, by the time you finished reading the document, you were  
less confused. But let us know if there are additional things we can  
do (beyond your comments here) to reduce this confusion.

> 2.1) Synchronization issue between metadata on a doc, versus the
> metadata contained within that doc itself.
>
>     Images, files and other media will have their own metadata embedded
> in the future. Certainly another html document will have its own
> metadata. If RDF/A allows metadata to be added locally about a remote
> URL, potentially the local metadata could be out of sync, or worse
> contradict the metadata embedded within the resource itself?

This is indeed an issue of concern, though it appears to be one that  
applies to all RDF serializations, including RDF/XML. Methods for  
resolving such inconsistencies should be devised at a general RDF level.

> 2.2) Consistency
>
>      I suspect there may already be an answer to this: Why are we not
> using rdf prefixed attributes for RDF/A elements/attributes? rdf:id?
> rdf:about? rdf:property, rdf:resource, rdf:description? This relates to
> Pat's [2] comments about future migration from RDF to XHTML too.

The most important point here is that the task force tried hard to  
use RDF/XML syntax for RDF/A, but this failed because of RDF/XML's  
striped syntax. Note also that reusing existing HTML attributes turns  
out to make for a very good migration path for HTML authors (who  
constitute the main target of this work.)

As per our response to Pat's comments, the right way to migrate large  
chunks of RDF/XML into HTML is to use a <link rel="meta"> element.  
The hard part of the migration requires determining which rendered  
data corresponds to which RDF property, and no amount of syntax can  
help there: it's a semantic merging operation.

> 2.3)  How about inheriting metadata through nested elements?
>
> > In [2] Pat Hayes wrote:
> >
> > Also, giving an id to a whole RDF (sub)graph fits naturally
> > with the 'named graph' idea, unlike giving an id to every triple.
> >
>
> This is interesting and would qualify as "Qualifying chunks of
> document". For example, using some special non-presentational XHTML
> elements to "group" metadata together?

There may be a misunderstanding here. There *is* nesting in RDF/A:  
you can inherit the about attribute as far up/down the DOM hierarchy  
as you'd like. Is that what you're after?
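
To make the inheritance concrete, here is a rough Python sketch (an
illustration only, not part of the RDF/A specification or any existing
parser). It walks an XHTML fragment and lets each element inherit the
nearest enclosing "about" value. The function name extract, the camera
URI, and the use of an empty string to stand for "this document" are
assumptions made for the example, and only literal properties are handled.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: the outer div sets the subject, one span overrides it.
XHTML = """
<div about="/user/markb/photo/23456">
  <ul>
    <li><span property="dc:title">Sunset in Nice</span></li>
    <li><span about="/user/markb/camera" property="dc:title">My camera</span></li>
  </ul>
  <p><span property="dc:date" content="2006-01-02">January 2nd</span></p>
</div>
"""

def extract(element, subject=""):
    """Yield (subject, property, literal) triples, inheriting `about`."""
    # An element's own `about` wins; otherwise keep the inherited subject.
    subject = element.get("about", subject)
    prop = element.get("property")
    if prop is not None:
        # Prefer the `content` attribute, else fall back to the element text.
        value = element.get("content")
        if value is None:
            value = "".join(element.itertext()).strip()
        yield (subject, prop, value)
    for child in element:
        yield from extract(child, subject)

triples = list(extract(ET.fromstring(XHTML)))
for triple in triples:
    print(triple)
```

The two spans without their own "about" pick up the photo URI from the
div, however deeply they are nested, while the camera span overrides it
locally.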

> 2.4) The <img> element not allowing child elements makes the whole RDF/A
> approach rather uneven. Is <img> the only XHTML element that does not
> allow child elements? Could XHTML2 be changed to allow these metadata to
> be the solely allowed child elements?
>
>     <li> <img src="/user/markb/photo/23456" />,
>       <span about="/user/markb/photo/23456" property="dc:title">
>         Sunset in Nice
>       </span>
>     </li>
>
>     Why don't we use the same approach instead of using <span>?
>
>     <img src="/user/markb/photo/23456" property="dc:title">
>       Sunset in Nice
>     </img>
>
>     Of course, the subject is now the src attribute. But we can still
> make this work by saying that, for img, the "about" is the "src"
> attribute. See 2.5 below.

This turns out to be one of our outstanding issues that we are still  
finalizing. [1]

We are currently leaning towards the syntax you mention, where the  
content of an image element could include metadata about that image  
and the SRC attribute would be the subject. Steven is checking with  
the XHTML working group to ensure that this does not cause any  
unforeseen complications. However, what's important to note is that,  
even if this syntax is adopted, the "Sunset in Nice" text in your  
above example would only be rendered in a browser if there is a  
failure to load the image.

This seems consistent with the fact that the image is really an
external resource, and any internal HTML element value should really
be considered ALT text from the point of view of rendering. Note
that the same would apply to OBJECT elements.
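
Purely as a sketch of the syntax under discussion (it is not yet settled,
so treat every detail as an assumption), the following Python shows how
an extractor might use SRC as the subject when an image has no "about".
The function name image_triples is made up for the example, and the
markup is the one from your comment.

```python
import xml.etree.ElementTree as ET

# The img-with-content form from the comment above (not valid XHTML1).
SNIPPET = """
<li>
  <img src="/user/markb/photo/23456" property="dc:title">
    Sunset in Nice
  </img>
</li>
"""

def image_triples(root):
    """Collect (subject, property, literal) triples from img elements."""
    triples = []
    for img in root.iter("img"):
        prop = img.get("property")
        if prop:
            # Assumption: an explicit `about` wins, otherwise `src` is the subject.
            subject = img.get("about") or img.get("src")
            # Collapse the whitespace around the element content.
            literal = " ".join("".join(img.itertext()).split())
            triples.append((subject, prop, literal))
    return triples

result = image_triples(ET.fromstring(SNIPPET))
print(result)
```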

> 2.5) Flexible subject/object referrals suggestion.
>
> Motivation 1:
>
>      One thing that RDF/A has not considered is the annotation of HTML
> forms. Imagine software agents understanding the form semantically and
> automagically carrying out complex form filling (beyond username,
> passwords and personal information) on behalf of the user. I believe
> forms' annotations will be extremely important for the semantic web.

Forms annotation is indeed important, and is already possible with  
the current RDF/A. Remember that any XHTML element can be annotated.  
What we should do is add an example in the primer to show how this  
can be done, something along the lines of (this is XHTML1, just to  
explain the principle):

======
<form method="post" action="/foobar">
    <meta property="dc:description" content="Login Form" />

    <input type="text" name="username">
       <meta property="dc:title" content="username" />
    </input>
...

</form>
======

With proper annotations, browsers could become much smarter about  
what they do with these forms, as you mention.
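
As a rough illustration of what an agent could do with such markup, here
is a small Python sketch that lists each meta annotation next to the
element that contains it. It assumes that a meta without "about"
describes its immediate parent, and it labels that parent by its "name"
attribute or tag name; both choices, and the function name annotations,
are simplifications for this example rather than RDF/A rules.

```python
import xml.etree.ElementTree as ET

# The form example from above, without the elided part.
FORM = """
<form method="post" action="/foobar">
    <meta property="dc:description" content="Login Form" />
    <input type="text" name="username">
       <meta property="dc:title" content="username" />
    </input>
</form>
"""

def annotations(root):
    """Return (parent label, property, content) for every annotated meta."""
    found = []
    for parent in root.iter():
        for child in parent:
            if child.tag == "meta" and child.get("property"):
                # Hypothetical labelling: prefer the `name` attribute, else the tag.
                label = parent.get("name") or parent.tag
                found.append((label, child.get("property"), child.get("content")))
    return found

result = annotations(ET.fromstring(FORM))
for row in result:
    print(row)
```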


> Motivation 2:
>
>      The use of content, href, about, id, are ways to specify the
> subject and the object/value of the rdf statements. I feel that they are
> somewhat restrictive, especially when the author acknowledges that there
> is still some unavoidable duplication of content.
>
>      To further reduce duplication of URIs and literals, as well as to
> cater for annotating HTML forms in the future, it would seem a more
> flexible approach may be possible.
>
>      Assuming the subject and object of the rdf statement can be taken
> from existing XHTML (or XML) element attributes, one can completely
> avoid duplication by 'referring' to those attributes from another, for
> example:
>
>      . <img src="http://....." attrAsStmtSubject="src">
>
>      . Normally the attrAsStmtSubject defaults to "about" and "id"
>
>      . <a href="http://....." attrAsStmtObject="href">
>
>      . Normally the attrAsStmtObject defaults to "href" and thus could
> be unspecified.
>
>      . Similarly attrAsStmtValue="content", attrAsStmtValue="CDData", or
> any other attributes/text element.
>
>       Although I have not worked out the details, I believe these
> three new attributes (attrAsStmtSubject and attrAsStmtValue/Object) are
> compatible with RDF/A concepts, and I believe they will allow forms to
> be annotated without much content duplication.

The task force feels that much of the motivation for these changes
could be addressed without any additional complexity (see form
annotation above). Certainly, your suggestion would further reduce
data duplication, but only with significant added complexity in
RDF/A. Extracting triples would become far more complicated, as the
values of certain attributes would affect the actual parsing of the
rest of the document. Thus, at this point, we would not want to adopt
this recommendation.


Thanks for some very useful and insightful comments. Please let us  
know if these answers give rise to new questions.

-Ben Adida
ben@mit.edu

[1] http://www.w3.org/2001/sw/BestPractices/HTML/2005-current-issues#src
Received on Thursday, 19 January 2006 16:51:15 UTC