Re: [ALL] RDF/A Primer for review from Gary Ng on 2006-01-16 (public-swbp-wg@w3.org from January 2006)

From: Gary Ng <Gary.Ng@cerebra.com>
Date: Mon, 16 Jan 2006 01:30:45 -0800
To: "SWBPD list" <public-swbp-wg@w3.org>
Message-ID: <D3824B3639761949B599477A08C6A0180119E535@wyoming.ad.networkinference.com>
>Guus and team,
>
>Happy New Year!
>
>Please find the latest version of the RDF/A Primer at:
>http://www.w3.org/2001/sw/BestPractices/HTML/2006-01-15-rdfa-primer
>
>This document *may* change in small ways before Monday's telecon, 
>but will remain stable for the week that follows the telecon. We're 
>looking forward to comments from our two reviewers and from anyone 
>else in the working group who has the time.
>

It is a nice piece of work with clear intentions and examples. The
principle of not duplicating content and embedding RDF content in a way
browsers can extract is clearly articulated. The presentation, with
individual examples built around the photo and camera use case, each
shown alongside its RDF equivalent, is very informative. 

My review can be divided into two main parts: 

    1) Organization of the Primer, and 
    2) Comments/Questions on the design of RDF/A itself.

Details below, 

Cheers, 

Gary



1) Overall Organization 
=======================

Would it be beneficial for the reader to have a brief introduction to the
basic constructs before diving into how they are used in the use case?
I found it difficult to follow the examples without first having an
overview of (or knowing the boundaries of...) the ways in which RDF can
be specified, and which constructs are used for each.

For example, midway through the doc, I found myself asking the
questions: 

    - "How about annotating tables, frames, forms and dynamic content
from scripts producing menus and flash?"

    - "How do I create chains of triples?" For example, an address of a
person (Mark in the example), represented by an anonymous node, which in
turn is the subject of the triples making up the address. 

These were answered after checking the RDF/A Syntax [1]. In fact, the
primer could easily cover these concepts with only a brief
introduction to the constructs.


Detailed suggestions:

------------------------------

1.1) In the preliminaries, the following sentence may provide some
initial context to the reader. 

     "An XHTML document marked up with RDF/A constructs is a valid XHTML
Document. RDF/A is about using XHTML compatible constructs and
extensions to specify RDF 'content'. It is not about embedding RDF
syntax into XHTML documents."

------------------------------

1.2) With regard to the questions above that I had while reading, I
suggest a small section right up front to introduce the basics, possibly
with some simple examples from Section 3:

     "id" and "about" - These are equivalent to rdf:ID and rdf:about.
They can appear as XML attributes on any XHTML element, including UL,
LI, DIV, BLOCKQUOTE, P ... etc. They essentially declare an RDF subject
for constructing RDF/A statements, either locally within one document,
or made reference-able from other documents in the case of "id". 

     "link" and "meta" - These are the main constructs to create RDF/A
statements. Link is used to create a relationship to another URI
resource, whereas meta is used to attach literal properties. These
constructs can specify their own subject using "about", or they take the
immediate parent XHTML element's "about" as subject. In the case where
the immediate parent does not have a qualifying URI, the subject is an
anonymous RDF node. In the case where the immediate parent is a
link/meta element without an "about" URI, this statement reifies the
parent statement.

     "anchor" and "span" - These are alternative constructs to create
RDF/A statements. Anchor can be used instead of link, and span can be
used instead of meta. They differ from link and meta in that anchor
and span apply to an 'inherited' RDF subject. The nesting inheritance
is identical to how the xmlns attribute is inherited within an XML
document. If the nesting chain does not contain a qualified subject, the
document itself is the subject. These constructs allow the RDF content
to somewhat follow the presentation of the content and thus avoid
duplication.

     Both meta and span have two ways of specifying the associated
literal value. One is to reuse what would also be displayed (the CDData
of the element):

     <[span|meta] property="dc:date" type="xsd:date">2006-01-02</[span|meta]>

     The alternative is to use the 'content' attribute, in which case
the literal value is not taken from the CDData: it is not displayed, and
it may differ from the CDData. 

     <[span|meta] property="dc:date" type="xsd:date"
content="2006-01-02">XYZ</[span|meta]>

     In the latter case, if there is no CDData to display, this
effectively attaches a piece of RDF that does not have any presentation
consequence. This symmetry is also observed with link and anchor. 
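
     To make the rules above concrete, here is a minimal Python sketch
of how a processor might pick the subject and the literal value for a
span. It is only an illustration of my reading of the syntax draft, not
anything taken from it: the function names (find_subject, literal_value,
span_triples) are made up, namespaces and datatypes are ignored for
brevity, and the empty string is my stand-in for "the document itself".

```python
import xml.etree.ElementTree as ET

DOCUMENT = ""  # assumption: the empty string stands for "the document itself"


def find_subject(elem, parents):
    """Return the nearest 'about' value, starting at elem and walking up."""
    node = elem
    while node is not None:
        about = node.get("about")
        if about is not None:
            return about
        node = parents.get(node)  # None once we step past the root
    return DOCUMENT


def literal_value(elem):
    """Prefer the 'content' attribute; otherwise use the displayed text."""
    content = elem.get("content")
    if content is not None:
        return content
    return "".join(elem.itertext()).strip()


def span_triples(xhtml):
    """Collect (subject, property, literal) for each span with a 'property'."""
    root = ET.fromstring(xhtml)
    # ElementTree has no parent pointers, so build a child -> parent map.
    parents = {child: parent for parent in root.iter() for child in parent}
    triples = []
    for elem in root.iter("span"):
        prop = elem.get("property")
        if prop is not None:
            triples.append((find_subject(elem, parents), prop, literal_value(elem)))
    return triples


SAMPLE = """
<div>
  <span property="dc:date" content="2006-01-02">XYZ</span>
  <p about="/user/markb/photo/23456">
    <span property="dc:title">Sunset in Nice</span>
  </p>
</div>
"""

for triple in span_triples(SAMPLE):
    print(triple)
```

     In the sample, the first span has no "about" anywhere above it, so
its subject falls back to the document and its value comes from
'content' rather than from the displayed "XYZ"; the second span inherits
its subject from the enclosing P element.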

------------------------------

1.3) Perhaps the primer should be arranged with a target reader in mind.
For example, it could be arranged from the point of view of an HTML
author who wants to find out how to add annotations to his/her docs in
the shortest time possible.

Primer How-to:

A) say something about the Doc itself - 

  => essentially already in the examples within Section 3.
  . examples on link and meta, 
  . examples on span and anchor, 

B) declaring individual elements contained in a doc, and say something
about them: 

  . Adding an id, currently embedded within section 4.3
  . The use of about, currently embedded within section 4.2
  . Then the usual way like above (A) to add metadata. 
  . Referring back to an id within the same doc.
  . Referring to an id in a different doc.

C) say something about external content that the author has no control
over

  => Currently 4.1
  . Annotating href links, 
  . Annotating opaque objects: images, scripts, objects
  
D) Advanced Metadata 

  . using "link" and "meta" with unqualified XHTML elements, creating
chains of triples. 

------------------------------

1.4) Section 4.3 Qualifying chunks of document. 

     The title doesn't quite match the content here. The content is
about how to declare elements and metadata (of individual cameras on one
page) for other documents (photo album pages) to reference using ids. It
is still talking about annotating individual items (Cameras) in the
document, and not chunks of document as a whole. 

------------------------------


2) RDF/A itself
===============

I must say that at first glance I found the approach extremely
confusing: RDF markup mixed with presentation markup, such as
<H1 property="dc:title">. But I appreciate that there aren't many
options for avoiding duplication of content while allowing RDF markup
within an orthogonal presentation structure.

I have the following questions regarding the design itself.

------------------------------

2.1) Synchronization issue between metadata on a doc, versus the
metadata contained within that doc itself. 

    Images, files and other media will have their own metadata embedded
in the future. Certainly another HTML document will have its own
metadata. If RDF/A allows metadata about a remote URL to be added
locally, couldn't the local metadata fall out of sync with, or worse,
contradict the metadata embedded within the resource itself?

------------------------------

2.2) Consistency

     I suspect there may already be an answer to this: Why are we not
using rdf prefixed attributes for RDF/A elements/attributes? rdf:id?
rdf:about? rdf:property, rdf:resource, rdf:description? This relates to
Pat's [2] comments about future migration from RDF to XHTML too.

------------------------------

2.3)  How about inheriting metadata through nested elements?

> In [2] Pat Hayes wrote:
>
> Also, giving an id to a whole RDF (sub)graph fits naturally 
> with the 'named graph' idea, unlike giving an id to every triple.
> 

This is interesting and would qualify as "Qualifying chunks of
document". For example, using some special non-presentational XHTML
elements to "group" metadata together?

------------------------------

2.4) The <img> element not allowing child elements makes the whole RDF/A
approach rather uneven. Is <img> the only XHTML element that does not
allow child elements? Could XHTML2 be changed so that these metadata
elements are the only child elements allowed?

    <li> <img src="/user/markb/photo/23456" />,
      <span about="/user/markb/photo/23456" property="dc:title">
        Sunset in Nice
      </span>
    </li>
  
    Why don't we use the same approach instead of using <span>?

    <img src="/user/markb/photo/23456" property="dc:title">
      Sunset in Nice
    </img>

    Of course, the subject is now given by src="". But we can still make
this work by saying that, for img, the "about" is the "src" attribute.
See 2.5 below.

------------------------------

2.5) Flexible subject/object referrals suggestion.

Motivation 1:

     One thing that RDF/A has not considered is the annotation of HTML
forms. Imagine software agents understanding a form semantically and
automagically carrying out complex form filling (beyond usernames,
passwords and personal information) on behalf of the user. I believe
form annotations will be extremely important for the semantic web.

Motivation 2:

     The content, href, about and id attributes are the ways to specify
the subject and the object/value of the RDF statements. I feel that they
are somewhat restrictive, especially when the author acknowledges that
there is still some unavoidable duplication of content.

     To further reduce duplication of URIs and literals, as well as to
cater for annotating HTML forms in the future, it would seem a more
flexible approach may be possible. 

     Assuming the subject and object of the rdf statement can be taken
from existing XHTML (or XML) element attributes, one can completely
avoid duplication by 'referring' to those attributes from another, for
example:

     . <img src="http://....." attrAsStmtSubject="src">

     . Normally the attrAsStmtSubject defaults to "about" and "id"

     . <a href="http://....." attrAsStmtObject="href">

     . Normally the attrAsStmtObject defaults to "href" and thus could
be unspecified.

     . Similarly attrAsStmtValue="content", attrAsStmtValue="CDData", or
any other attributes/text element.

      Although I have not worked out the details, I believe these
three new attributes (attrAsStmtSubject and attrAsStmtValue/Object) are
compatible with RDF/A concepts, and I believe they will allow forms to
be annotated without much content duplication.
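
      As a rough sketch of the idea, the following Python shows how a
processor could take the subject and object of a statement from
whichever attribute the author names. Nothing here is part of RDF/A:
attrAsStmtSubject and attrAsStmtObject are the attributes proposed
above, the fallback order ("about", then "id") is my assumption, and the
helper names and the example.org URL are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Assumed defaults, following the bullets above.
DEFAULT_SUBJECT_ATTRS = ("about", "id")
DEFAULT_OBJECT_ATTR = "href"


def resolve_subject(elem):
    """Read the subject from the attribute named by attrAsStmtSubject."""
    named = elem.get("attrAsStmtSubject")
    if named is not None:
        return elem.get(named)
    # No explicit pointer: fall back to "about", then "id".
    # (A real processor would presumably turn an id into a fragment URI.)
    for attr in DEFAULT_SUBJECT_ATTRS:
        value = elem.get(attr)
        if value is not None:
            return value
    return None


def resolve_object(elem):
    """Read the object from the attribute named by attrAsStmtObject."""
    named = elem.get("attrAsStmtObject", DEFAULT_OBJECT_ATTR)
    return elem.get(named)


img = ET.fromstring(
    '<img src="/user/markb/photo/23456" attrAsStmtSubject="src" />'
)
anchor = ET.fromstring('<a href="http://example.org/camera">My camera</a>')

print(resolve_subject(img))    # subject taken from src, nothing duplicated
print(resolve_object(anchor))  # object defaults to href
```

      The point of the indirection is that the URI is written once, in
the attribute XHTML already requires, and the annotation merely points
at it.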


[1] http://www.w3.org/2001/sw/BestPractices/HTML/2005-rdfa-syntax
[2] http://lists.w3.org/Archives/Public/public-swbp-wg/2006Jan/0025.html
Received on Monday, 16 January 2006 09:29:50 UTC