Re: [ALL] RDF/A Primer for review - Response to Gary Ng's Comments from Guus Schreiber on 2006-01-23 (public-swbp-wg@w3.org from January 2006)

From: Guus Schreiber <schreiber@cs.vu.nl>
Date: Mon, 23 Jan 2006 01:32:18 +0100
To: Ben Adida <ben@mit.edu>
CC: SWBPD list <public-swbp-wg@w3.org>, public-rdf-in-xhtml task force <public-rdf-in-xhtml-tf@w3.org>
Message-ID: <43D42412.6030803@cs.vu.nl>
Ben,

Are you in a position to propose the draft for WG publication?

Best,
Guus


Ben Adida wrote:
> 
> 
> Gary,
> 
> Thanks very much for your comments.
> 
> Please find the task force's responses below. Note that the responses
> to Section 1 (Overall Organization) are from me alone, while the
> responses to Section 2 (Design of RDF/A itself) are the result of task
> force discussions from this week's telecon. We did not have time to
> cover Section 1 in our telecon, so, as the primary author of the
> Primer, I take it upon myself to answer those organizational questions.
> The conceptual questions were discussed with everyone.
> 
> 
>> It is a nice piece of work with clear intentions and examples. The
>> principle of not duplicating content and embedding RDF content in a way
>> browsers can extract is clearly articulated. The proposal with
>> individual examples surrounding the photos and camera use case, plus
>> showing their RDF equivalent is very informative.
> 
> 
> Thanks!
> 
>> 1) Overall Organization
>> =======================
>>
>> Would it be beneficial for the reader to have some brief introduction on
>> basic constructs, before diving into how they are used in the use case?
>> I found it difficult to follow the examples without first having an
>> overview understanding of (or knowing the boundary surrounding...) the
>> number of ways in which RDF can be specified, and using which
>> constructs.
>>
>> For example, mid way through the doc, I found myself asking the
>> question:
>>
>>     - "How about annotating tables, frames, forms and dynamic content
>> from scripts producing menus and flash?"
>>
>>     - "How do I create chains of triples?" For example, an address of a
>> person (Mark in the example), represented by an anonymous node, which in
>> turn has statements specifying triples making up the address.
>>
>> These were answered after checking the RDF/A Syntax [1]. In fact, the
>> primer could effortlessly include these concepts only with a little
>> introduction to the constructs.
> 
> 
> 
> We wrestled with this a bit, and we chose to keep the RDF/A Primer
> short and example-focused, leaving the syntax description to the RDF/A
> Syntax Document. The RDF/A Primer is definitely not meant to be
> complete, but rather to give a taste of what RDF/A can do for you. If
> the Primer raised questions that led you to the syntax document, then
> that is a successful Primer, in my opinion.
> 
> Mark Birbeck is working on a set of even simpler examples to target the
> blogging community. These would help introduce simple metadata for HTML
> authors, before we even bring in RDF triples. I'll talk more about this
> at Monday's telecon.
> 
>> 1.1) In the preliminaries, the following sentence may provide some
>> initial context to the reader.
>>
>>      "An XHTML document marked up with RDF/A constructs is a valid XHTML
>> Document. RDF/A is about using XHTML compatible constructs and
>> extensions to specify RDF 'content'. It is not about embedding RDF
>> syntax into XHTML documents."
> 
> 
> Good suggestion.  We'll work this into the Primer.
> 
>> 1.2) With regard to the above questions I had while reading, I suggest
>> a small section right up front to introduce the basics, possibly with
>> some simple examples from Section 3:
>>
>>      "id" and "about" - These are equivalent to rdf:id and rdf:about.
>> They can appear as XML attributes on any XHTML element, including UL,
>> LI, DIV, BLOCKQUOTE, P, etc. They essentially declare an RDF subject
>> for constructing RDF/A statements, either locally within one document,
>> or made referenceable from other documents in the case of "id".
>>
>>      "link" and "meta" - These are the main constructs to create RDF/A
>> statements. Link is used to create a relationship to another URI
>> resource, whereas meta is used to attach literal properties. These
>> constructs can specify their own subject using "about", or they take the
>> immediate parent XHTML element's "about" as subject. In the case where
>> the immediate parent does not have a qualifying URI, the subject is an
>> anonymous RDF node. In the case where the immediate parent is a
>> link/meta element without an "about" URI, this statement reifies the
>> parent statement.
>>
>>      "anchor" and "span" - These are alternative constructs to create
>> RDF/A statements. Anchor can be used instead of link, and span can be
>> used instead of meta. They differ from link and meta in that anchor
>> and span apply to an 'inherited' RDF subject. The nesting inheritance
>> is identical to how the xmlns attribute is inherited within an XML document.
>> If the nesting chain does not contain a qualified subject, the document
>> itself is the subject. These constructs allow the RDF content to
>> somewhat follow the presentation of the content and thus avoid
>> duplication.
>>
>>      Meta and span each have two ways of specifying the associated
>> literal value. One is reusing what would also be displayed (the CDData
>> of the element):
>>
>>      <[span|meta] property="dc:date" type="xsd:date">2006-01-02</span>
>>
>>      The alternative is to use the 'content' attribute, where the value
>> is not the CDData, so it is not displayed and can differ from the
>> CDData.
>>
>>      <[span|meta] property="dc:date" type="xsd:date"
>> content="2006-01-02">XYZ</span>
>>
>>      In the latter case, if there is no CDData to display, this
>> effectively attaches a piece of RDF that does not have any presentation
>> consequence. This symmetry is also observed with link and anchor.
> 
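A minimal sketch, in Python, of the two ways of supplying a literal described above: the 'content' attribute is used when present, otherwise the element's character data. The helper name literal_value is hypothetical, the precedence rule is an assumption read off the description above rather than taken from the RDF/A Syntax document, and the type attribute is left out for brevity.

```python
import xml.etree.ElementTree as ET


def literal_value(element):
    """Return the literal carried by a span/meta-style element.

    Assumption: an explicit 'content' attribute takes precedence over
    the displayed character data of the element.
    """
    content = element.get("content")
    if content is not None:
        return content
    # Fall back to the text that a browser would display.
    return "".join(element.itertext()).strip()


# Literal taken from the displayed text.
inline = ET.fromstring('<span property="dc:date">2006-01-02</span>')
# Literal taken from 'content'; the displayed text "XYZ" is ignored.
hidden = ET.fromstring(
    '<span property="dc:date" content="2006-01-02">XYZ</span>'
)
```

Under these assumptions both elements yield the same literal, even though only the first one displays it.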
> 
> This is very useful text, but it seems much more appropriate for the  
> RDF/A Syntax document. The Primer's role is really to introduce RDF/A  
> to an HTML audience that isn't expected to know much about RDF in the  
> first place. Jumping into a description of all the RDF concepts up  
> front seems a bit much for a Primer.
> 
> Again, I do think this is useful for the Syntax document, though.
> 
>> 1.3 Perhaps the primer should be arranged with a target reader in mind.
>> For example, it could be arranged from the point of view of an HTML
>> author wanting to find out how to add annotations to his/her docs, in
>> the quickest time possible.
>>
>> Primer How-to:
>>
>> A) say something about the Doc itself -
>>
>>   => essentially already in the examples within Section 3.
>>   . examples on link and meta,
>>   . examples on span and anchor,
>>
>> B) declaring individual elements contained in a doc, and say something
>> about them:
>>
>>   . Adding an id, currently embedded within section 4.3
>>   . The use of about, currently embedded within section 4.2
>>   . Then the usual way like above (A) to add metadata.
>>   . Referring back to an id within the same doc.
>>   . Referring to an id in a different doc.
>>
>> C) say something about external content that the author has no control
>> over
>>
>>   => Currently 4.1
>>   . Annotating href links,
>>   . Annotating opaque objects: images, scripts, objects
>>
>> D) Advanced Metadata
>>
>>   . using "link" and "meta" with unqualified XHTML elements, creating
>> chains of triples.
> 
> 
> Yes, this is exactly what we're trying to do with the added examples  
> that Mark is developing. The only difference is that we're going to  
> stay away from talking too much about RDF graphs, and rather gently  
> guide the HTML author from adding simple properties to adding more  
> complex RDF statements.
> 
>> 1.4 Section 4.3 Qualifying chunks of document.
>>
>>      The title doesn't quite match the content here. The content is
>> about how to declare elements and metadata (of individual cameras on one
>> page) for other documents (photo album pages) to reference using ids. It
>> is still talking about annotating individual items (Cameras) in the
>> document, and not chunks of document as a whole.
> 
> 
> A good point. Again, I wonder how much HTML authors will really
> differentiate here, but the language should be clear nevertheless. I'll
> work on this.
> 
> 
>> 2) RDF/A itself.
>> =============================
>>
>> I must say at first glance I found the approach extremely confusing: RDF
>> markup mixed with presentation markup such as <H1 property="dc:title">.
>> But I appreciate that there aren't that many choices to avoid
>> duplication of content, and to allow RDF markup within an orthogonal
>> presentation structure.
> 
> 
> Yes, there is bound to be some confusion at first. We're certainly
> trying to minimize it - thus the limited scope of this primer.
> Hopefully, by the time you finished reading the document, you were less
> confused. But let us know if there are additional things we can do
> (beyond your comments here) to reduce this confusion.
> 
>> 2.1) Synchronization issue between metadata on a doc, versus the
>> metadata contained within that doc itself.
>>
>>     Images, files and other media will have their own metadata embedded
>> in the future. Certainly another html document will have its own
>> metadata. If RDF/A allows metadata to be added locally about a remote
>> URL, potentially the local metadata could be out of sync, or worse
>> contradict the metadata embedded within the resource itself?
> 
> 
> This is indeed an issue of concern, though it appears to be one that  
> applies to all RDF serializations, including RDF/XML. Methods for  
> resolving such inconsistencies should be devised at a general RDF level.
> 
>> 2.2) Consistency
>>
>>      I suspect there may already be an answer to this: Why are we not
>> using rdf prefixed attributes for RDF/A elements/attributes? rdf:id?
>> rdf:about? rdf:property, rdf:resource, rdf:description? This relates to
>> Pat's [2] comments about future migration from RDF to XHTML too.
> 
> 
> The most important point here is that the task force tried hard to use
> RDF/XML syntax for RDF/A, but this failed because of RDF/XML's striped
> syntax. Note also that reusing existing HTML attributes turns out to
> make for a very good migration path for HTML authors (who constitute
> the main target of this work).
> 
> As per our response to Pat's comments, the right way to migrate large
> chunks of RDF/XML into HTML is to use a <link rel="meta"> element. The
> hard part of the migration requires determining which rendered data
> corresponds to which RDF property, and no amount of syntax can help
> there: it's a semantic merging operation.
> 
>> 2.3)  How about inheriting metadata through nested elements?
>>
>> > In [2] Pat Hayes wrote:
>> >
>> > Also, giving an id to a whole RDF (sub)graph fits naturally
>> > with the 'named graph' idea, unlike giving an id to every triple.
>> >
>>
>> This is interesting and would qualify as "Qualifying chunks of
>> document". For example, using some special non-presentational XHTML
>> elements to "group" metadata together?
> 
> 
> There may be a misunderstanding here. There *is* nesting in RDF/A: you
> can inherit the about attribute as far up/down the DOM hierarchy as
> you'd like. Is that what you're after?
> 
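The inheritance of the about attribute down the DOM can be sketched as a recursive walk. This is only an illustration in Python under stated assumptions, not the task force's extraction algorithm: the function name extract_triples is hypothetical, the empty string stands in for "the document itself" as the default subject, and only literal values taken from element text are handled.

```python
import xml.etree.ElementTree as ET


def extract_triples(element, subject=""):
    """Yield (subject, property, literal) triples from an element tree.

    Assumption: an element without its own 'about' inherits the subject
    of its nearest ancestor that has one; "" denotes the document itself.
    """
    subject = element.get("about", subject)
    prop = element.get("property")
    if prop is not None:
        yield (subject, prop, "".join(element.itertext()).strip())
    for child in element:
        yield from extract_triples(child, subject)


doc = ET.fromstring(
    '<div about="/user/markb/photo/23456">'
    "<ul><li>"
    '<span property="dc:title">Sunset in Nice</span>'
    "</li></ul>"
    "</div>"
)
triples = list(extract_triples(doc))
```

Here the span sits three levels below the div, yet its statement is about the photo named in the div's about attribute.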
>> 2.4) The <img> element not allowing child elements makes the whole RDF/A
>> approach rather uneven. Is <img> the only XHTML element that does not
>> allow child elements? Could XHTML2 be changed to allow these metadata to
>> be the only allowed child elements?
>>
>>     <li> <img src="/user/markb/photo/23456" />,
>>       <span about="/user/markb/photo/23456" property="dc:title">
>>         Sunset in Nice
>>       </span>
>>     </li>
>>
>>     Why don't we use the same approach instead of using <span>?
>>
>>     <img src="/user/markb/photo/23456" property="dc:title">
>>       Sunset in Nice
>>     </img>
>>
>>     Of course, the subject is now the src="" value. But we can still make
>> this work by saying that, for img, the "about" is the "src" attribute.
>> See 2.5 below.
> 
> 
> This turns out to be one of our outstanding issues that we are still  
> finalizing. [1]
> 
> We are currently leaning towards the syntax you mention, where the
> content of an image element could include metadata about that image and
> the SRC attribute would be the subject. Steven is checking with the
> XHTML working group to ensure that this does not cause any unforeseen
> complications. However, what's important to note is that, even if this
> syntax is adopted, the "Sunset in Nice" text in your above example
> would only be rendered in a browser if there is a failure to load the
> image.
> 
> This seems consistent with the fact that the image is really an
> external resource, and any internal HTML element value should really be
> considered an ALT tag from the point of view of rendering. Note that
> the same would apply to OBJECT elements.
> 
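A rough Python sketch of the rule the task force says it is leaning towards, where an img element's src serves as the subject. The issue was still open at the time of this message, so this is illustrative only: the function name subject_of is hypothetical, and letting an explicit about take precedence over src is an assumption, not something stated in the thread.

```python
import xml.etree.ElementTree as ET


def subject_of(element, inherited=""):
    """Pick the subject for statements made on an element.

    Assumptions: an explicit 'about' wins; otherwise an img element
    uses its 'src'; otherwise the inherited subject applies.
    """
    about = element.get("about")
    if about is not None:
        return about
    if element.tag == "img" and element.get("src") is not None:
        return element.get("src")
    return inherited


img = ET.fromstring(
    '<img src="/user/markb/photo/23456" property="dc:title">'
    "Sunset in Nice"
    "</img>"
)
triple = (subject_of(img), img.get("property"), (img.text or "").strip())
```

The resulting statement has the image URI as subject and "Sunset in Nice" as the dc:title literal.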
>> 2.5) Flexible subject/object referrals suggestion.
>>
>> Motivation 1:
>>
>>      One thing that RDF/A has not considered is the annotation of HTML
>> forms. Imagine software agents understanding the form semantically and
>> automagically carrying out complex form filling (beyond username, passwords
>> and personal information) on behalf of the user. I believe forms'
>> annotations will be extremely important for the semantic web.
> 
> 
> Forms annotation is indeed important, and is already possible with the
> current RDF/A. Remember that any XHTML element can be annotated. What
> we should do is add an example in the primer to show how this can be
> done, something along the lines of (this is XHTML1, just to explain the
> principle):
> 
> ======
> <form method="post" action="/foobar">
>    <meta property="dc:description" content="Login Form" />
> 
>    <input type="text" name="username">
>       <meta property="dc:title" content="username" />
>    </input>
> ...
> 
> </form>
> =======
> 
> With proper annotations, browsers could become much smarter about what
> they do with these forms, as you mention.
> 
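As a sketch of how an agent might read such annotations, the following Python collects the meta children of the form and of each input. The markup repeats the example above with the elided part dropped; the helper name annotations and the dictionary layout are hypothetical, and a real extractor would produce RDF triples rather than plain dictionaries.

```python
import xml.etree.ElementTree as ET

FORM = """
<form method="post" action="/foobar">
  <meta property="dc:description" content="Login Form" />
  <input type="text" name="username">
    <meta property="dc:title" content="username" />
  </input>
</form>
"""


def annotations(element):
    """Map property -> content for meta elements directly under element."""
    return {m.get("property"): m.get("content") for m in element.findall("meta")}


form = ET.fromstring(FORM)
# Annotations attached to the form as a whole.
form_notes = annotations(form)
# Annotations attached to each input field, keyed by the field name.
field_notes = {i.get("name"): annotations(i) for i in form.iter("input")}
```

With this structure a form-filling agent could look up what each field means by its name attribute.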
> 
>> Motivation 2:
>>
>>      The use of content, href, about, id, are ways to specify the
>> subject and the object/value of the rdf statements. I feel that they are
>> somewhat restrictive, especially when the author acknowledges that there
>> is still some unavoidable duplication of content.
>>
>>      To further reduce duplication of URIs and literals, as well as to
>> cater for annotating HTML forms in the future, it would seem a more
>> flexible approach may be possible.
>>
>>      Assuming the subject and object of the rdf statement can be taken
>> from existing XHTML (or XML) element attributes, one can completely
>> avoid duplication by 'referring' to those attributes from another, for
>> example:
>>
>>      . <img src="http://....." attrAsStmtSubject="src">
>>
>>      . Normally the attrAsStmtSubject defaults to "about" and "id"
>>
>>      . <a href="http://....." attrAsStmtObject="href">
>>
>>      . Normally the attrAsStmtObject defaults to "href" and thus could
>> be unspecified.
>>
>>      . Similarly attrAsStmtValue="content", attrAsStmtValue="CDData", or
>> any other attributes/text element.
>>
>>       Although I have not worked out the details, I believe these
>> three new attributes (attrAsStmtSubject and attrAsStmtValue/Object) are
>> compatible with RDF/A concepts, and I believe they will allow forms to
>> be annotated without much content duplication.
> 
> 
> The task force feels that much of the motivation for these changes
> could be addressed without any additional complexity (see form
> annotation above). Certainly, your suggestion would further reduce data
> duplication, but only with significant added complexity in RDF/A.
> Extracting triples would become far more complicated, as the values of
> certain attributes would affect the actual parsing of the rest of the
> document. Thus, at this point, we would not want to adopt this
> recommendation.
> 
> 
> Thanks for some very useful and insightful comments. Please let us know
> if these answers give rise to new questions.
> 
> -Ben Adida
> ben@mit.edu
> 
> [1] http://www.w3.org/2001/sw/BestPractices/HTML/2005-current-issues#src
> 

-- 
Free University Amsterdam, Computer Science
De Boelelaan 1081a, 1081 HV Amsterdam, The Netherlands
Tel: +31 20 598 7739/7718; e-mail: schreiber@cs.vu.nl
Home page: http://www.cs.vu.nl/~guus/
Received on Monday, 23 January 2006 00:32:50 UTC