Comments on draft spec

From: Danny Ayers <danny.ayers@gmail.com>
Date: Sun, 11 Feb 2007 22:01:32 +0100
Message-ID: <1f2ed5cd0702111301xe85449bgc6ca8fdc3de00eb9@mail.gmail.com>
To: "Harry Halpin" <hhalpin@ibiblio.org>
Cc: public-grddl-wg <public-grddl-wg@w3.org>

re. http://www.w3.org/2004/01/rdxh/spec
Revision: 1.206

Overall I think it's pretty close to what's required to go to Last
Call, but there are quite a few not-major editorial changes I'd
suggest (below). One query on a green box, entirely possible I'm
misreading that.

[I've raised this in another email thread] The only significant thing
maybe lacking is informative material specifically on GRDDL's
recursion - maybe a sentence or two added to the end of the
introduction (suggestion inline below) plus a paragraph entitled
something like "Recursion Stopping Conditions".

Cheers,
Danny.

---

*** Abstract
"glean" is used many times without definition, and it isn't clear that
"Resource Descriptions" and "data" are synonymous as used here (and
"Dialects" is a bit confusing too - see below).

A little rewording might help, e.g.
[[
This GRDDL specification introduces markup for declaring that an XML
document includes gleanable data and for linking to algorithms,
typically represented in XSLT, for gleaning the resource descriptions
from the document.
]]
=>
[[
This GRDDL specification introduces markup based on existing standards
for declaring that an XML document includes data compatible with the
Resource Description Framework (RDF) and for linking to algorithms
(typically represented in XSLT), for extracting this data from the
document.
]]

last sentence
[[
A GRDDL Primer demonstrates the mechanism on XHTML documents which
include widely-deployed dialects, more recently known as microformats.
]]
- maybe strike ", more recently"

*** Table of Contents
...
4. The GRDDL profile for XHTML
5. GRDDL for HTML Profiles
...
- could those headings be made a bit more explicit (to sound less
similar)? Seems potentially confusing.

*** 1. Introduction: Data and Documents
I think "dialect" needs a little pinning down somewhere, maybe here:
[[
There are many dialects of languages in practice among the many XML
documents on the web.
]]
=>
[[
There are many domain-specific languages ("dialects") used in practice
among the many XML documents on the web.
]]
(Dunno : are iTunes and Audioscrobbler different XML languages?
Different dialects of the XML language? Different domain languages
expressed in XML? The same domain language expressed in different
dialects?)

The examples could do with a line of introduction, something like:
[[
The following are examples of how the same musical work might be
described in different XML dialects:
]]
and followed by:
[[
Although the examples describe the exact same thing, as it stands
there's no clear mechanism through which computer software might be
able to make this connection.
]]

The RDF/XML example doesn't declare the dc namespace.
Also, might it be preferable to show this as Turtle/N3, to make it
clearer that it isn't just some weird kind of canonical XML
expression? (Or at least add another sentence to stress the point).

@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix dc: <http://purl.org/dc/elements/1.1/> .

<http://musicbrainz.org/mm-2.1/album/6b050dcf-7ab1-456d-9e1b-c3c41c18eed2>
   dc:title "Are You Experienced?";
   foaf:maker <http://musicbrainz.org/mm-2.1/artist/33b3c323-77c2-417c-a5b4-af7e6a111cc9>
.

<http://musicbrainz.org/mm-2.1/artist/33b3c323-77c2-417c-a5b4-af7e6a111cc9>
   a foaf:Agent;
   foaf:name "The Jimi Hendrix Experience" .
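
On the undeclared dc prefix noted above: a minimal Python sketch of why
it matters. A namespace-aware parser rejects a document that uses a
prefix with no declaration in scope. The fragment and the example.org
URI here are illustrative stand-ins, not the spec's actual example.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"

# Illustrative fragment (an assumption, not copied from the spec):
# the dc: prefix is used but never declared.
UNDECLARED = f"""<rdf:RDF xmlns:rdf="{RDF_NS}">
  <rdf:Description rdf:about="http://example.org/album">
    <dc:title>Are You Experienced?</dc:title>
  </rdf:Description>
</rdf:RDF>"""

# Same fragment with the dc namespace declared on the root element.
DECLARED = UNDECLARED.replace("xmlns:rdf=", f'xmlns:dc="{DC_NS}" xmlns:rdf=', 1)


def parses_with_namespaces(xml_text):
    """Return True if a namespace-aware parser accepts the text."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False
```

With the declaration added, the title is reachable under its full
namespace name, which is what an RDF/XML parser would rely on.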

Again there's no intro for the markup, suggestion:
[[
   Here's the information contained in the XML fragments above, this
time expressed as RDF:
]]
outro -
[[
   Both the entities (subject and object resources) and relationships
(predicates) are identified using unambiguous URIs.
]]

[[
GRDDL stands for Gleaning Resource Descriptions from Dialects of
Languages. That is, GRDDL provides a relatively inexpensive mechanism
for bootstrapping RDF content from uniform XML dialects; shifting the
burden from formulating RDF to creating transformation algorithms
specifically for each dialect.
]]
=>
[[
The publishers of the XML above could also provide the same data in
RDF using RDF/XML or one of the other RDF syntaxes. GRDDL provides a
relatively inexpensive mechanism for bootstrapping RDF content from
uniform XML dialects; shifting the burden from formulating RDF to
creating transformation algorithms specifically for each dialect.
]]

*** Faithful Renditions
[[
By specifying a GRDDL transformation, the author of a document states
that the transformation will provide a faithful rendition of the
source document, or some portion of the source document, that
preserves its meaning in RDF.
]]
I think that's potentially misleading, maybe better something like:
[[
By specifying a GRDDL transformation, the author of a document states
that the transformation will provide a faithful rendition in RDF of
information (or some portion of the information) expressed through the
XML dialect used in the source document.
]]

Same again, but given a change in the above, less needs saying:
[[
Likewise, by specifying a GRDDL namespace transformation or profile
transformation, the creator of that namespace or profile states that
the transformation will provide a faithful rendition of a class of
source documents which relate to that namespace or profile, or some
portion of such a source document, that preserves its meaning in RDF.
A namespace document or a profile document also provide a means for
their authors to explain, prosaically, the purpose of the
transformation or any policy statements.
]]
=>
[[
Likewise, by specifying a GRDDL namespace transformation or profile
transformation, the creator of that namespace or profile states that
the transformation will provide a faithful RDF rendition of a class of
source documents which relate to that namespace or profile. A
namespace document or a profile document also provides a means for
its author to explain in prose the purpose of the transformation or
any policy statements.
]]

*** GRDDL Primer
This should maybe be moved to a "Related Documents" block, somewhere near
the start.
[[
...It develops on a number of examples...
]]
strike "on"

*** GRDDL Use Cases
This should maybe be moved to a "Related Documents" block, somewhere near
the start.
[[
The use cases document[usecases] collects a number of use cases
together with their goals and requirements for GRDDL. These use cases
also illustrate how XML and XHTML documents can be decorated with
microformat, Embedded RDF or RDFa statements to support GRDDL
transformations in charge of extracting valuable data that can then be
used to automate a variety of tasks.
]]
=>
[[
This document [usecases] collects a number of use cases together with
their goals and requirements for GRDDL. It also illustrates how XML
and XHTML documents can be decorated with microformat, Embedded RDF or
RDFa statements to support GRDDL transformations to enable the
extraction of valuable data that can then be used to automate a
variety of tasks.
]]

*** GRDDL Specification
- redundant?

!! Maybe add:
[[
*** GRDDL Recursion
The mechanisms of GRDDL which will be described in the next section
enable the interpretation of either an individual XML/HTML document,
or a whole class of documents as RDF. In the latter case the
association between individual documents and the appropriate
transformations is done indirectly through linkage (dereference over
HTTP). The documents specifying which transformations to apply may
themselves need transformation from an XML/HTML dialect to RDF to make
this information available. Hence in the general case recursion
through multiple layers of linked documents may be required to
complete a faithful RDF rendition of a source document. This recursion
should be transparent to end-users of GRDDL-enabled systems, but is
something implementers of GRDDL-enabled systems will need to consider.
]]
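
To make the stopping-condition point concrete, here is a rough Python
sketch of one possible approach: remember which documents have already
been visited and refuse to follow a link back to any of them. This is a
simplification, not the spec's algorithm. The LINKS table is a
hypothetical in-memory stand-in for dereferencing over HTTP, and all
the example.org URIs are made up.

```python
def collect_transformations(doc_uri, links, seen=None):
    """Return transformation URIs reachable from doc_uri.

    `links` maps a document URI to a pair:
    (transformations named directly, further documents to consult).
    Each document is visited at most once, so cycles terminate.
    """
    if seen is None:
        seen = set()
    # Stopping conditions: already visited, or nothing known about the URI.
    if doc_uri in seen or doc_uri not in links:
        return []
    seen.add(doc_uri)
    transformations, referenced = links[doc_uri]
    found = list(transformations)
    for ref in referenced:
        found.extend(collect_transformations(ref, links, seen))
    return found


# Hypothetical link structure; the profile points back at the namespace
# document, forming a cycle.
LINKS = {
    "http://example.org/doc": ([], ["http://example.org/ns"]),
    "http://example.org/ns": (
        ["http://example.org/ns2rdf.xsl"],
        ["http://example.org/profile"],
    ),
    "http://example.org/profile": (
        ["http://example.org/profile2rdf.xsl"],
        ["http://example.org/ns"],
    ),
}
```

Without the `seen` set the cycle between the namespace document and the
profile would recurse indefinitely, which is the kind of case the
suggested paragraph ought to warn implementers about.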


*** 2. Adding GRDDL to well-formed XML
[[
The general form of associating a GRDDL transformation link with a
well-formed XML document is by adding...
]]
strike "by"

[[
...This method is suitable for use with any XML dialects that can
accomodate an extra attribute on the root element.
]]
=>
[[
...This method is suitable for use with any XML dialects that can
accommodate an extra namespace-qualified attribute on the root element.
]]
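
Why "namespace-qualified" matters, as a small Python sketch: an agent
would look the attribute up under the GRDDL namespace name, so an
unqualified attribute with the same local name is not seen. The
attribute is taken to be grddl:transformation holding a
whitespace-separated list of URIs; the album document and the
example.org URIs are invented for illustration.

```python
import xml.etree.ElementTree as ET

GRDDL_NS = "http://www.w3.org/2003/g/data-view#"


def root_transformations(xml_text):
    """Return the transformation URIs named on the root element."""
    root = ET.fromstring(xml_text)
    # ElementTree spells a qualified attribute name as "{namespace}local".
    value = root.get(f"{{{GRDDL_NS}}}transformation", "")
    return value.split()


# Hypothetical source document.
DOC = f"""<album xmlns:grddl="{GRDDL_NS}"
       grddl:transformation="http://example.org/album2rdf.xsl http://example.org/extra.xsl">
  <title>Are You Experienced?</title>
</album>"""
```

A dialect whose schema forbids foreign attributes on the root element
could not carry this attribute, which is the restriction the sentence
is describing.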

"information resource" is used in the second green box - ref. WebArch
Section 2.2.

*** 3. Using GRDDL with XML Namespace Documents

the green box on namespace transformations doesn't seem quite right -
[[
If
    * an information resource NSDOC, identified by a URI NS,
represented by an XML document with root node NODE with a GRDDL result
that includes a triple whose
          o subject is NSDOC, whose
          o predicate is the property
<http://www.w3.org/2003/g/data-view#namespaceTransformation>, and
whose
          o object is TX,
    * and an information resource IR has an XML representation whose
root element's namespace name is NS,

then TX is a GRDDL transformation of NODE.
]]
Presumably this needs a "there exists" inserted, but aside from that -
second clause:
"and an information resource IR has an XML representation whose root
element's namespace name is NS"
what has this to do with the right-hand side of the implication?
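
For comparison, a Python sketch of one possible reading, on the
assumption (mine, not confirmed by the draft) that the conclusion was
meant to be about IR rather than NODE. Under that reading the second
clause is what ties the namespace document's triple to the document
being processed. The grddl_results table is a hypothetical stand-in for
"the GRDDL result of the namespace document", and the example.org URIs
are made up.

```python
NS_TRANSFORMATION = "http://www.w3.org/2003/g/data-view#namespaceTransformation"


def namespace_transformations(root_namespace, grddl_results):
    """Return every TX that would apply to a document whose root
    element's namespace name is root_namespace.

    grddl_results maps a namespace URI to the set of (subject,
    predicate, object) triples gleaned from its namespace document.
    """
    triples = grddl_results.get(root_namespace, set())
    return sorted(
        obj
        for (subj, pred, obj) in triples
        if subj == root_namespace and pred == NS_TRANSFORMATION
    )


# Hypothetical data: one namespace document advertising one transformation.
RESULTS = {
    "http://example.org/ns": {
        ("http://example.org/ns", NS_TRANSFORMATION, "http://example.org/ns2rdf.xsl"),
        ("http://example.org/ns", "http://purl.org/dc/elements/1.1/title", "Example NS"),
    }
}
```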

*** 4. Using GRDDL with valid XHTML

I'd be tempted to shift the "Stated more formally:" green boxes past
the first example - the informal paragraph is pretty confusing itself.

The second & third green boxes - defining "profile" & "typed link" -
appear to be formalisations of part of a spec cited normatively, which
seems back-to-front - is a comment needed? (HTML4 is pretty loose on
those points ;-)
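
A short example might also do more than the informal paragraph. As a
rough Python sketch of the mechanism as I understand it: the head
element's profile attribute must include the GRDDL profile URI, and
only then do link or a elements with rel="transformation" count. The
sample page and example.org URI are invented, and relative-URI
resolution is left out.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
GRDDL_PROFILE = "http://www.w3.org/2003/g/data-view"


def xhtml_transformations(xhtml_text):
    """Return href values of rel="transformation" links, but only if
    the document declares the GRDDL profile on its head element."""
    root = ET.fromstring(xhtml_text)
    head = root.find(f"{{{XHTML_NS}}}head")
    if head is None or GRDDL_PROFILE not in head.get("profile", "").split():
        return []
    link_tags = (f"{{{XHTML_NS}}}link", f"{{{XHTML_NS}}}a")
    hrefs = []
    for el in root.iter():
        # rel is a space-separated list of link types.
        if el.tag in link_tags and "transformation" in el.get("rel", "").split():
            if el.get("href"):
                hrefs.append(el.get("href"))
    return hrefs


# Hypothetical page.
PAGE = f"""<html xmlns="{XHTML_NS}">
  <head profile="{GRDDL_PROFILE}">
    <title>Example</title>
    <link rel="transformation" href="http://example.org/page2rdf.xsl"/>
  </head>
  <body><p>Hello</p></body>
</html>"""
```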

*** 5. GRDDL for HTML Profiles

(seems ok apart from an @@ already marked)

*** 6. Transformation Algorithms
[[
Developers of transformations should make available representations in
widely-supported formats.
]]
Doesn't sound right. Not sure why...

[[
While javascript, C, or any other programming language technically
expresses the relevant information, XSLT is specifically designed to
express XML to XML transformations and has some good safety
characteristics.
]]
=>
[[
While technically Javascript, C, or virtually any other programming
language could be used to express transformations for GRDDL, XSLT is
specifically designed to express XML to XML transformations and has
some good safety characteristics.
]]

If this is true, it should be bolder!
[[
When an information resource is represented by an XML document, the
corresponding XPath data model is somewhat under-determined, depending
on, for example, whether an agent elaborates inclusions, parameter
entities, fixed and default attributes, or checks digital signatures.
]]
=>
[[
When an information resource is represented by an XML document, the
corresponding XPath data model may not be fully determined, depending
on, for example, whether an agent elaborates inclusions, parameter
entities, fixed and default attributes, or checks digital signatures.
]]
This, and the material that follows "Put another way...", tends toward
the discursive/informative and should maybe be broken out into a
separate section, following the more normative material.

*** 7. Security considerations

Seems ok, except the caching section isn't really relevant - break out
to an appendix: "Notes for Implementors"??

*** 7. (should be 8.) The GRDDL Vocabulary

There's already an @@, looks like it's on DanC's plate - I assume an
RDF/XML version will follow...

8. There is no 8.

*** 9. References

Can this section be moved /after/ the appendices? I nearly overlooked
the content that follows it.

[[
Parts of the following specifications are include in this one by reference:
]]

- strike (or find better boilerplate)

*** Appendix: A GRDDL-aware Agent protocol trace

I like it ;-)

Could a line be added saying which tools were used to create the trace?
(/me wonders if DanC favours telnet as a browser)

*** Appendix: Transformations for Styling versus data extraction

Can't say I like this as it stands in this position, but can't think
how to improve.

*** Appendix: Issues

-

*** Appendix: Implementation Experience: Test Cases, Software, and Services

Maybe move this together with any other related material into a "Notes
for Implementors" appendix.

*** Acknowledgements and Change History

-


-- 

http://dannyayers.com
Received on Sunday, 11 February 2007 21:01:39 GMT
