Re: review comments on rdf-syntax-grammar (version of 25 Mar) from Dave Beckett on 2002-03-24 (w3c-rdfcore-wg@w3.org from March 2002)

From: Dave Beckett <dave.beckett@bristol.ac.uk>
Date: Sun, 24 Mar 2002 13:33:31 +0000
To: Dan Brickley <danbri@w3.org>
cc: w3c-rdfcore-wg <w3c-rdfcore-wg@w3.org>
Message-ID: <10069.1016976811@tatooine.ilrt.bris.ac.uk>
>>>Dan Brickley said:
> Aside: I'm sending this from a Java SSH applet in a hotel, so apologies
> for the lack of detailed URIs into the document version I'm reviewing.
> 
> This is mostly terminology/wording. In a few cases I succeed in proposing
> specific edits, in others I don't (yet?) have suggested improvement text.

Not sure whether I can say I'll accept all these changes, given that
I've been given the go-ahead to publish based on the version I
delivered last Wed.

> See (6) below for the only new idea and the only major concern about the
> document's technical content. I don't consider either to be a showstopper
> issue, but busy readers might skim directly to (6) for further discussion.
> 
> 
> I am looking at http://ilrt.org/discovery/2001/07/rdf-syntax-grammar/ on
> Sunday March 24th. The WD-draft is currently dated 25 March; no CVS
> version number obviously visible (Dave, could you add $Id$ to the <h2/>
> while drafts are in preparation? (perhaps with a ptr to the ILRT cvsweb
> interface so we can see log comments, compare versions etc?).

Sigh.  All of those things are normally on the document, except when
it is in the final stages of publication, i.e. now.  However, the
source shows you were probably looking at CVS V1.231.

The CVS link, which I usually link to and post, is:
  http://cvs.ilrt.org/cvsweb/redland/rdfcore/syntax/index.html

I say probably since I last changed the document on Friday when I
left work and went to the pub -- with you :)


> Comments:
> --------
> 
> I took a brief look at this in the week, while planning the RDFS
> reshuffle. I stick by my initial reaction: very nice work :)
> 
> Some detailed comments:
> 
> 1) Wordsmithing the Abstract (and terminology issues)

The abstract and status were what I was editing before I left; in
particular I had to sort out what went in each section.

> 
> In the abstract, we say
> 
>  'This W3c Working Draft introduces and defines the XML syntax...'
> 
> I would leave document status / maturity info to the 'SOTD' section.
> Instead we could say
> 
>  'This specification defines an XML syntax for RDF, as amended (etc)'

The standard XML syntax for RDF.  This is more than a particular XML
syntax that the group invented, it is the one that people should use
for interoperability and interchange and the one the WG is likely to
RECommend.

>  - dropping 'introduces' as this contradicts the claim that M+S'99
> introduced this syntax

Will change.

Maybe I should use the words 'refactor' and 'revise' more.

 
>  - s/the XML syntax/an XML syntax/
>  (we can't say often enough that RDF can be written in XML in various
> ways; referencing the Cambridge Communique, WebData and XLink-harvesting
> notes might help drum that point home..). Later in the doc it says 'An',
> not 'The', so this is a consistency fix.

'an' OK but something stronger needed too.

>  - s/creating RDF models/creating RDF graphs/
>  We're going to have to be careful with inter-spec terminology. Your
>  language here seems to have been largely obsoleted by the MT, which uses
>  'model' in the logician's sense, rather than the two(!) previous senses in
>  which RDF employed the term.
>  We *used* to use 'model' as a noun corresponding roughly to 'RDF
> description of' (as in the practice of modeling, eg. a lego model).
>  We *also* used it to refer to RDF's basic architecture, the 'graph data
>  model', ie. triples as an encoding for 'statements' about the properties
>  of resources.
> 
> In short, various places you say 'model' might be safer as 'graph'

OK - will change.


> Aside: we have a similar problem with the MT and the Schema specs at
> least; they're in conflict about the use of the term 'vocabulary'. I plan
> to suggest the MT adopt a less useful word and free up 'vocabulary' for
> talking classes, properties etc. per current Schema usage. @@TODO
> 
> 
> 
> 2) Section (2), An XML syntax...
> 
> The 3rd paragraph tells us about rdf:Description. IMHO rdf:Description is
> a redundant historical relic, since we can always write rdfs:Resource
> there instead and have a typed node instead of mere encoding syntax.
> This is perhaps a matter of taste, and not a showstopper for publication.
> If we could introduce rdf:Description as a deviation from the general
> typed node / striped pattern, things might be a little easier for
> implementors. The rdf:Description construct adds nothing to the language
> except another syntactic variation. I wonder whether we could more
> explicitly encourage the use of typed nodes (minimal case: rdfs:Resource).
> Or would this be considered a fwd reference to the Schema spec and hence
> problematic? Perhaps not, if both (a) we get them to REC at same time, (b)
> we decide -- say in PR period -- that implementors like the new specs and
> would much prefer a single consolidated 'rdfcore' namespace for
> core classes and properties.

I see rdf:Description as the main way we explain a graph node to
people in syntax:
   [Node] -arc-> [Node]
which can't be done with typedNodes without introducing more rdf:type
arcs.  Typed nodes are an abbreviated form of rdf:Description, which
is neither a deviation nor a relic.

Recommending rdfs:Resource there instead isn't going to happen.  I've
never seen it used except in schema term definitions; rdf:Description
is used for that purpose.  Plus the

  <rdfs:Resource />

form always generates one more triple than

  <rdf:Description />

and the one it adds isn't necessary, since all nodes are of type resource.
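
To make the triple count concrete, here is a rough Python sketch. The
function name node_element_triples is made up for illustration and does
not come from any parser; it only models an empty node element with no
properties.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDF_DESCRIPTION = "http://www.w3.org/1999/02/22-rdf-syntax-ns#Description"
RDFS_RESOURCE = "http://www.w3.org/2000/01/rdf-schema#Resource"


def node_element_triples(element_uri, subject):
    """Triples produced by an empty node element describing `subject`."""
    if element_uri == RDF_DESCRIPTION:
        # rdf:Description is pure syntax: it adds no rdf:type arc.
        return []
    # Any other node element is a typed node: exactly one rdf:type arc.
    return [(subject, RDF_TYPE, element_uri)]


plain = node_element_triples(RDF_DESCRIPTION, "_:n1")
typed = node_element_triples(RDFS_RESOURCE, "_:n1")
print(len(plain), len(typed))
```

The extra triple in the typed case is the rdf:type arc to rdfs:Resource.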


Plus this adds the first dependency in this doc to RDF Schema as you
point out.

In summary: I'm not changing anything here.

> 
> Syntax / striping examples:
> Using CSS to style the property elements and the node elements differently
> might be useful.

Good idea, will do.


> 3) in (3) the RDF namespace
> 
> ...you list the classes, properties and syntactic gizmos from the rdf:
> namespace. Would it be worth separating them out, or at least separating
> the purely syntactic constructs?

It would require more explaining to introduce what the split meant,
so I'm leaving that for now.

I could do a syntax/non-syntax split for this WD since it turns up
later on in the grammar.


> Your 'implementors note' regarding the removal of names from the language
> (and namespace) may come across as rather casual.  Presumably implementors
> in this sense include content authors and tool authors. If we are telling
> them that docs they wrote to the '99 W3C REC are no longer legal RDF, we
> should be a little more cautious/humble.
> 
> Suggest: 'In this Working Draft, the RDF Core WG propose the removal of
> xyz from the RDF namespace. Feedback from tool and content authors is
> particularly sought on this point, and on the costs and benefits of
> adopting a new namespace URI to reflect this change. In the current
> draft, we simply omit these constructs from our account of the RDF
> namespace.'.

OK.  But we also asked for feedback about this in the last WD.
Nothing wrong with continuing that.  In future it might be good to
point to a section explaining ways to use syntax transforming
techniques to handle these things using XSLT.  I hope somebody else
writes that, or we can borrow from some existing XSLT work. (This is
Brian's suggestion).


> (hmmm, will we edit the RDF doc that's available at the ns URI?)

That would be a W3C process question.

> 
> (sincere apologies if I am re-visiting old territory here. I am doing my
> best  to catch up with RDF Core work, but I have missed fair chunks of
> discussion.)
> 
> 
> 4) in (3.5, Identifiers)
> 
> First sentence doesn't quite work. '3 types of identifiers (or labels)',
> ie. absolute urirefs, literals, unlabelled/blank nodes.
> 
> Suggest:
> RDF graphs are structured using three mechanisms for representing the
> identity of resources and literal values. These are: <mumble>
> 
> OK, I can't think of better wording so withdraw my quibble. However
> unlabelled/blank nodes don't seem to be a 'type of identifier or label',
> more an 'identification mechanism'. Tricky to find words for this. Maybe
> the MT spec has something?

I reworded it several times and it is still too long.  I might give
up and go for bullets, which interrupt the flow but might make it clearer.


> 5) the word 'reify' is introduced deep in section 5.5 without explanation.
> Some gloss along lines of 'describing using RDF' might work.
> I have no concrete suggestions here.

I've nothing I can point to; it isn't in any other WD.  I'm not going
to explain reification in the syntax doc!  (Apart from take 1 triple,
and generate 3 more).


> 6) Serializing to RDF/XML

Substantially borrowed from some words by Jeremy. The errors are
probably mine.

> 
> Style: second sentence ('if you do...') is very casual, chatty, but
> carries a very important technical point.  Q: can we take the notion of
> 'round trip' as a concept everyone will understand, or is it too
> colloquial for a W3C spec? (we might seek input from I18N folk on this).
> Suggested alternate wording: <mumble> we might talk about 'information
> loss'.

No idea.  We don't guarantee roundtripping in RDF/XML, and bare XML
doesn't either, so this is new ground that we haven't considered or
required IMHO.

 
> Substantial concern:

I'm not making any changes to this section for this version of the WD.

We haven't really spent much time in the WG discussing this part of
the syntax work; turning models^Wgraphs into RDF/XML.  Jeremy has
thought about it most in his Unparsing paper, which is cited.

> 
> Basic serialization strategy: 'all blank nodes are assigned arbitrary URIs'.
> 
>  - on my understanding of our decisions w.r.t. blank nodes, this would
> result in information loss. Just as we decided not to autogenerate uriref
> node labels for bNodes in the graph, we need to preserve that (lack of)
> information on re-serialization. DanC recently circulated a good brief
> explanation of this (@@todo, find msg).
> 
> I'm not sure what to suggest as an immediate fix. We might add:
> 
> 'editorial note: this serialization strategy loses information,
> specifically it destroys any trace of the distinction between bNodes and
> uriref-labelled nodes in any RDF graph serialized to XML/RDF following
> these rules. Future revisions of this specification may provide revised
> rules that preserve the bNode / labeled node distinction'.

There is already a warning on information loss.

I don't like to predict the future.  If we record that we recognise
this as an issue that can't be fixed now, that is as much as we should do.
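
For the record, the loss Dan describes can be sketched in a few lines of
Python. Everything here is an assumption for illustration: triples are
plain tuples of strings, blank nodes are written "_:label", and the
http://example.org/genid/ prefix is invented.

```python
import itertools


def is_blank(term):
    # Assumption for this sketch: blank nodes are written "_:label".
    return term.startswith("_:")


def assign_uris(triples, prefix="http://example.org/genid/"):
    """Replace every blank node with an arbitrary generated URI."""
    counter = itertools.count(1)
    mapping = {}

    def rename(term):
        if not is_blank(term):
            return term
        if term not in mapping:
            mapping[term] = prefix + str(next(counter))
        return mapping[term]

    return [(rename(s), p, rename(o)) for s, p, o in triples]


graph = [
    ("_:a", "http://example.org/name", '"Dan"'),
    ("http://example.org/doc", "http://example.org/author", "_:a"),
]
flattened = assign_uris(graph)
```

After assign_uris, nothing in the output says which nodes were blank in
the original graph, so a later parser cannot restore the distinction.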

I think the group already knows that if it had been allowed to invent
a new syntax, I wouldn't be starting from here.

> also in the serialization section....
> 
> My one (hopefully) new idea:
> 
> I've been thinking about strategies for dealing with 'unserializable'
> graphs. These are (AFAIK) all (or nearly all?? would be good to be clear)
> to do with having the edges in some RDF graph be labelled with a URIref
> that doesn't split conveniently into  namespace name and local name.
> 
> (i) (as an aside) we should note that the rdfs:isDefinedBy property can be
> used to represent in RDF the relationship between a property and its 'home'
> namespace. When available, this information should be used in preference
> to  URI-syntax-scraping.

In that case, we must also strongly advise people to use it in
defining RDF properties etc. and relating them to the namespace URI.

> 
> (ii) When a property URIref is not suitably serializable, an alternative
> to exception/error throwing that MAY often be useful is for the RDF/XML
> parser to describe problem triples in terms of a previously unknown
> sub-property of the annoyingly named Property:
> 
> Example:
> 
> <http://www.w3.org/TR/soap12-part1/>  <http://example.com/annoying-ns!>  "Etc." .
> 
> Since our predicate ends in '!', we can't serialize this as-is.
> 
> However we could emit something like:
> <http://www.w3.org/TR/soap12-part1/>  <uuid:123321123321123321/genprop>   "Etc." .
> 
> ...accompanied by adjunct triples describing how the property used is a
> sub-property of the annoying one:
> 
>  <uuid:123321123321123321/genprop>  <http://www.w3.org/2000/01/rdf-schema#subPropertyOf> <http://example.com/annoying-ns!>   .
> 
> In RDF/XML this would be
> 
> <rdf:Property rdf:about="uuid:123321123321123321/genprop">
> <rdfs:subPropertyOf rdf:resource="http://example.com/annoying-ns!"/>
> </rdf:Property>
> <rdf:Description rdf:about="http://www.w3.org/TR/soap12-part1/">
>  <g1:genprop xmlns:g1="uuid:123321123321123321/">Etc.</g1:genprop>
> </rdf:Description>
> 
> Notes:
> 
>  - this is something of a hack, and perhaps belongs in a Developers  'hints
>    and tips' FAQ rather than the core Syntax spec.

Yes.  In particular 'uuids'

Plus it requires a schema aware system to understand it.
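
A minimal Python sketch of Dan's workaround, under stated assumptions:
the local-name test is an ASCII-only approximation of an XML NCName, the
function names split_uriref and make_serializable are invented, and the
uuid: namespace is just the placeholder from Dan's example.

```python
import re

RDFS_SUBPROPERTYOF = "http://www.w3.org/2000/01/rdf-schema#subPropertyOf"
# ASCII-only approximation of an XML NCName at the end of the URIref.
_LOCAL_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9._-]*$")


def split_uriref(uri):
    """Return (namespace, local_name), or None if no local name can be cut."""
    match = _LOCAL_NAME.search(uri)
    if match is None:
        return None
    return uri[:match.start()], match.group(0)


def make_serializable(triple, gen_ns="uuid:123321123321123321/"):
    """Swap an unsplittable predicate for a generated sub-property."""
    s, p, o = triple
    if split_uriref(p) is not None:
        return [triple]  # already writable as a property element
    generated = gen_ns + "genprop"
    return [
        (generated, RDFS_SUBPROPERTYOF, p),  # adjunct triple
        (s, generated, o),                   # rewritten original
    ]


result = make_serializable((
    "http://www.w3.org/TR/soap12-part1/",
    "http://example.com/annoying-ns!",
    '"Etc."',
))
```

The output has one more triple than the input, and a reader only
recovers the original predicate if it understands rdfs:subPropertyOf.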


>  - it does suggest that all RDF graphs may be XML-serializable, but at the
>    cost that the generated XML may contain triples that weren't in the
>    original data
>  - it provides no guidance on generation of URIrefs for the generated
>    properties
>  - it leaves no hint in the output RDF/XML that this augmentation happened
> (ie. subsequent parsers might want to reverse the process. Using a
> different property than rdfs:subPropertyOf would be one complication to
> consider. But I'm not sure complication is what we need right now.)
> 
> Suggestion:
>  - add this technique to the Primer or RDF FAQ and put a reference in the
> syntax spec to give folk some option other than warning/exception
> throwing.

I'm sure Frank wouldn't want the above stuff in the Primer, since it is
very detailed, syntax-related material and needs more work before it is
in a state for giving out as general advice.

We decided at the F2F meeting to throw the warning; that is why that
text is there.

Dave
Received on Sunday, 24 March 2002 08:33:34 UTC