- From: Dan Brickley <danbri@w3.org>
- Date: Sun, 24 Mar 2002 07:11:49 -0500 (EST)
- To: <dave.beckett@bris.ac.uk>, <w3c-rdfcore-wg@w3.org>
Aside: I'm sending this from a Java SSH applet in a hotel, so apologies for the lack of detailed URIs into the document version I'm reviewing. This is mostly terminology/wording. In a few cases I succeed in proposing specific edits, in others I don't (yet?) have suggested improvement text. See (6) below for the only new idea and the only major concern about the document's technical content. I don't consider either to be a showstopper issue, but busy readers might skim directly to (6) for further discussion. I am looking at http://ilrt.org/discovery/2001/07/rdf-syntax-grammar/ on Sunday March 24th. The WD-draft is currently dated 25 March; no CVS version number obviously visible (Dave, could you add $Id$ to the <h2/> while drafts are in preparation? (perhaps with a ptr to the ILRT cvsweb interface so we can see log comments, compare versions etc?). Comments: -------- I took a brief look at this in the week, while planning the RDFS reshuffle. I stick by my initial reaction: very nice work :) Some detailed comments: 1) Wordsmithing the Abstract (and terminology issues) In the abstract, we say 'This W3c Working Draft introduces and defines the XML syntax...' I would leave document status / maturity info to the 'SOTD' section. Instead we could say 'This specification defines an XML syntax for RDF, as amended (etc)' - dropping 'introduces' as this contradicts the claim that M+S'99 introduced this syntax - s/the XML syntax/an XML syntax/ (we can't say often enough that RDF can be written in XML in various ways; referencing the Cambridge Communique, WebData and XLink-harvesting notes might help drum that point home..). Later in the doc it says 'An', not 'The', so this is a consistency fix. - s/creating RDF models/creating RDF graphs/ We're going to have to be careful with inter-spec terminology. Your language here seems to have been largely obsoleted by the MT, which uses 'model' in the logicians sense, rather than the two(!) previous senses in which RDF employed the term. We *used* to use 'model' as a noun corresponding roughly to 'RDF description of' (as in the practice of modeling, eg. a lego model). We *also* used it to refer to RDF's basic architecture, the 'graph data model', ie. triples as an encoding for 'statements' about the properties of resources. In short, various places you say 'model' might be safer as 'graph' Aside: we have a similar problem with the MT and the Schema specs at least; they're in conflict about the use of the term 'vocabulary'. I plan to suggest the MT adopt a less useful word and free up 'vocabulary' for talking classes, properties etc. per current Schema usage. @@TODO 2) Section (2), An XML syntax... The 3rd paragraph tells us about rdf:Description. IMHO rdf:Description is a redundant historical relic, since we can always write rdfs:Resource there instead and have a typed node instead of mere encoding syntax. This is perhaps a matter of taste, and not a showstopper for publication. If we could introduce rdf:Description as a deviation from the general typed node / striped pattern, things might be a little easier for implementors. The rdf:Description construct adds nothing to the language except another syntactic variation. I wonder whether we could more explicitly encourage the use of typed nodes (minimal case: rdfs:Resource). Or would this be considered a fwd reference to the Schema spec and hence problematic? Perhaps not, if both (a) we get them to REC at same time, (b) we decide -- say in PR period -- that implementors like the new specs and would much prefer a single consolidated 'rdfcore' namespace for core classes and properties. Syntax / striping examples: Using CSS to style the property elements and the node elements differently might be useful. 3) in (3) the RDF namespace ...you list the classes, properties and syntactic gizmos from the rdf: namespace. Would it be worth separating them out, or at least separating the purely syntactic constructs? Your 'implementors note' regarding the removal of names from the language (and namespace) may come across as rather casual. Presumably implementors in this sense include content authors and tool authors. If we are telling them that docs they wrote to the '99 W3C REC are no longer legal RDF, we should be a little more cautious/humble. Suggest: 'In this Working Draft, the RDF Core WG propose the removal of xyz from the RDF namespace. Feedback from tool and content authors is particularly sought on this point, and on the costs and benefits of adopting a new namespace URI to reflect this change. In the current draft, we simply omit these constructs from our account of the RDF namespace.'. (hmmm, will we edit the RDF doc that's available at the ns URI?) (sincere apologies if I am re-visiting old territory here. I am doing my best to catch up with RDF Core work, but I have missed fair chunks of discussion.) 4) in (3.5, Identifiers) First sentence doesn't quite work. '3 types of identifiers (or labels)', ie. absolute urirefs, literals, unlabelled/blank nodes. Suggest: RDF graphs are structured using three mechanisms for representing the identity of resources and literal values. These are: <mumble> OK, I can't think of better wording so withdraw my quibble. However unlabelled/blank nodes don't seem to be a 'type of identifier or label', more an 'identification mechanism'. Tricky to find words for this. Maybe the MT spec has something? 5) the word 'reify' is introduced deep in section 5.5 without explanation. Some gloss along lines of 'describing using RDF' might work. I have no concrete suggestions here. 6) Serializing to RDF/XML Style: second sentence ('if you do...') is very casual, chatty, but carries a very important technical point. Q: can we take the notion of 'round trip' as a concept everyone will understand, or is it too colloquial for a W3C spec? (we might seek unput from I18N folk on this). Suggested alternate wording: <mumble> we might talk about 'information loss'. Substantial concern: Basic serialization strategy: 'all blank nodes are assigned arbitrary URIs'. - on my understanding of our decisions w.r.t. blank nodes, this would result in information loss. Just as we decided not to autogenerate uriref node labels for bNodes in the graph, we we need to preserve that (lack of) information on re-serialization. DanC recently circulated a good brief explanation of this (@@todo, find msg). I'm not sure what to suggest as an immediate fix. We might add: 'editorial note: this serialization strategy loses information, specifically it destroys any trace of the distinction between bNodes and uriref-labelled nodes in any RDF graph serialized to XML/RDF following these rules. Future revisions of this specification may provide revised rules that preserve the bNode / labeled node distinction'. also in the serialization section.... My one (hopefully) new idea: I've been thinking about strategies for dealing with 'unserializable' graphs. These are (AFAIK) all (or nearly all?? would be good to be clear) to do with having the edges in some RDF graph be labelled with a URIref that doesn't split conveniently into namespace name and local name. (i) (as an aside) we should note that the rdfs:isDefinedBy property can be used to represent in RDF the relationship between a property and its 'home' namespace. When available, this information should be used in preference to URI-syntax-scraping. (ii) When a property URIref is not suitably serializable, an alternative to exception/error throwing that MAY often be useful is for the RDF/XML parser to describe problem triples in terms of a previously unknown sub-property of the annoyingly named Property: Example: <http://www.w3.org/TR/soap12-part1/> <http://example.com/annoying-ns!> "Etc." . Since our predicate ends in '!', we can't serialize this as-is. However we could emit something like: <http://www.w3.org/TR/soap12-part1/> <uuid:123321123321123321/genprop> "Etc." . ...accompanied by adjunct triples describing how the property used is a sub-property of the annoying one: <uuid:123321123321123321/ns1> <http://www.w3.org/2000/01/rdf-schema#subPropertyOf> <http://example.com/annoying-ns!> . In RDF/XML this would be <rdf:Property rdf:about="uuid:123321123321123321/genprop"> <rdfs:subPropertyOf rdf:resource="http://example.com/annoying-ns!"/> </rdf:Property> <rdf:Description rdf:about="http://www.w3.org/TR/soap12-part1/"> <g1:genprop xmlns:g1="uuid:123321123321123321/">Etc.</g1:genprop> </rdf:Description> Notes: - this is something of a hack, and perhaps belongs in a Developers 'hints and tips' FAQ rather than the core Syntax spec. - it does suggest that all RDF graphs may be XML-serializable, but at the cost that the generated XML may contain triples that weren't in the original data - it provides no guidance on generation of URIrefs for the generated properties - it leaves no hint in the output RDF/XML that this augmentation happened (ie. subsequent parsers might want to reverse the process. Using a different property than rdfs:subPropertyOf would be one complication to consider. But I'm not sure complication is what we need right now.) Suggestion: - add this technique to the Primer or RDF FAQ and put a reference in the syntax spec to give folk some option other than warning/exception throwing.
Received on Sunday, 24 March 2002 07:11:51 UTC