- From: Guus Schreiber <guus.schreiber@vu.nl>
- Date: Wed, 27 Nov 2013 16:23:14 +0100
- To: Pat Hayes <phayes@ihmc.us>
- CC: Yves Raimond <Yves.Raimond@bbc.co.uk>, RDF WG <public-rdf-wg@w3.org>
Pat,

Here is the first set of responses to your comments. The responses concern your comments on Secs. 1-5. Secs. 6+ to follow.

Guus

> First pass of major howlers, I will get back with more details and suggestions for
> replacements later.
>
> Pat
> ----------
>
> First para. The examples are very atypical and misleading. RDF does not do
> times well, and it is not mostly used for annotating Web pages or videos, and
> 'resources' does not mean just Webbish things. Might be better to use some
> DBpedia examples right off the bat, and talk explicitly about *data* rather than
> annotation.

Hmm. We were intending to include some “data” examples further on in the document (in the RDF Data section). But I’m surprised you consider annotation to be atypical. Of course, Yves and I are a bit biased (due to our RDF work on music, TV, musea, archives, libraries). But there are lots of RDF annotations out there. And isn’t most of DBpedia in fact annotation? We take the examples of a well-known person and painting to get the intuition about RDF across. I would like to discuss this a bit more before changing.

> 3.1. It is misleading to describe the <subject> as being what the statement is
> "about". A triple is as much "about" its object as it is about its subject. (BTW, this
> bad idea was one of the drivers behind the design of RDF/XML, which gives you
> some idea of what is bad about it :-)

OK, point taken. BTW Turtle does the same thing in the shorthand notation. However, I’m not sure this is a subtlety the Primer should care about. If we say that a sentence has a “subject”, we as humans mean that the sentence is “about” the subject, don’t we? Of course, it also says things about the object (and about the predicate). It may confuse people if we don’t use “subject” in the usual way.

> Also, this may be rather pedantic, but the S P O terminology refers to the parts of
> the triple, ie it is RDF 'grammar', rather than what these IRIs refer to.
> So the predicate (is an IRI which) refers to the property (which is a real thing in the
> world that relates other things to one another.) I don't mean to suggest putting
> this into the primer, but it might be good to keep it in mind and use the
> terminology consistently throughout. The usage of 'sentences' versus 'facts'
> might be useful here.(?)

Fully agree. Actually, I think we’ve tried to make this distinction throughout the rest of the document, but you’re right, it is not here. We maybe should say:

RDF Fact: resource property resource
RDF Sentence: subject predicate object

but it is actually traditional for introductory RDF texts to say only the latter. Hmm, I suggest we discuss this a bit more before changing.

> The terminology of "feature" for the property is not standard and not particularly
> helpful to the reader.

Right. I suggest to simply say “property”.

> Why do you say that the subject IS what the triple is about, whereas the object
> REPRESENTS the value of the feature? This looks like a use/mention confusion.
> I would suggest avoiding the word "value" altogether, as it seems to generate
> confusion wherever it appears.

Now rephrased as:

[[
The subject represents the resource we like to make a statement about. The predicate represents a property of the subject. The object represents the value of the property for this subject. Because RDF statements consist of three elements they are called triples.
]]

I take your point about “value”, but I am at a loss for another term. I don’t think that “value” creates much confusion here. The term “property value” is very common.

> The example <This video document> is misleading: RDF has nothing like the
> "this document" construction. In fact, all the examples in example 1 are
> misleading as they suggest that RDF uses English prose fragments rather than
> IRIs.

Well, the purpose here was to use English prose. I wouldn’t like to get rid of that. But the wording of the video example is misleading.
Suggest to rephrase as Video xyz, or maybe better, BBC program xyz.

> The last sentence about the "three basic constructs" is therefore puzzling
> as it comes without any explanation or introduction.

Indeed. Suggest to delete this paragraph. In the sections themselves it becomes clear enough. As an aside: would it be useful to include a summary like “subject => IRI or blank node; predicate => IRI; object => all three” somewhere?

> It is misleading to use the phrase "anonymous resources" when talking about
> blank nodes. This phrasing suggests that IRIs and bnodes denote different
> *kinds* of resource, which is misleading. (It is like saying that the pronoun
> "someone" refers to a nameless person, a distinct category from people with
> names.)

Right, although I’m not sure many people will be misled by the term “anonymous”. Suggested rephrasing:

[[
In addition, it is sometimes handy to be able to talk about resources which have no identifier. For example, we might want to state that the Mona Lisa painting has in its background an unidentified tree which we know to be a cypress tree. Resources such as the unidentified cypress tree are called "blank nodes" in RDF.
]]

> Your Mona Lisa example is strange as there is an obvious name to use there,
> and (not surprisingly) a dbpedia IRI:
> http://dbpedia.org/resource/Leonardo_da_Vinci. A more plausible example of
> bnode use might be saying that the Mona Lisa has in its background an X and X
> is a Cypress tree. That is the kind of information that makes it genuinely
> implausible to assume that there is an IRI for the value, and it is the kind of thing
> that one might well want to record in for example a museum guide:
>
> http://dbpedia.org/resource/Mona_Lisa http://purl.org/net/lio#shows _:x .
> _:x http://www.w3.org/1999/02/22-rdf-syntax-ns#type
> http://dbpedia.org/resource/Cypress .

See rephrasing above. We should consider including the Cypress tree example in the overall example in the Syntax section.
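
The position summary suggested above (subject: IRI or blank node; predicate: IRI; object: any of the three) and the cypress-tree example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the classes `IRI`, `BlankNode` and `Literal` and the function `is_valid_triple` are hypothetical names invented here, not part of any RDF library or of the Primer.

```python
from dataclasses import dataclass


# Hypothetical term classes for illustration only (not a real RDF API).
@dataclass(frozen=True)
class IRI:
    value: str


@dataclass(frozen=True)
class BlankNode:
    label: str


@dataclass(frozen=True)
class Literal:
    lexical: str


def is_valid_triple(s, p, o) -> bool:
    """Check the position rule: subject is an IRI or blank node,
    predicate is an IRI, object is an IRI, blank node or literal."""
    return (
        isinstance(s, (IRI, BlankNode))
        and isinstance(p, IRI)
        and isinstance(o, (IRI, BlankNode, Literal))
    )


RDF_TYPE = IRI("http://www.w3.org/1999/02/22-rdf-syntax-ns#type")

# The unidentified cypress tree: a blank node shared by two triples.
tree = BlankNode("x")
cypress_graph = [
    (IRI("http://dbpedia.org/resource/Mona_Lisa"),
     IRI("http://purl.org/net/lio#shows"),
     tree),
    (tree, RDF_TYPE, IRI("http://dbpedia.org/resource/Cypress")),
]

all_valid = all(is_valid_triple(*t) for t in cypress_graph)

# A literal in subject position violates the rule.
literal_subject_ok = is_valid_triple(Literal("Mona Lisa"), RDF_TYPE, tree)
```

Here the same `tree` object appears as the object of the first triple and the subject of the second, which is all a blank node needs to do: connect statements without having an identifier of its own.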
> "It should be noted that many RDF users in practice don't use blank nodes." No,
> it should not be noted. A recent scan found that over half of published RDF
> graphs use blank nodes, most (all?) OWL/RDF contains blank nodes, etc.. RDF
> could not function without blank nodes, and it is time to forget the brain-dead
> doctrine which says that their use should be avoided. And it is just silly to say in
> a primer that blank nodes make RDF "look complicated". What could be simpler
> than a blank node in a graph?

Well, blank nodes definitely make the normative specifications much harder to read. That is what I wanted, admittedly very poorly, to bring across. Suggest to delete for now.

> "We can then make statements about these two graphs, for example adding
> license and provenance information:
>
> <http://example.com/bob> <is published by> <http://example.org>.
> <http://example.com/bob> <has license>
> <http://creativecommons.org/licenses/by/3.0/>."
>
> <hair tearing> AAAARRRRGGGH</hair tearing> NO WE CAN'T. Or at least, this
> use is NOT SUPPORTED BY RDF with the specs in their current state. That
> 'metadata' use works ONLY when we know that the "identifying" graph IRIs
> denote their graphs, and WE HAVE EXPLICITLY SAID THAT RDF DOES NOT
> ASSUME THIS. A conforming RDF engine would be perfectly conforming if it
> refused to treat those subject IRIs as denoting the graph in these triples. There
> is NOTHING in the RDF specs that say that a general IRI must be taken to
> denote what it conventionally identifies. We do this only for datatype IRIs, and
> even getting that much into the specs was an uphill struggle; and in the case of
> graph labels in a dataset, we explicitly warn people to not expect this to be true
> (because it often isn't.) I know the Primer has to be simple, but please let us not
> put actual lies into it.

I don’t think the Primer says any of this (or certainly doesn’t want to). We can write down these statements, no problem.
Would a rephrase like this help:

[[
We can then write down triples that include the graph names, for example:
<example>
These two triples could be interpreted as license and provenance information of the graph <xyz>.
(And then a note about lack of RDF semantics for this).
]]

> "The original data model assumed that all triples are part of the same (large)
> graph." The RDF data model still assumes this. It is very misleading to suggest
> that datasets are a modification to the basic RDF data model, or even that this
> data model has changed. We considered such changes and rejected them.

I deleted this sentence.

> "The RDF data model provides a way to make statements about (Web)
> resources."
>
> RDF makes statements about resources, i.e. about anything at all. The implied
> qualification in "(Web)" is false and misleading.

Right. Deleted.

> "As we mentioned, this data model does not make any assumptions about what
> these resources stand for."
>
> Resources don't (usually) stand for anything. Did you mean, what IRIs stand for?
> (Another use/mention confusion.)

Oops, right. Corrected.

> ".. a vocabulary description language called RDF-Schema " ?? In what sense is
> RDFS a 'vocabulary description' language? If you must say this, at least explain
> what it is supposed to mean.

It is actually the title of the document (“RDF Vocabulary Description Language 1.0: RDF Schema”). But I agree with your point. Rephrased now as “To support the definition of vocabularies RDF provides the RDF-Schema language.”. It raises another point: should we rename the RDF Schema document?

> The introduction of classes is very awkward, and not really correct. I would
> suggest saying that they are categories which can be used to classify things: Bill
> rdf:type Human, Mona Lisa rdf:type Artwork, etc.. and avoid "group" (and "set")
> altogether. Then you don't need to immediately say that what you just said is
> false, which is not very reassuring for the reader.
> And you should mention rdf:type in the same breath.

Right. I was struggling with this. New phrasing in document.

> Need to clean up the terminology. It is very confusing to be told in quick
> succession:
>
> ==RDF Schema is a vocabulary description language
> ==FOAF is a *vocabulary* which is a *schema* which was one of the first *RDF
> Schemas* (Is RDF Schema one of the RDF Schemas?)
>
> ==DC is a vocabulary which is a *metadata element set* (Why isn't it an RDF
> Schema, like FOAF?)
> ==*schema*.org is a *vocabulary* (Given what it is called, why isn't it a schema?)
> ==SKOS is a *vocabulary* for publishing *schemes* (not schemas?) such as
> terminologies and thesauri. (Isn't a terminology a kind of vocabulary? So are
> schemes and vocabularies the same thing? Or was it schemas that were like
> vocabularies, and schemes are something different? And does SKOS describe
> them or is it an example of one of them? Or maybe both at the same time??)
>
> Personally, I would never want to see the word "schema" ever again.

In principle agreed. But the term “schema” is used in the outside world in ambiguous ways; nothing we can do about that. But I deleted the term “schema” from our own text; that is indeed better. Two specific things:
- Saying that schema.org is a vocabulary is, I think, correct.
- SKOS is a meta-vocabulary for specifying classification schemes/…

> Why don't we just say that anyone can publish an RDF vocabulary - a set of IRIs,
> typically from a single namespace - and specify what it is supposed to mean,
> and then everyone can then use that vocabulary to write RDF data. It is good
> practice to have the root namespace IRI link to something that defines the
> meaning of the vocabulary, and to re-use IRIs from existing vocabularies where
> you can, to make it easier to share meanings.
> And it is gold standard to publish an RDF graph which specifies at least part of
> your intended meanings for your
> vocabulary in a machine-readable way, if you can, using vocabularies intended
> for the purpose, for example RDFS or OWL or SKOS, because then they can be
> used by others in entailment rules. Examples are given below....

OK, will use this for rephrasing parts of this section.

> and then after you have talked about entailments in the semantics section, you
> might for example show how dbpedia uses rdfs:subClassOf to create category
> hierarchies, or how FOAF uses owl:inverseFunctionalProperty to imitate
> database keys.

Good suggestion. Will include this.

> "RDF Schema provides basic facilities for modeling semantics of RDF data. For
> a specification of these semantics the reader is referred to the RDF Semantics
> document [RDF11-MT]. For more comprehensive semantic modeling of RDF
> data the W3C recommends using the Web Ontology Language OWL
> [OWL2-OVERVIEW]."
>
> ?? I don't even know what this is supposed to mean. RDFS and OWL "model
> semantics of RDF data" ?? That is either meaningless or false, I'm not sure
> which. Maybe both. Also, this reads as though the W3C recommends using OWL
> over RDFS, which if true is news to me (and not likely to lead to a rapid take-up
> of RDF, if users have to read the OWL specs first.)

I wanted to say something nice about OWL! :). Seriously, suggest to rephrase as:

[[
For a formal specification of the semantics of the RDF Schema constructs the reader is referred to the RDF Semantics document [[RDF11-MT]]. Users interested in more comprehensive semantic modeling of RDF data might consider using the Web Ontology Language OWL [[OWL2-OVERVIEW]]
]]

> Section 5. The idea that all these different syntaxes are all ways of describing
> the same RDF graph structures is not immediately obvious, and I think is a major
> barrier to comprehension. Need to talk a little about concrete vs.
> abstract syntax, maybe not in those terms, but to get across the idea of the graph syntax
> being a level of abstraction higher than the particular notation used to describe it.

Good point. Included the following sentence in the first paragraph:

[[
However, different encodings of the same graph lead to exactly the same triples.
]]

I also suggest to include a graph diagram of the current example, and clarify the point about the abstract graph (added as a todo issue to the document).

> Having one simple but not entirely trivial example graph (with at least one bnode,
> at least two triples sharing a common object and one node used as both a
> subject and an object) written out in all the different notations would be a very
> useful thing to see. It would also hammer home the point about abstract graph
> syntax, especially if you also provided a graph diagram for it.

The current example has all these features, except for the bnode. I’ll add an issue about including a (separate?) example with bnodes.

> " therefore bringing the benefits of RDF to the JSON world. " Omit. Could be
> read as condescending. I am sure there are many who would say, it brings JSON
> sanity to the RDF world.

Deleted/rephrased.
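
The Section 5 point that different encodings of the same graph lead to exactly the same triples can be illustrated with a small Python sketch. Everything here is an assumption made for illustration: the parser handles only an IRI-only subset of N-Triples, the JSON layout is an ad hoc one invented for this sketch (it is not JSON-LD), and the `example.org` IRIs are hypothetical. Blank nodes are deliberately left out, since comparing graphs that contain them would need more than set equality.

```python
import json
import re

# Matches one N-Triples line whose three terms are all IRIs (a subset only).
NT_LINE = re.compile(r"^\s*<([^>]*)>\s+<([^>]*)>\s+<([^>]*)>\s*\.\s*$")


def parse_ntriples_iris_only(text):
    """Parse the IRI-only subset of N-Triples into a set of (s, p, o) tuples."""
    triples = set()
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        match = NT_LINE.match(line)
        if match is None:
            raise ValueError(f"unsupported line: {line!r}")
        triples.add(match.groups())
    return triples


def parse_adhoc_json(text):
    """Parse a hypothetical {subject: {predicate: [objects]}} JSON layout."""
    data = json.loads(text)
    return {
        (s, p, o)
        for s, props in data.items()
        for p, objects in props.items()
        for o in objects
    }


nt_doc = """
<http://example.org/bob> <http://example.org/knows> <http://example.org/alice> .
<http://example.org/bob> <http://example.org/likes> <http://example.org/mona-lisa> .
"""

json_doc = """
{
  "http://example.org/bob": {
    "http://example.org/likes": ["http://example.org/mona-lisa"],
    "http://example.org/knows": ["http://example.org/alice"]
  }
}
"""

# Two concrete notations, one abstract graph: the triple sets are equal.
same_graph = parse_ntriples_iris_only(nt_doc) == parse_adhoc_json(json_doc)
```

The two documents differ in notation and in the order of statements, yet both reduce to the same set of triples, which is the sense in which the graph sits one level of abstraction above any particular syntax.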
Received on Wednesday, 27 November 2013 15:23:39 UTC