Re: comments on primer so far from Frank Manola on 2002-11-05 (w3c-rdfcore-wg@w3.org from November 2002)

From: Frank Manola <fmanola@mitre.org>
Date: Mon, 04 Nov 2002 21:24:16 -0500
To: Frank Manola <fmanola@mitre.org>
CC: Brian McBride <bwm@hplb.hpl.hp.com>, RDF Core <w3c-rdfcore-wg@w3.org>
Message-ID: <3DC72BD0.8060405@mitre.org>
Brian--


Some comments on your comments (I'm not going to comment on all of them, 
just the ones where I either question the call, would like some more 
input, or otherwise feel like wrangling about):

Section 1:
[[If you were to allow me one silver bullet, one stylistic change you 
made just because I asked for it, it would be this one (he says not 
having read the rest of the document yet.) The first time a reader sees 
RDF they should see a graph, not RDF/XML. For me, it is very important 
to get the reader thinking about graphs, not XML, right from the get 
go.]] (Brian's comments are delimited by [[ ]].)

I understand your point.  The problem is that we've just got through 
talking about how useful RDF is for expressing information so it can be 
exchanged between applications, and so on.  While the model/abstract 
syntax is a graph, the only way the graph can be exchanged between 
applications is to write it down, and the normative syntax for doing 
that is RDF/XML.  I really do understand that the graph is the "essence" 
of RDF;  but it seems to me that at this point (where we say we're 
going to be "concrete"), we want to show folks how they're actually 
going to be writing stuff down.

"http://www.example.org/index.html has a creator whose value is John Smith"

[[The "whose value is" grates. Suggest: http://... has creator ...]]

I know "whose value is" grates;  the problem is trying to construct an 
English sentence that has a roughly parallel structure with the triple 
we want to write later (and also has the same structure no matter how 
many different properties we use in examples later).  Any such English 
is going to sound odd;  we've tried several different variants on this 
theme;  none of them is really ideal.  I'm prepared to use "has 
creator", but "has" ought to be a "noise word" as far as the property 
name is concerned;  the property name ought to be "creator", it seems to me.
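
To make the parallel concrete, here is a rough Python sketch (my own illustration, not anything from the Primer text) of the sentence and the triple side by side.  The Triple class and the to_english helper are hypothetical names;  the point is only that "has a" and "whose value is" are fixed noise words, while "creator" is the part that comes from the property.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """Hypothetical container for one statement: subject, predicate, object."""
    subject: str
    predicate: str
    obj: str


def to_english(t: Triple, label: str) -> str:
    # "has a" and "whose value is" are the same for every property;
    # only `label` (here "creator") varies with the predicate.
    return f"{t.subject} has a {label} whose value is {t.obj}"


stmt = Triple(
    "http://www.example.org/index.html",
    "http://purl.org/dc/elements/1.1/creator",
    "John Smith",
)
sentence = to_english(stmt, "creator")
```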

[[I think you can assume that the reader knows what an identifier is]]

I think so too;  the question is whether what the reader knows is what 
I'm trying to talk about here!

[[This section on URI's seems like a big barrier to the reader early on. 
I'd expect a primer to introduce stuff more gradually. In style, this is 
beginning to feel more like a text book than a primer]]

I understand.  The problem is that:

a. URIs are really fundamental;  if they don't understand that, it's 
hard to make a number of subsequent points in sec. 2.3 (e.g., about 
shared references and stuff)

b. without having introduced fragments, and without having introduced 
namespaces (in the maybe-to-be-deleted XML section), it's hard to 
introduce the QName abbreviation for triples, which means we have to 
write them all out (and the Primer was supposed to introduce this 
abbreviation).
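
As a minimal sketch of what the QName abbreviation buys us (Python, purely illustrative):  a prefix stands for a namespace URI, and the full URIref is the namespace with the local name appended.  The "ex" prefix binding and the expand function are assumptions made up for this example;  only the dc namespace is taken from the predicate quoted further down.

```python
# Prefix-to-namespace bindings. "ex" is a hypothetical binding for illustration.
PREFIXES = {
    "dc": "http://purl.org/dc/elements/1.1/",
    "ex": "http://www.example.org/",
}


def expand(qname: str, prefixes: dict = PREFIXES) -> str:
    """Expand an abbreviated name such as 'dc:creator' to a full URIref."""
    prefix, sep, local = qname.partition(":")
    if not sep or prefix not in prefixes:
        raise ValueError(f"no namespace bound for {qname!r}")
    # The full URIref is simply namespace + local name.
    return prefixes[prefix] + local


full_predicate = expand("dc:creator")
full_subject = expand("ex:index.html")
```

Without the abbreviation, every triple in the examples has to carry both full URIrefs written out.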

[[Do we really need this about XML? Is a basic understanding of XML a 
requirement on the reader?]]

Maybe not, and DanC complained about that too.  On the other hand, it's 
only a page, and as I said, I need (or at least I think I do) to 
introduce the namespace stuff somewhere, and that's half of the XML 
section.  What do you suggest?

"* a predicate http://purl.org/dc/elements/1.1/creator"
[[Is that spelt correctly? I seem to recall dc properties had an initial 
capital letter.]]

They do have an initial capital letter.  But the DC Recommendation 
"Expressing Simple Dublin Core in RDF/XML" recommends using lowercase 
for everything.

[[The mathematicians are having a lot of trouble deciding exactly what 
sort of graph an RDF graph is and Pat is staying away from having to 
get that "right". ]]

I'm taking your advice on this one.  NB:  The basic problem, IMO, which 
I've not heard mentioned, is that the "graph" in mathematics is 
typically described as modeling *one* relation, not an arbitrary number, 
as we do.  What we're really doing is plunking a bunch of separate 
graphs on top of each other (using a common set of nodes).  I don't 
relish trying to explain that, however.
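
For what it's worth, the "plunking" can be sketched in a few lines of Python.  This is only an illustration under my own assumptions:  the abbreviated names and the three triples are made up, and split_by_predicate is a hypothetical helper.  Each predicate yields its own single-relation graph (a set of subject/object edges), and all of those graphs share one set of nodes.

```python
from collections import defaultdict

# Made-up triples in abbreviated form, for illustration only.
triples = [
    ("ex:index.html", "dc:creator", "ex:staff/85740"),
    ("ex:index.html", "dc:language", "en"),
    ("ex:staff/85740", "ex:name", "John Smith"),
]


def split_by_predicate(triples):
    """Return one edge set per predicate: predicate -> {(subject, object), ...}."""
    layers = defaultdict(set)
    for s, p, o in triples:
        layers[p].add((s, o))
    return dict(layers)


layers = split_by_predicate(triples)

# The node set is common to all layers: every subject and object that appears.
nodes = {n for edges in layers.values() for edge in edges for n in edge}
```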

"These examples also illustrate that RDF uses URIrefs as predicates [[no 
they name predicates]] in RDF statements"

Actually, I think the URIrefs *are* predicates;  what they name is 
*properties*.  However, I'm going to try to avoid that somehow.

"* rows in a simple [[doesn't work for complicated ones then?]] 
relational database"

Remember that I'm talking about formats that correspond to RDF 
statements, not the information content of a collection of RDF 
statements.  The row that corresponds to an RDF statement (or triple) 
has three columns, which is pretty simple.
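
A minimal sketch of such a row, using Python's built-in sqlite3 module (the table name "statements" and its column names are my own choices for illustration):

```python
import sqlite3

# In-memory database with one three-column table; names are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE statements (subject TEXT, predicate TEXT, object TEXT)"
)

# One row corresponds to one RDF statement (triple).
conn.execute(
    "INSERT INTO statements VALUES (?, ?, ?)",
    (
        "http://www.example.org/index.html",
        "http://purl.org/dc/elements/1.1/creator",
        "John Smith",
    ),
)

rows = conn.execute(
    "SELECT subject, predicate, object FROM statements"
).fetchall()
```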

[["Bedford" is not the city, it's the name of the city. Similarly for 
state. Suggest rename properties to cityName, and stateName. Not sure 
what to rename street to.]]

I'm commenting on this because it's listed as an "error".  If I gave a 
URI for the city instead of the literal "Bedford", that wouldn't be the 
city either, that's another name for the city.  Would you have me say 
"cityName" then too?  "cityURI"?  I understand the distinction you're 
making, but I really am trying to indicate the city;  the only way I can 
do that is by using a name.

[[Thinking about it, do we really need to keep giving triples 
representations for everything. Won't the picture of the graph suffice?]]

Well, do I need to discuss nodeIDs in RDF/XML?  If I do, I think I need 
this sort of material.  One of the reasons that triples are used so 
frequently (you'll see a lot more of them as you go further) is that for 
illustrating a quick example, it's easier to write the triples than to 
draw the graph (however, if you want pictures of graphs for everything, 
I can certainly draw them).  Also, lots of people seem to find triples 
rather straightforward;  I know I do.  Perhaps more could be made of the 
point (which I really do try to make) that what is essential is the 
*graph*, and that *pictures* and sets of *triples* are merely different 
*representations* ("depictions"?) of the graph.

[[There is confusion here about what a triple is. Is it the abstraction 
(subject, predicate, object) or is it a line of N-Triples?]]

The intention was that a triple is a specific *instance* of a subject, 
predicate, object triple;  i.e., a "stating" as opposed to a "statement" 
(to rehash that old debate).

"it would be unwise [[wrong]] to assume that blank nodes from different 
graphs having the same node identifiers referred to the same resource 
[[suggests blank nodes refer to a specific resource, which in general 
they do not.]]".

Would it be better to substitute "thing" for "resource"?
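
To illustrate the point about node identifiers from different graphs, here is a rough Python sketch of combining two graphs while keeping their blank nodes apart.  Everything in it is an assumption for illustration:  terms are plain strings, anything starting with "_:" is treated as a blank node identifier, and the rename/merge helpers and the ex:city property are hypothetical.

```python
def rename(term: str, index: int) -> str:
    """Give a blank node identifier a per-graph name; leave other terms alone."""
    # Assumption of this sketch: only blank node identifiers start with "_:".
    return f"_:g{index}_{term[2:]}" if term.startswith("_:") else term


def merge(graphs):
    """Union the graphs after renaming blank nodes apart, graph by graph."""
    merged = set()
    for i, graph in enumerate(graphs):
        for s, p, o in graph:
            merged.add((rename(s, i), p, rename(o, i)))
    return merged


# Two graphs that happen to use the same node identifier "_:a".
g1 = [("_:a", "ex:city", "Bedford")]
g2 = [("_:a", "ex:city", "Boston")]

naive = set(g1) | set(g2)   # treats both "_:a" nodes as one node
merged = merge([g1, g2])    # keeps them as two distinct nodes

naive_subjects = {s for s, _, _ in naive}
merged_subjects = {s for s, _, _ in merged}
```

The naive union ends up with one node that has two cities, which is exactly the unwarranted assumption the sentence warns about.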


"and RDF would not see anything wrong with this." [[No, strongly 
disagree. We are talking about what an RDF processor would do here, and 
an RDF processor that is datatype aware would barf. We need to describe 
the different behaviour to expect between software that understands the 
datatype and software that does not.]]

What I was talking about was what was visible to *RDF*, and *RDF* knows 
nothing about any datatypes, as far as I know.  I agree that we probably 
need to distinguish this from what a datatype-aware processor would do 
(assuming DanC lets us talk about "processors"), but my understanding is 
that such a processor would need to know *both* (a) what RDF defines, 
and (b) what a specific collection of datatypes defines.  Moreover, I 
don't imagine there would be "generic" datatype-aware RDF processors, 
but rather RDF processors aware of specific collections of datatypes 
(like XML Schema datatypes).  If you gave one of those processors a 
datatype that it wasn't "aware" of, I wouldn't expect it to barf if it 
got a mismatched (datatype,lexical form) pair as in this example;  or at 
least not to barf in the same way.  It might say "this isn't a datatype 
I'm able to natively process", but it wouldn't (because it couldn't) say 
"this pair makes no sense" according to this datatype.

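A rough sketch of the distinction I have in mind, in Python.  This is not a description of any real RDF processor:  the check_typed_literal function, the result strings, the simplified integer check, and the example.org datatype URI are all assumptions for illustration.  The processor is aware of one specific collection of datatypes (here just xsd:integer);  for anything else it can only report that it doesn't know the datatype.

```python
import re

XSD = "http://www.w3.org/2001/XMLSchema#"

# The specific collection of datatypes this hypothetical processor knows.
# The integer check is a simplification: optional sign, then ASCII digits.
KNOWN_DATATYPES = {
    XSD + "integer": lambda lex: re.fullmatch(r"[+-]?[0-9]+", lex) is not None,
}


def check_typed_literal(lexical: str, datatype: str) -> str:
    """Classify a (datatype, lexical form) pair from this processor's viewpoint."""
    validator = KNOWN_DATATYPES.get(datatype)
    if validator is None:
        # Not a datatype it can natively process: it cannot judge the pair.
        return "unknown datatype"
    return "ok" if validator(lexical) else "ill-formed"


good = check_typed_literal("27", XSD + "integer")
bad = check_typed_literal("pumpkin", XSD + "integer")
# Hypothetical datatype URI the processor has never heard of.
unknown = check_typed_literal("pumpkin", "http://www.example.org/types#age")
```

Plain RDF, with no datatype knowledge at all, corresponds to an empty KNOWN_DATATYPES table:  every pair comes back as "unknown datatype" and nothing is rejected.
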
--Frank





-- 
Frank Manola                   The MITRE Corporation
202 Burlington Road, MS A345   Bedford, MA 01730-1420
mailto:fmanola@mitre.org       voice: 781-271-8147   FAX: 781-271-875
Received on Monday, 4 November 2002 21:11:36 UTC