RE: Post format from Patrick.Stickler@nokia.com on 2001-11-07 (w3c-rdfcore-wg@w3.org from November 2001)

From: <Patrick.Stickler@nokia.com>
Date: Wed, 7 Nov 2001 13:27:37 +0200
To: bwm@hplb.hpl.hp.com, w3c-rdfcore-wg@w3.org
Message-ID: <2BF0AD29BC31FE46B7887732114404316216FD@trebe003.NOE.Nokia.com>
Will try.

Patrick

> -----Original Message-----
> From: ext Brian McBride [mailto:bwm@hplb.hpl.hp.com]
> Sent: 07 November, 2001 13:06
> To: rdf core
> Subject: Post format
> 
> 
> I note that the copy of the post in the archive:
> 
>    
> http://lists.w3.org/Archives/Public/w3c-rdfcore-wg/2001Nov/0184.html
> 
> is not the same as the version I received in my inbox.  The archive
> version has lost some of the "this was copied from a previous message"
> structure.  I suspect this may be due to the original post being in
> HTML or some such.
> 
> Could folks please stick to using plain text.  It's kinda old
> fashioned, but it works.
> 
> Brian
> 
> 
> Patrick.Stickler@nokia.com wrote:
> 
> >  
> > 
> >     -----Original Message-----
> >     From: ext Pat Hayes [mailto:phayes@ai.uwf.edu]
> >     Sent: 07 November, 2001 04:39
> >     To: Stickler Patrick (NRC/Tampere)
> >     Cc: w3c-rdfcore-wg@w3.org
> >     Subject: RE: Subject literals
> > 
> >>     > Right. You need to extend the Ntriples notation slightly to be
> >>     > able to fully capture the structures that can be built. One
> >>     > proposal (still not adopted) is to allow nodeIds (the new name
> >>     > for the _:x labels) to identify not just blank nodes but also
> >>     > literal nodes. So the graph I had in an earlier message:
> >>     >
> >>     > aaa ---eg:prop--->10--rdf:type--->xsd:integer
> >>     >
> >>     > could be written in Ntriples++ as:
> >>     >
> >>     > aaa eg:prop _:1:"10" .
> >>     > _:1 rdf:type xsd:integer .
> >>
> >>     Well, now I'm just gonzo confused (a common state for me these
> >>     days, it seems ;-)
> >>
> >>     Exactly what is the difference between this "new"
> >>     representation
> >>
> >>       aaa eg:prop _:1:"10" .
> >>       _:1 rdf:type xsd:integer .
> >>
> >>     and
> >>
> >>       aaa eg:prop <genid:123> .
> >>       <genid:123> rdf:value "10" .
> >>       <genid:123> rdf:type xsd:integer .
> >>
> >>     aside from the fact that the literal value is now part
> >>     of the *unique* identifier?
> > 
> > 
> >     The first one has three nodes and two edges; the second one has
> >     four nodes and three edges.
> > 
> > 
> >     Graphs in ascii-art, respectively (view in Courier):
> > 
> > 
> >     aaa ---eg:prop--> "10" ---rdf:type--->xsd:integer
> > 
> > 
> >     aaa ---eg:prop-->[   ]---rdf:value--->"10"
> >                        |
> >                        '---rdf:type--->xsd:integer
> > 
> > 
> >     The second graph has a blank node in the middle.
> > 
> > So labels on bNodes are just a means of compression, in the case of 
> > literals, to avoid the extra rdf:value arc?  
> >  
> > 
> > And how are labels represented in, e.g., a set of triples describing
> > that compressed subgraph?
> > 
> > You'd have to expand that out anyway to some kind of arc (statement)
> > based on the node's identity, so what exactly does it buy us?
> > 
> >  
> > 
> > Sorry I missed the bNode discussions, and I don't want to open up a
> > closed issue; I would just like to at least understand the key
> > benefits of the label representation as opposed to the former
> > anonymous node representation.
> > 
> >  
> > 
> >>     And since the label of the node is now unique, why
> >>     then not use a URI.
> > 
> > 
> >     That gets into another debate, which we have had to exhaustion,
> >     and decided that literals and bnodes were to be permitted. Done
> >     deal.
> > 
> > 
> >     But be careful with that 'label'. The nodeIDs in Ntriples are not
> >     in the graph itself: they are just used by Ntriples to keep track
> >     of which node is which in its lexicalization of the graph
> >     structure.
> > 
> > 
> >>     I.e. why not just
> >>
> >>       aaa eg:prop <xsd:integer:10> .
> >>
> >>     and be done with it?
> > 
> > 
> >     Well, what's that in a graph? Is 'xsd:integer:10' a node label?
> > 
> > It's a URI, and hence a resource. Thus it's a uriref and it is a
> > label. But the typing is "built in".
> > 
> >       
> > 
> >     If so, I tend to agree, that would certainly make everything a
> >     hell of a lot simpler (even if it does throw away several weeks'
> >     work :-). Literals wear their datatype on their sleeves, they
> >     have a single globally fixed interpretation, are never ambiguous;
> >     end of story.
> > 
> > Exactly. That's the point of the URV encoding. No questions about
> > data type ever, even beyond RDF space.
> >  
> > 
> > Not that I wouldn't hate to see weeks of work thrown away ;-) 
> > 
> >>
> >>     Interpretation of literals is for applications above the RDF
> >>     space anyway, right? So why not just use a self-contained
> >>     package of value and type, which doesn't get munged when binding
> >>     to query variables employing inference based on subClass
> >>     relations?
> > 
> > 
> >     Right, good point.
> > 
> >>
> >>     > where the subject of the second triple is the same nodeID as
> >>     > the object of the first one. The general rule to make a graph
> >>     > from such a document is: make a separate graph for each triple,
> >>     > then merge all nodes with the same nodeID or uriref label; then
> >>     > erase the nodeIDs.
> >>     >
> >>     > Now, the examples given above might look like this:
> >>     > _:1:"fi" rdf:type <urn:iso:3166_1> .
> >>     > _:2:"fi" rdf:type <urn:iso:639> .
> >>     > <urn:foo> xyz:someProperty _:1:"fi" .
> >>
> >>     Well, that's *a lot* different than the earlier examples
> >>     which had the object nodes labeled identically. This treatment
> >>     seems the same to me as the current "genid:" approach
> >>     which of course is required in order to get to triples.
> >>
> >>     Each bNode has a "system" identity, and statements are
> >>     expressed using that system identity as the subject. And
> >>     in essence, that system identity is a kind of "local URI".
> > 
> > 
> >     I'm lost. I really don't follow what you are saying here.
> > 
> > 
> >>     So your label really *is* the same as a URI, but it's
> >>     the URI of a resource node (or bNode) not the literal itself,
> >>     and properties (arcs) hung on that node are properties of
> >>     the object for that particular statement, not the literal.
> > 
> > 
> >     Think of the graph as follows: it's the NODES that denote things.
> >     Nodes with a uriref label denote the resource with that uriref.
> >     Blank nodes denote things, but we don't have names for them.
> >     Literal nodes (in my understanding) are like uriref nodes in that
> >     they denote through their labels, but literal labels denote
> >     things by a different route than urirefs; their meaning is
> >     determined by a datatyping scheme rather than by an
> >     interpretation.
> > 
> > 
> >     Now, nodeIDs ('_:2' and so on) are not mentioned, because they
> >     are not in the graph at all; they are only used by an Ntriples
> >     parser to keep track of the correspondence between labels in
> >     triples and nodes in the graph.
> > 
> > OK, things are becoming clearer. The literal itself does not
> > constitute the identity of the node, only its lexical representation,
> > just as a URI may represent the lexical identity of a global resource
> > in the serialization. Right. Got it. But still, the fact that a uriref
> > label is used as a nodeID but a literal label is not seems a little
> > messy (for lack of a more technical term ;-)
> > 
> > I guess that was why I was equating a literal label with a nodeID.
> > 
> >  
> > 
> > It's interesting that, if a URV approach were adopted wholesale, then
> > literals as they are now could be eliminated entirely, interpreting
> > untyped content data values in XML/RDF or NTriples serializations as
> > implicit definitions of <xsd:anySimpleType:*>, i.e.
> > 
> >  
> > 
> >    <someProperty>foo</someProperty>
> > 
> >  
> > 
> > becomes <xsd:anySimpleType:foo> in the graph. If someone wants
> > interpretation of the value according to some other data type, then
> > they have to declare it locally using an explicit URV encoding, e.g.
> > 
> >  
> > 
> >    <someProperty rdf:resource="xsd:token:foo"/>
> > 
> >  
> > 
> > and we have the locally defined <xsd:token:foo> as the object node of
> > the property in the graph.
> > 
> >  
> > 
> > Then, label would always equate to nodeID and in fact label could be
> > discarded and we'd just have nodeIDs, all of which are urirefs. Eh? Or
> > is that a bit too radical ;-)
> > 
> >  
> > 
> >     It uses fewer nodes, for one thing;  
> > 
> >      
> > 
> > Fair enough. I'm all for more efficient representations.
> > 
> >     but more significantly (IMHO), it allows the datatype 'context' to
> >     be inferred from other parts of the graph by using RDFS reasoning.
> >     However, I confess that the issue you have raised about
> >     inappropriate bindings has got me more worried about this than I
> >     was previously.
> > 
> >  
> > 
> > Well, hopefully I've not worried you needlessly. I think that the
> > issue of RDF not providing any kind of compilation of lexical forms
> > into canonical representations, and of a descriptive interpretation of
> > rdfs:range presuming such a canonical representation, does need to be
> > addressed. Otherwise we risk bindings that cannot be reliably
> > interpreted according to the inferred data type.
> > 
> >  
> > 
> > Cheers,
> > 
> >  
> > 
> > Patrick
>
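
The node and edge counts given in the quoted thread for the two
representations, together with the rule "merge all nodes with the same
nodeID or uriref label; then erase the nodeIDs", can be checked with a
small sketch. This is an illustration only: the tuple encoding of
triples, the helper name build_graph, and the choice to keep every
unlabelled literal occurrence as its own node are assumptions made here,
not part of any Ntriples proposal.

```python
import itertools


def build_graph(triples):
    """Merge (subject, predicate, object) term triples into nodes and edges.

    Hypothetical helper: nodes sharing a nodeID or a uriref label are merged;
    each plain literal occurrence is kept as a separate node (an assumption).
    """
    fresh = itertools.count()
    nodes = {}  # internal key -> label (None for an unlabelled blank node)
    edges = []

    def node_for(term):
        if term.startswith("_:"):
            # '_:1:"10"' splits into nodeID '_:1' and literal label '"10"'.
            parts = term.split(":", 2)
            key = "_:" + parts[1]
            label = parts[2] if len(parts) == 3 else None
            if label is not None or key not in nodes:
                nodes[key] = label
        elif term.startswith('"'):
            # Plain literal: not merged with other occurrences.
            key = ("literal", next(fresh))
            nodes[key] = term
        else:
            # Anything else is treated as a uriref label and merged by name.
            key = term
            nodes[key] = term
        return key

    for s, p, o in triples:
        edges.append((node_for(s), p, node_for(o)))
    return nodes, edges


labelled = [
    ("aaa", "eg:prop", '_:1:"10"'),
    ("_:1", "rdf:type", "xsd:integer"),
]
expanded = [
    ("aaa", "eg:prop", "<genid:123>"),
    ("<genid:123>", "rdf:value", '"10"'),
    ("<genid:123>", "rdf:type", "xsd:integer"),
]

for name, triples in (("labelled", labelled), ("expanded", expanded)):
    nodes, edges = build_graph(triples)
    # "Erasing the nodeIDs": only the labels (nodes.values()) belong to the graph.
    print(name, len(nodes), "nodes,", len(edges), "edges")
```

Under these assumptions the labelled form yields three nodes and two
edges and the expanded form four nodes and three edges, matching the
counts in the thread. The keys of the nodes dictionary play the role of
the parser's nodeIDs and are discarded once the graph is built.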
Received on Wednesday, 7 November 2001 06:27:57 UTC