Re: review of LCC documents as of 26 December 2002 from Peter F. Patel-Schneider on 2003-01-14 (www-rdf-comments@w3.org from January to March 2003)

From: Peter F. Patel-Schneider <pfps@research.bell-labs.com>
Date: Tue, 14 Jan 2003 12:40:37 -0500 (EST)
To: dave.beckett@bristol.ac.uk
Cc: www-rdf-comments@w3.org
Message-Id: <20030114.124037.68543184.pfps@research.bell-labs.com>
From: Dave Beckett <dave.beckett@bristol.ac.uk>
Subject: Re: review of LCC documents as of 26 December 2002 
Date: Tue, 14 Jan 2003 16:41:00 +0000

> >>>"Peter F. Patel-Schneider" said:
> 
> <snip/>
> 
> > Dave said:
> > > It has been noted and fixed.  
> > 
> > I disagree.  See below for an extensive comment.
> > 
> > > Generated blank node identifiers are done by the
> > > generated-blank-node-id() notation which says:
> > >    "A string value for a new distinct generated Blank Node
> > >    Identifiers as defined in section 5.2 Identifiers." 
> > 
> > Aside from being grammatically incorrect, this statement is extremely
> > difficult to understand, and even incorrect.  I don't read any requirement
> > here that different ``calls'' to this action must result in different
> > strings being returned.  It would be much better to explicitly state that
> > generated-blank-node-id() returns a different string each time that it is
> > called.
> 
> Please can you suggest a better phrase for that notation.
> 
> I'm unsure how 'distinct' in the existing notation comment doesn't
> help with this; it implicitly requires a different blank node
> identifier (and hence string value of it) to be generated each time.

generated-blank-node-id() could be defined as something like

	Return a string that is a valid blank node identifier, different
	from any string produced by generated-blank-node-id() on the input
	document so far.

This is not ideal, but is much better than what is there now.
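
To make the suggested wording concrete, here is a minimal Python sketch of
one way it could be satisfied. The class name, the "genid" prefix and the
counter are illustrative assumptions, not anything taken from the syntax
document, and the sketch does not deal with clashes against rdf:nodeID
values.

```python
import itertools


class BlankNodeIdGenerator:
    """Hypothetical helper; one instance is assumed per input document."""

    def __init__(self, prefix="genid"):
        # "genid" is an assumed prefix, chosen only for illustration.
        self._prefix = prefix
        self._counter = itertools.count(1)
        self._issued = set()

    def generated_blank_node_id(self):
        # Advancing the counter on every call makes the result differ from
        # every string produced earlier for this document.
        new_id = f"{self._prefix}{next(self._counter)}"
        assert new_id not in self._issued
        self._issued.add(new_id)
        return new_id


gen = BlankNodeIdGenerator()
ids = [gen.generated_blank_node_id() for _ in range(5)]
```

The point of the sketch is only that "different on each call" is a property
of the sequence of calls, which is what the current wording fails to state.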


> > > Blank node identifiers from rdf:nodeID are done by the bnodeid()
> > > notation which says:
> > >   "bnodeid(identifier := value) Create a new Blank Node Identifier Event."
> > > which refers to section 5.2
> > > 
> > > 5.2 tells you that you do not have to use the exact blank node
> > > identifier given but can use any method that retains the blank node
> > > identity.
> > 
> > Well part of the problem is that there are no blank nodes in XML/RDF, so
> > you can't rely on preserving blank node identity (really distinctness)
> > to get what you want.
> 
> The full quote from 5.2 you reference is:
> 
>   [[These identifiers may also be generated as part of the mapping
>   from the RDF/XML to the graph for new distinct blank nodes. Such
>   generated blank node identifiers must not clash with any blank node
>   identifiers from rdf:nodeID attribute values. This can be
>   implemented by any method that preserves the distinct identity of
>   all the blank nodes in the graph. One method would be to add a
>   constant prefix to all the rdf:nodeID attribute values and ensure
>   no generated blank node identifiers ever used that prefix.]]
>   -- http://www.w3.org/2001/sw/RDFCore/TR/WD-rdf-syntax-grammar-20030117/#section-Identifiers
> 
> This is in terms of 'blank node identifiers' in the RDF/XML
> mapping to distinct 'blank nodes' in the RDF graph. This is a 1-1
> mapping.  There is no mention of blank nodes in RDF/XML.
> The syntax's correct name is RDF/XML.
> 
> The requirement for the blank nodes to have distinct identity means
> that the blank node identifiers also must have it.

But there are no blank node identifiers or blank nodes in the RDF/XML
document.  The point is that blank node identifiers need to be generated
during the processing of RDF/XML.  Further, blank nodes themselves are not
a part of the processing of RDF/XML at all.  Appealing to the distinctness
of something that neither exists nor is produced doesn't provide any
grounding.  

I suppose that you could introduce the notion of an RDF/XML node element
that does not have a URI subject and then talk about mapping these into
distinct blank node identifiers, but you haven't done this.

> > > The suggested method given here is to "add a constant prefix to all
> > > the rdf:nodeID attribute values and ensure no generated blank node
> > > identifiers ever used that prefix." but that is not required.  For
> > > example, generated-blank-node-id() could make all names "genid"+number
> > > and bnodeid() could apply some different constant prefix.
> > 
> > This method is not allowed.  See below for more details.
> > 
> > > The more expensive alternative you give would be to keep all blank
> > > node identifiers around and check for clashes.  There are medium-cost
> > > alternatives too, such as renaming any rdf:nodeID values that start
> > > with "genid" to a new identifier.
> > 
> > This last method is also not allowed.
> 
> Quoting the same section above
>   [[ This can be implemented by any method that preserves the distinct
>   identity of all the blank nodes in the graph.]]
>   -- http://www.w3.org/2001/sw/RDFCore/TR/WD-rdf-syntax-grammar-20030117/#section-Identifiers
> 
> these methods match this and are thus allowed.

But this information contradicts the specific information in Section 7.
(See below.)

> > 		Problems with Blank Nodes and rdf:nodeID
> > 		in the LCC XML/RDF Syntax Document
> > 
> > The handling of blank nodes is still problematic in the LCC version of the
> > XML/RDF document.  
> > 
> > The intent is clear. ...
> 
> Excellent.
> 
> 
> > ...  Each nodeElement that does not otherwise get a
> > subject is given a blank node identifier as a subject.  The string-value of
> > this blank node identifier is to be different from the string-value of every
> > other blank node identifier resulting from the parsing of the RDF/XML
> > document.
> > 
> > However, the document does not follow this intent.  
> > 
> > First, in section 5.2, the document only says that ``generated blank node
> > identifiers must not clash with any blank node identifiers from rdf:nodeID
> > attribute values.''  This allows
> > 
> > <rdf:RDF xmlns:rdf="..."
> >          xmlns:ex="...">
> > 
> > <rdf:Description>
> >   <ex:foo>
> >    <rdf:Description />
> >   </ex:foo>
> > </rdf:Description>
> > 
> > </rdf:RDF>
> 
> Legal.
> 
> > to generate the following triple
> > 
> > _:x <ex:foo> _:x .
> 
> Incorrect.
> 
> On generating blank nodes identifiers:
>   [[ This can be implemented by any method that preserves the distinct
>   identity of all the blank nodes in the graph.]]
>   -- http://www.w3.org/2001/sw/RDFCore/TR/WD-rdf-syntax-grammar-20030117/#section-Identifiers
> 
> This method of implementing generated-blank-node-id() does not
> preserve the distinctiveness of the two blank nodes, is thus
> inadequate and incorrect.  The partial quote you give from 5.2 does
> not give all the requirements.

Although the intent of the other requirements can be determined, they are
sufficiently poorly written that the above mapping will satisfy
them.  

In particular, the above mapping satisfies the requirement that all the
blank nodes in the RDF graph have distinct identity.  In fact, there is no
way of violating the distinct identity of any blank node in an RDF graph.
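
As an illustration of this reading, the following Python sketch implements
a deliberately degenerate generator and checks it against the one
requirement that the sentence quoted from 5.2 states explicitly, namely no
clash with rdf:nodeID values. All function names here are hypothetical, and
the check encodes my literal reading of that sentence, not the intent of
the document.

```python
def degenerate_generated_blank_node_id():
    # Always returns the same string; nothing in the quoted sentence
    # explicitly forbids this.
    return "x"


def clashes_with_node_ids(generated_ids, node_id_values):
    # The only explicit check: a generated identifier must not equal any
    # identifier that came from an rdf:nodeID attribute value.
    return any(g in node_id_values for g in generated_ids)


# The first example document contains no rdf:nodeID attributes at all.
node_id_values = set()

subject = degenerate_generated_blank_node_id()
obj = degenerate_generated_blank_node_id()
triple = f"_:{subject} <ex:foo> _:{obj} ."
no_clash = not clashes_with_node_ids([subject, obj], node_id_values)
```

Under these assumptions the no-clash check passes and the triple produced
is the _:x <ex:foo> _:x . given above.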

> > Second, a blank node identifier in the linear representation of an RDF
> > Graph is generated from the string-value of the subject of the event.  For
> > events that come from nodeElements that have an rdf:nodeID attribute, this
> > value is determined in 7.2.11 as follows
> > 	If there is an attribute a with a.URI=rdf:nodeID,
> > 	then e.subject := bnodeid(identifier:=a.string-value)
> > From 6.1.7 the string value of this subject is the concatenation of "_:"
> > and the value of its identifier accessor.
> > 
> > This means that 
> > 
> > <rdf:RDF xmlns:rdf="..."
> >          xmlns:ex="...">
> > 
> > <rdf:Description rdf:nodeID="HI">
> >   <ex:foo>
> >    <rdf:Description rdf:nodeID="BYE" />
> >   </ex:foo>
> > </rdf:Description>
> > 
> > </rdf:RDF>
> 
> Legal.
> 
> > MUST generate the following triple
> > 
> > _:HI <ex:foo> _:BYE .
> 
> This is not required.

How can a change in the identifier be reconciled with 
	bnodeid(identifier:=a.string-value)
particularly in light of 
	A:=B	Assigns A the value B.
As far as I can see
	bnodeid(identifier:=a.string-value)
creates a Blank Node Identifier Event whose identifier is a.string-value.

This notation is used extensively for the other events, such as plain and
typed literals.  Each time the accessor then returns the value assigned.
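
A minimal Python sketch of this literal reading follows. The class and
property names are my own and only mirror the notation; the sketch assumes
that := is plain assignment and that the string-value is "_:" followed by
the identifier accessor, as in 6.1.7.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BlankNodeIdentifierEvent:
    # Holds exactly the value assigned by bnodeid(identifier := value).
    identifier: str

    @property
    def string_value(self):
        # 6.1.7 as read here: "_:" concatenated with the identifier accessor.
        return "_:" + self.identifier


def bnodeid(identifier):
    # Literal reading: the accessor returns what was assigned, unchanged.
    return BlankNodeIdentifierEvent(identifier=identifier)


# The rdf:nodeID attribute values from the second example document.
subject = bnodeid(identifier="HI")
obj = bnodeid(identifier="BYE")
triple = f"{subject.string_value} <ex:foo> {obj.string_value} ."
```

Read this way, there is no step at which "HI" or "BYE" could be renamed, so
the triple comes out as _:HI <ex:foo> _:BYE .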

[...]

> At present I don't see this as a critical comment for last call but
> the wording looks like it can be improved.

At present I still view the method for translating from RDF/XML to RDF
Graphs to be broken.

> Dave

Peter F. Patel-Schneider
Bell Labs Research
Lucent Technologies
Received on Tuesday, 14 January 2003 12:40:55 UTC