Re: review of LCC documents as of 26 December 2002 from Dave Beckett on 2003-01-14 (www-rdf-comments@w3.org from January to March 2003)

From: Dave Beckett <dave.beckett@bristol.ac.uk>
Date: Tue, 14 Jan 2003 16:41:00 +0000
To: "Peter F. Patel-Schneider" <pfps@research.bell-labs.com>
cc: www-rdf-comments@w3.org
Message-ID: <25265.1042562460@hoth.ilrt.bris.ac.uk>
>>>"Peter F. Patel-Schneider" said:

<snip/>

> Dave said:
> > It has been noted and fixed.  
> 
> I disagree.  See below for an extensive comment.
> 
> > Generated blank node identifiers are done by the
> > generated-blank-node-id() notation which says:
> >    "A string value for a new distinct generated Blank Node
> >    Identifiers as defined in section 5.2 Identifiers." 
> 
> Aside from being grammatically incorrect, this statement is extremely
> difficult to understand, and even incorrect.  I don't read any requirement
> here that different ``calls'' to this action must result in different
> strings being returned.  It would be much better to explicitly state that
> generated-blank-node-id() returns a different string each time that it is
> called.

Please can you suggest a better phrase for that notation.

I'm unsure how 'distinct' in the existing notation comment doesn't
help with this; it implicitly requires a different blank node
identifier (and hence string value of it) to be generated each time.
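
As a minimal sketch of what 'distinct' amounts to operationally
(Python; the class name and the "genid" prefix are assumptions made
for illustration, not wording from the draft), a counter is enough to
make every call return a different string:

```python
import itertools


class BlankNodeIdGenerator:
    """Hypothetical sketch of generated-blank-node-id()."""

    def __init__(self, prefix="genid"):
        # "genid" is an illustrative prefix, not mandated by the draft.
        self._prefix = prefix
        self._counter = itertools.count(1)

    def generate_blank_node_id(self):
        # Each call consumes the next counter value, so two calls on the
        # same generator never return the same string.
        return f"{self._prefix}{next(self._counter)}"
```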


> > Blank node identifiers from rdf:nodeID are done by the bnodeid()
> > notation which says:
> >   "bnodeid(identifier := value) Create a new Blank Node Identifier Event."
> > which refers to section 5.2
> > 
> > 5.2 tells you that you do not have to use the exact blank node
> > identifier given but can use any method that retains the blank node
> > identity.
> 
> Well part of the problem is that there are no blank nodes in XML/RDF, so
> you can't rely on preserving blank node identity (really distinctness)
> to get what you want.

The full quote from 5.2 you reference is:

  [[These identifiers may also be generated as part of the mapping
  from the RDF/XML to the graph for new distinct blank nodes. Such
  generated blank node identifiers must not clash with any blank node
  identifiers from rdf:nodeID attribute values. This can be
  implemented by any method that preserves the distinct identity of
  all the blank nodes in the graph. One method would be to add a
  constant prefix to all the rdf:nodeID attribute values and ensure
  no generated blank node identifiers ever used that prefix.]]
  -- http://www.w3.org/2001/sw/RDFCore/TR/WD-rdf-syntax-grammar-20030117/#section-Identifiers

This is in terms of 'blank node identifiers' in the RDF/XML
mapping to distinct 'blank nodes' in the RDF graph.  This is a 1-1
mapping.  There is no mention of blank nodes in RDF/XML.
The syntax's correct name is RDF/XML.

The requirement for the blank nodes to have distinct identity means
that the blank node identifiers also must have it.


> 
> > The suggested method given here is to "add a constant prefix to all
> > the rdf:nodeID attribute values and ensure no generated blank node
> > identifiers ever used that prefix." but that is not required.  For
> > example, generate-blank-node-id() could make all names "genid"+number
> > and bnodeid() could apply some different constant prefix.
> 
> This method is not allowed.  See below for more details.
> 
> > The more expensive alternative you give would be to keep all blank
> > node identifiers around and check for clashes.  There are medium-cost
> > alternatives too, such as renaming any rdf:nodeID values that start
> > with "genid" to a new identifier.
> 
> This last method is also not allowed.

Quoting the same section above
  [[ This can be implemented by any method that preserves the distinct
  identity of all the blank nodes in the graph.]]
  -- http://www.w3.org/2001/sw/RDFCore/TR/WD-rdf-syntax-grammar-20030117/#section-Identifiers

these methods match this and are thus allowed.
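
A rough sketch of the renaming alternative, assuming generated
identifiers are "genid"+number (Python; the class and method names
are hypothetical).  Any rdf:nodeID value that starts with "genid" is
swapped for a fresh generated identifier and the choice is
remembered, so distinct blank nodes keep distinct identifiers:

```python
import itertools


class RenamingBnodeMapper:
    """Hypothetical sketch: rename rdf:nodeID values that start with "genid"."""

    GENERATED_PREFIX = "genid"  # illustrative, not required by the draft

    def __init__(self):
        self._counter = itertools.count(1)
        self._renamed = {}  # original rdf:nodeID value -> replacement

    def generate_blank_node_id(self):
        return f"{self.GENERATED_PREFIX}{next(self._counter)}"

    def bnodeid(self, node_id):
        # Values outside the generated namespace cannot clash; keep them.
        if not node_id.startswith(self.GENERATED_PREFIX):
            return node_id
        # Otherwise use a fresh generated identifier, the same one every
        # time this rdf:nodeID value is seen again.
        if node_id not in self._renamed:
            self._renamed[node_id] = self.generate_blank_node_id()
        return self._renamed[node_id]
```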


> 		Problems with Blank Nodes and rdf:nodeID
> 		in the LCC XML/RDF Syntax Document
> 
> The handling of blank nodes is still problematic in the LCC version of the
> XML/RDF document.  
> 
> The intent is clear. ...

Excellent.


> ...  Each nodeElement that does not otherwise get a
> subject is given a blank node identifier as a subject.  The string-value of
> this blank node identifier is to be different from the string-value of every
> other blank node identifier resulting from the parsing of the RDF/XML
> document.
> 
> However, the document does not follow this intent.  
> 
> First, in section 5.2, the document only says that ``generated blank node
> identifiers must not clash with any blank node identifiers from rdf:nodeID
> attribute values.''  This allows
> 
> <rdf:RDF xmlns:rdf="..."
>          xmlns:ex="...">
> 
> <rdf:Description>
>   <ex:foo>
>    <rdf:Description />
>   </ex:foo>
> </rdf:Description>
> 
> </rdf:RDF>

Legal.

> to generate the following triple
> 
> _:x <ex:foo> _:x .

Incorrect.

On generating blank node identifiers:
  [[ This can be implemented by any method that preserves the distinct
  identity of all the blank nodes in the graph.]]
  -- http://www.w3.org/2001/sw/RDFCore/TR/WD-rdf-syntax-grammar-20030117/#section-Identifiers

This method of implementing generate-blank-node-id() does not
preserve the distinctiveness of the two blank nodes, and is thus
inadequate and incorrect.  The partial quote you give from 5.2 does
not give all the requirements.
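
One way to read "preserves the distinct identity" is that the
labelling from blank nodes to identifier strings is one-to-one.  A
small check along those lines (Python; the function name and the
"outer"/"inner" keys are made up for this illustration):

```python
def preserves_distinct_identity(labelling):
    """Return True if no two blank nodes share an identifier string.

    `labelling` maps each blank node (any hashable key) to the
    identifier string chosen for it.
    """
    labels = list(labelling.values())
    return len(labels) == len(set(labels))


# The two rdf:Description elements in the example are two blank nodes.
bad = {"outer": "x", "inner": "x"}   # would yield  _:x <ex:foo> _:x .
good = {"outer": "x", "inner": "y"}  # would yield  _:x <ex:foo> _:y .
```

Under this reading, preserves_distinct_identity(bad) is False and
preserves_distinct_identity(good) is True.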



> Second, a blank node identifier in the linear representation of an RDF
> Graph is generated from the string-value of the subject of the event.  For
> events that come from nodeElements that have an rdf:nodeID attribute, this
> value is determined in 7.2.11 as follows
> 	If there is an atribute a with a.URI=rdf:nodeID,
> 	then e.subject := bnodeid(identifier:=a.string-value)
> From 6.1.7 the string value of this subject is the concatenation of "_:"
> and the value of its identifier accessor.
> 
> This means that 
> 
> <rdf:RDF xmlns:rdf="..."
>          xmlns:ex="...">
> 
> <rdf:Description rdf:nodeID="HI">
>   <ex:foo>
>    <rdf:Description rdf:nodeID="BYE" />
>   </ex:foo>
> </rdf:Description>
> 
> </rdf:RDF>

Legal.

> MUST generate the following triple
> 
> _:HI <ex:foo> _:BYE .

This is not required.

The bnodeid() notation is [[Create a new Blank Node Identifier Event.]]
  -- http://www.w3.org/2001/sw/RDFCore/TR/WD-rdf-syntax-grammar-20030117/#section-Identifiers

This links to 6.1.7 Identifiers
  http://www.w3.org/2001/sw/RDFCore/TR/WD-rdf-syntax-grammar-20030117/#section-Identifiers
  and onwards to the definition of blank node identifiers in 5.2, which says:

  [[ This can be implemented by any method that preserves the distinct
  identity of all the blank nodes in the graph.]]
  -- http://www.w3.org/2001/sw/RDFCore/TR/WD-rdf-syntax-grammar-20030117/#section-Identifiers

So any other method of implementing bnodeid() that matches this
requirement is fine.  It could rewrite them to be _:foo and _:bar if
it so desires, as long as it preserves the distinctiveness of the
blank node identifiers and does not clash with any generated blank
node identifiers.
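
For illustration only, a sketch of such a rewriting bnodeid() in
Python.  The "b" and "g" label families and the class name are
assumptions; the point is just that each rdf:nodeID value gets one
stable label and generated identifiers come from a separate family:

```python
import itertools


class RelabellingMapper:
    """Hypothetical sketch: replace rdf:nodeID values with fresh labels."""

    def __init__(self):
        self._labels = {}  # rdf:nodeID value -> assigned label
        self._node_counter = itertools.count(1)
        self._gen_counter = itertools.count(1)

    def bnodeid(self, node_id):
        # The first sighting of a value allocates a new "b" label; later
        # sightings reuse it, so the mapping stays one-to-one.
        if node_id not in self._labels:
            self._labels[node_id] = f"b{next(self._node_counter)}"
        return self._labels[node_id]

    def generate_blank_node_id(self):
        # "g" labels can never equal "b" labels, so no clash is possible.
        return f"g{next(self._gen_counter)}"


mapper = RelabellingMapper()
triple = (mapper.bnodeid("HI"), "ex:foo", mapper.bnodeid("BYE"))
```

Here the HI/BYE example comes out as _:b1 <ex:foo> _:b2 rather than
_:HI <ex:foo> _:BYE, with the two nodes still distinct.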


> Therefore the wording in 5.2 ``One method would be to add a constant prefix
> to all the rdf:nodeID attribute values'' is not a potential solution to the
> blank node identifier clashing problem.

Yes it is.  Quoting the entire sentence you selected from:

  [[One method would be to add a constant prefix to all the
  rdf:nodeID attribute values and ensure no generated blank node
  identifiers ever used that prefix.]]
  -- http://www.w3.org/2001/sw/RDFCore/TR/WD-rdf-syntax-grammar-20030117/#section-Identifiers

So it is a potential method; it adds different prefixes to both sets
of identifiers to ensure they don't clash.
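
A minimal sketch of the prefix method in Python, assuming "id_" as
the constant prefix for rdf:nodeID values and "genid"+number for
generated identifiers (both choices are illustrative, as are the
function names):

```python
import itertools

NODE_ID_PREFIX = "id_"      # assumed constant prefix for rdf:nodeID values
GENERATED_PREFIX = "genid"  # assumed prefix for generated identifiers

_counter = itertools.count(1)


def bnodeid(node_id):
    # Every identifier taken from rdf:nodeID starts with "id_".
    return NODE_ID_PREFIX + node_id


def generate_blank_node_id():
    # Generated identifiers start with "genid", never with "id_", so
    # the two sets cannot overlap whatever rdf:nodeID values appear.
    return f"{GENERATED_PREFIX}{next(_counter)}"
```

Even an awkward value such as rdf:nodeID="genid1" becomes
"id_genid1", which no generated identifier can equal.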


I offered this in my previous reply to you:
[[
bnodeid() and generate-blank-node-id() allow
the identifiers to be modified by referring to 5.2.  The subject is
always generated by using the string-value accessor of the
appropriate event, which can perform any modification that 5.2
allows.

I could add something further to 6.1.7 to indicate that the
string-value may be further modified.
]]
-- http://lists.w3.org/Archives/Public/www-rdf-comments/2003JanMar/0018.html

to see if that would improve things but you didn't pick up on it.

Suggestions for new words in some of these sections would be very helpful.

At present I don't see this as a critical comment for last call but
the wording looks like it can be improved.

Dave
Received on Tuesday, 14 January 2003 11:42:36 UTC