Re: Issue rdfms-not-id-and-resource-attr from Jeremy Carroll on 2001-10-19 (w3c-rdfcore-wg@w3.org from October 2001)

From: Jeremy Carroll <jjc@hplb.hpl.hp.com>
Date: Fri, 19 Oct 2001 01:33:00 +0100
To: w3c-rdfcore-wg@w3.org
Message-ID: <3BCF74BC.E5083721@hplb.hpl.hp.com>
Jeremy:
> 1: The current spec is self contradictory.
> 2: The 'common interpretation' (by only a few members of the community) of this differs from any plausible reading of the spec.

Here is a (biased) guide to the current spec:

In section 6, where the style of the spec is precise and rigorous, as
opposed to the chatty introductory materal earlier

Para 214:
[[[
Within propertyElt (production [6.12]), the URI used in a resource
attribute identifies (after resolution) the resource that is the object
of the statement (i.e., the value of this property). The value of the ID
attribute, if specified, is the identifier for the resource that
represents the reification of the statement.
]]]

There is no indication that this is restricted to only some of the
expansions of production 6.12. 

Significantly later, after a number of unrelated issues have been
discussed (namespaces, unicode, string equality, xml:lang), we arrive at
para 229

[[[
Properties and values expressed in XML attribute form within an empty
XML element E by productions [6.10] and [6.12] are equivalent to the
same properties and values expressed as XML content of a single
Description element D which would become the content of E. The referent
of D is the value of the property identified by the XML element name of
E according to productions [6.17], [6.2], and [6.3]. Specifically; each
propertyElt start tag containing attribute specifications other than ID,
resource, bagID, xml:lang, or any attribute starting with the characters
xmlns results in the creation of the triples {p,r1,r2}, {pa1,r2,va1},
..., {pan,r2,van} where
]]]
  1. [...snip...]
  2. [...snip...]
[[[ para 232
  3. r2 is the resource named by the resource attribute if present or a
new resource. If the ID attribute is given it is the identifier of this
new resource. 
]]]
  4. [...snip...]
  5. [...snip...]




The 'common interpretation' goes
+ para 232 and para 214 contradict one another, 
+ para 232 specifically applies when the condition specified in para 229
is true
+ thus para 229 implies that para 214 doesn't hold in certain cases.


This interpretation of self-contradictory text is appropriate reasoning
when reading introductory material, when the technique of leaving some
complexity to later is often found; it is also appropriate when the
proximity of the paragraphs and the overall textual structure makes it
clear that an "if then else" structure is intended. Neither of these is
present here, and we are faced with a contradiction, a mistake, an
erratum. It is the job of this WG to fix errata.

The 'common interpretation' also links in with production 6.12. viz:


 [6.12] propertyElt    ::= '<' propName idAttr? '>' value '</' propName
'>'
                         | '<' propName idAttr? parseLiteral '>'
                               literal '</' propName '>'
                         | '<' propName idAttr? parseResource '>'
                               propertyElt* '</' propName '>'
                         | '<' propName idRefAttr? bagIdAttr? propAttr*
'/>'
 

The common interpretation is that para 214 picks out the final expansion
in this production, and in that case para 232 applies.

That is not what it says.

Consider the following rdf document:

<rdf:RDF xmlns:rdf="...">
  <rdf:Description rdf:about="http://example.org/">
    <rdf:value rdf:ID="id" rdf:bagID="BAG" />
  </rdf:Description>
</rdf:RDF>


With the common interpretation this becomes:

<#BAG> <rdf:type> <rdf:Bag> .
<http://example.org/> <rdf:value> <#id> .

The text of para 229 does not say the fourth expansion of 6.12 but it
says "each propertyElt start tag containing attribute specifications
other than ID, resource, bagID, xml:lang" The rdf:value in this example
is not such a case and hence para 232 does not apply on any reading of
M&S. Therefore the only possible reading of this particular rdf:ID
attribute is that given by para 214, i.e. reification.


Thus M&S without doubt specifies the following triples for the example:

<#BAG> <rdf:type> <rdf:Bag> .
<http://example.org/> <rdf:value> _:a .
<#id> <rdf:type> <rdf:Statement> .
<#id> <rdf:subject> <http://example.org/> .
<#id> <rdf:predicate> <rdf:value> .
<#id> <rdf:object> _:a .

OK, I suspect I am being disingenuous here.

The following paragaph, the last in section 6 is 
[[[
The value of the bagID attribute, if specified, is the identifier for
the Bag corresponding to the Description D; else the Bag is anonymous.
]]]

I don't believe there are any implementations of this last phrase
either, the writers of this spec had run out of steam by this point, and
were making mistakes. The text in this part of the spec does not have a
meaning, and we should not give it a false loyalty.



Earlier Dave articulated two options:

Dave:
> 1) rdf:ID is always allowed and reifys the 1 statement
>    <parent node URI> <propertyElt URI> <rdf:resource URI> .

> 2) rdf:ID is never allowed.  If you want to reify that
>    statement, use the expanded form rather than this abbreviation.

Brian implicitly suggests two other options:

3) re-articulate what M&S says by taking para 232 which is guarded by
para 229 as taking precedence in that case over paragraph 214.

4) articualte, for the first time ever, the 'common interpretation' of
what M&S says i.e. using the fourth production of 6.12 as the guard for
text like para 232.


I point out that none of these is backwardly compatible.

All readings of the spec, the 'common interpretation' in parsers, and
any artiuclatable approach, all differ. Whichever of these four choices
we make we break 'backward compatibility' with something.

Fortunately there are no users of this construction. (I wonder why?)
Thus the backward compatibility argument is spurious, and we should ask
why there are no users of this construction.

Two reasons spring to my mind:
+ maybe reification is less useful than was thought
+ the lack of clarity about how to use reification effectively, and
overly confusing syntactic rules act as obstacles to users,
implementators, document writers, documentation writers at all levels.



Brian said:
> We should resist the temptation to fix things that we don't like but are not
> really broken.  
We don't have a choice but to do something along the lines of (1), (2)
(3) or (4) and the argument is not about not fixing things, but about
what does the least damage to M&S and rearticulates M&S most faithfully.
I believe that (1) is that choice.

Jeremy
Received on Thursday, 18 October 2001 20:28:27 UTC