Empty elements in XML and RDF syntax

Brian, thanks for adding the empty property issue to the tracking document.
After re-reading the M&S Specification I would like to present some more
thoughts on this matter.

In XML the expression <ns:element/> is considered to be an abbreviation of
<ns:element></ns:element>. In fact, most parser APIs do not distinguish
between the two forms, and an application has no means to know which one was
actually used in the XML text.
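
To illustrate the point, here is a minimal sketch using Python's standard xml.sax module (my choice for illustration, not something the specification prescribes). The EventRecorder class, the sax_events helper and the namespace "urn:example:ns" are hypothetical names introduced only for this example. It records the events an application would receive for both forms so they can be compared:

```python
import xml.sax


class EventRecorder(xml.sax.ContentHandler):
    """Collects the SAX events an application would see (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(("start", name))

    def endElement(self, name):
        self.events.append(("end", name))

    def characters(self, content):
        self.events.append(("chars", content))


def sax_events(xml_text):
    """Parse xml_text and return the list of recorded events."""
    recorder = EventRecorder()
    xml.sax.parseString(xml_text.encode("utf-8"), recorder)
    return recorder.events


# "urn:example:ns" is a placeholder namespace, assumed for illustration only.
abbreviated = sax_events('<ns:element xmlns:ns="urn:example:ns"/>')
expanded = sax_events('<ns:element xmlns:ns="urn:example:ns"></ns:element>')

# Both forms produce the same event sequence; the spelling is not visible.
print(abbreviated == expanded)
```

Neither event list carries any trace of which spelling was used, which is why an RDF parser built on such an API cannot treat the two forms differently even if the grammar wanted it to.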

The RDF serialization syntax does not acknowledge this, so expressions like
<rdf:RDF/> and <rdf:li/> are illegal RDF although they are legal XML. This
mismatch becomes a problem when RDF is embedded in XML documents.

As discussed earlier, this also affects the triple generation of an RDF parser.
In particular, it is unclear whether the syntactically legal ns:element in

<rdf:Description rdf:about="...">
    <ns:element/>
</rdf:Description>

is supposed to represent an empty literal or an empty anonymous resource. This
issue may lead to interoperability problems between different RDF parser
implementations.

The XML embeddability issue can easily be resolved by an agreement:
Consider <rdf:RDF/> to be equivalent to <rdf:RDF></rdf:RDF>, which matches
RDF production 6.1 and means an empty model.
Consider <rdf:li/> to be equivalent to <rdf:li></rdf:li>, which matches
RDF production 6.30.1 with an empty "value" (more on this below).

With propertyElt this is more difficult, as the expression
<ns:element></ns:element> (assuming an empty "value") matches production
6.12.1  '<' propName idAttr? '>' value '</' propName '>'
while <ns:element/> matches production
6.12.4  '<' propName idRefAttr? bagIdAttr? propAttr* '/>'
so that the RDF interpretation differs from the XML interpretation here.

The problem lies with production 6.12.4, which serves both for abbreviation
and referencing. To resolve this we would need to split it into productions
with distinct purposes, such as
6.12.4  '<' propName idAttr? bagIdAttr? propAttr+ '/>'   (abbreviation)
6.12.5  '<' propName resourceAttr '/>'                   (referencing)
which would be similar to the productions for container members 6.28 to 6.30.
Then we can use the XML interpretation and match <ns:element/> to 6.12.1
(maybe 6.12.1 needs an optional bagIdAttr too ...?)
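
The proposed split could be sketched roughly as follows. This is only an illustration under several assumptions: it uses Python's xml.etree.ElementTree, it handles empty property elements only, it expects namespace-qualified rdf: attributes, it ignores the parseType productions, and it assumes that 6.12.1 is extended with an optional bagIdAttr as wondered above. The function classify_empty_property, the parse helper and the namespace "urn:example:ns" are hypothetical.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDF_ID = "{%s}ID" % RDF_NS
RDF_BAG_ID = "{%s}bagID" % RDF_NS
RDF_RESOURCE = "{%s}resource" % RDF_NS


def classify_empty_property(elem):
    """Return the proposed production number for an empty property element."""
    if len(elem) or (elem.text or "").strip():
        raise ValueError("only empty property elements are handled here")
    attrs = set(elem.attrib)
    if RDF_RESOURCE in attrs:
        return "6.12.5"  # referencing
    if attrs - {RDF_ID, RDF_BAG_ID}:
        return "6.12.4"  # abbreviation: at least one propAttr is present
    # No propAttr and no resourceAttr: XML interpretation, empty "value".
    # Assumption: 6.12.1 also tolerates an optional bagIdAttr.
    return "6.12.1"


def parse(fragment):
    """Wrap a fragment so the rdf: and ns: prefixes are declared (hypothetical)."""
    wrapper = '<w xmlns:rdf="%s" xmlns:ns="urn:example:ns">%s</w>' % (RDF_NS, fragment)
    return ET.fromstring(wrapper)[0]


print(classify_empty_property(parse("<ns:element/>")))
print(classify_empty_property(parse("<ns:element></ns:element>")))
print(classify_empty_property(parse('<ns:element rdf:resource="urn:example:r"/>')))
print(classify_empty_property(parse('<ns:element ns:title="x"/>')))
```

With distinct productions the decision depends only on the attributes present, never on whether the empty-element abbreviation was used.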

This reduces the problem to the issue of interpreting an empty byte sequence
that has to match 6.17 "value". It can be either
6.3.1/2 "description" resulting in an empty anonymous resource, or 
6.24    "string"      resulting in an empty literal.

For this discussion I assume that there are good reasons why both empty
literals and empty anonymous resources may exist in a model. Whether this makes
sense or not is a model issue, so it should be discussed "somewhere else" and
I will say "nothing" more here. :-)

In 6.12.1 the choice is clear when the idAttr is given: the object is then
clearly an inline resource. But without rdf:ID the choice remains open, and
the same holds for 6.30.1.

Unless we want to treat an empty literal as *equal* to an empty anonymous
resource, we need an agreement on what a parser has to generate in this case.

I prefer an empty literal here, as it cannot be expressed otherwise, while an
empty anonymous resource can always be stated as <rdf:Description/>.
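
As a rough sketch of what this preference would mean for triple generation, the following Python fragment (again using xml.etree.ElementTree) emits an empty literal for every attribute-less empty property element. It is an illustration of the suggested agreement, not an existing parser's behaviour. The function empty_property_triples, the ("literal", "") tuple representation, and the URIs "urn:example:subject" and "urn:example:ns#" are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Placeholder subject and namespace URIs, assumed for illustration only.
DOC = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:ns="urn:example:ns#">
  <rdf:Description rdf:about="urn:example:subject">
    <ns:element/>
  </rdf:Description>
</rdf:RDF>"""


def empty_property_triples(rdf_xml):
    """Return (subject, predicate, object) triples for empty property elements."""
    root = ET.fromstring(rdf_xml)
    triples = []
    for desc in root.iter("{%s}Description" % RDF_NS):
        subject = desc.get("{%s}about" % RDF_NS)
        for prop in desc:
            is_empty = len(prop) == 0 and not (prop.text or "").strip()
            if is_empty and not prop.attrib:
                # Predicate URI = namespace URI + local name.
                predicate = prop.tag.replace("{", "").replace("}", "")
                # Suggested agreement: the object is an empty literal.
                triples.append((subject, predicate, ("literal", "")))
    return triples


for triple in empty_property_triples(DOC):
    print(triple)
```

An author who wants an empty anonymous resource instead would write <rdf:Description/> inside the property element, which this sketch would not treat as an empty property.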

Regards, Karsten

Received on Saturday, 24 March 2001 07:50:08 UTC