Re: RDF/XML Syntax problems with datatyping literals from Dave Beckett on 2002-09-02 (w3c-rdfcore-wg@w3.org from September 2002)

From: Dave Beckett <dave.beckett@bristol.ac.uk>
Date: Mon, 02 Sep 2002 14:41:49 +0100
To: "Patrick.Stickler" <Patrick.Stickler@nokia.com>
cc: w3c-rdfcore-wg <w3c-rdfcore-wg@w3.org>
Message-ID: <13174.1030974109@hoth.ilrt.bris.ac.uk>
<snip/>ing lots to reduce the size of this message

>>>"Patrick.Stickler" said:
> Dave said:
> ...
> > and forms 1),2),3) give the same triples:
> Agreed.
> ...
> Ahh, now I see where we are missing each other.
>
> I'm not proposing that. I would disallow the above. I fully agree
> that to do the above would be confusing.
> 
> A null string is not a valid lexical form. You cannot produce a
> typed literal node without some non-null lexical form.

I don't see why this is a special case.  I particularly hate to add
more special cases to the syntax.

> The RDF/XML
> 
>    <ex:prop rdf:type="http://example.org/datatype1"></ex:prop>
> 
> would still produce
> 
>     _:a <http://example.org/ns#prop> _:b .
>     _:b <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://example.org/datatype1> .
> 
> as it should.
> 
> But, the RDF/XML
> 
>    <ex:prop rdf:type="http://example.org/datatype1">xyz</ex:prop>
> 
> would produce
> 
>     _:a <http://example.org/ns#prop> <http://example.org/datatype1>"xyz" .

That shrieks of another special case to me, and the empty property
element case is already too complex.
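
To make the objection concrete, here is a minimal sketch of the branch a
parser would need under the quoted proposal. The function name and the
new_bnode callback are hypothetical, invented for illustration; the
<datatype>"lexical form" notation is the one used in this thread, not
standard N-Triples.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def property_element_triples(subject, prop_uri, type_uri, content, new_bnode):
    """Triples for <prop rdf:type="type_uri">content</prop> under the quoted proposal.

    Hypothetical helper: it only mirrors the two examples quoted above.
    """
    if content == "":
        # Empty content: rdf:type keeps its usual meaning and types a fresh blank node.
        obj = new_bnode()
        return [
            f"{subject} <{prop_uri}> {obj} .",
            f"{obj} <{RDF_TYPE}> <{type_uri}> .",
        ]
    # Non-empty content: the same attribute now produces a typed literal instead.
    return [f'{subject} <{prop_uri}> <{type_uri}>"{content}" .']
```

The point of the sketch is the `if`: the meaning of the attribute depends
on whether the element happens to have character content.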


> I see the production of typed literals from the above to be a two
> step process -- conceptually (though the parser of course would likely
> skip the first step in practice):
> 
> Input:
> 
>    <ex:prop rdf:type="http://example.org/datatype1">xyz</ex:prop>
> 
> Step 1, rdf:type assertion:
> 
>    _:a <http://example.org/ns#prop> _:b"xyz" .
>    _:b"xyz" <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://example.org/datatype1> .
> 
> Step 2, typed literal node compression:
> 
>     _:a <http://example.org/ns#prop> <http://example.org/datatype1>"xyz" .
> 
> This second step is only required because literals can't be subjects.

> *And* this second step only occurs *if* and *only if* there is a literal
> (not just a bnode).
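
The two conceptual steps quoted above can be sketched roughly as follows.
This is an illustration only: the Node type, the compress function and
the tuple representation of triples are assumptions made for the sketch,
and the <datatype>"lexical form" output notation is the one used in this
thread.

```python
from typing import NamedTuple, Optional

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


class Node(NamedTuple):
    """Hypothetical step-1 node: a blank node label plus an optional lexical form."""
    label: str
    lexical: Optional[str] = None  # None means a plain blank node


def compress(triples):
    """Step 2: fold rdf:type on literal-carrying nodes into typed literals."""
    # Collect datatype URIs asserted on nodes that carry a lexical form.
    datatypes = {
        s: o
        for (s, p, o) in triples
        if p == RDF_TYPE and isinstance(s, Node) and s.lexical is not None
    }
    result = []
    for s, p, o in triples:
        if p == RDF_TYPE and s in datatypes:
            continue  # this rdf:type triple is absorbed into the typed literal
        if o in datatypes:
            o = f'<{datatypes[o]}>"{o.lexical}"'
        result.append((s, p, o))
    return result
```

With a plain blank node (no lexical form) the triples pass through
unchanged, which matches the "if and only if there is a literal"
condition in the quoted text.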

<snip/>

> But, and this is the essential point, whether there is the typed literal
> node compression or not, the semantics of both the RDF/XML and the graph
> representations are *identical* in all of the above uses of rdf:type.

I'm looking at the triples generated and they are different, so as
far as users are concerned, they are different semantics.  If a
datatyping model theory makes some other entailments from these
triples, that is another question, unrelated to the syntax.

In essence, this introduces two ways to write datatyped literals, and
we don't need yet more abbreviations in RDF/XML!


> Does the above clarification help in any way?

Well it explains further ways that this is confusing :)

<snip/>

> No matter what we decide to do, parsers will *have* to change
> to support it.

And rdf:lType is a 5-minute change; I have already tried it.  Adding
any new semantics to existing attributes, especially ones that
already have special cases, is going to take a lot longer.
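
For comparison, a minimal sketch of why a dedicated attribute is a small
change: it touches only the construction of the literal object and needs
no branch on rdf:type or on empty content. The function name and the
attribute dictionary are assumptions for illustration, rdf:lType is only
the placeholder name used in this thread, and treating empty content as
an empty lexical form is likewise an assumption of the sketch.

```python
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
LTYPE = RDF_NS + "lType"  # placeholder attribute name from this thread, not a standard term


def literal_object(attributes, content):
    """Build the object of a literal property element (hypothetical helper).

    `attributes` maps expanded attribute URIs to values.
    """
    datatype = attributes.get(LTYPE)
    if datatype is None:
        return f'"{content}"'  # plain literal, as before
    # Typed literal in the <datatype>"lexical form" notation used in this thread.
    return f'<{datatype}>"{content}"'
```

Because the attribute has no other meaning, empty content simply yields
an empty lexical form rather than a differently shaped graph.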

<snip/>

> > * It makes the most complex parts of the grammar, more complex again:
> >     http://www.w3.org/TR/rdf-syntax-grammar/#emptyPropertyElt
> >     http://www.w3.org/TR/rdf-syntax-grammar/#propertyElt
> >
> > * It makes the grammar continue to be less context-free, you need to
> >   do even more calculations and token-lookahead to determine what is
> >   the correct grammar term to match (in propertyElt)
> 
> I still don't see how this would not also have to be done for any
> attribute whatsoever, whatever it is called, which must trigger a
> typed literal node production.

No, it just affects the parts of the literal object of the statement
as I explained later (cut out here for brevity).

<snip/>

> Fair enough, but in the latter case, the part of the literal
> structure that is being added is its *rdf:type*.

That remains to be seen.  I'd rather not call it the 'rdf:type' but
some other term so that we are clear when we are talking about the
type URI part of a datatyped literal or the property rdf:type.

I think Jan is already actioned to propose changes to the literal
substructure.

<snip/>

> So, you are proposing a new term rdf:ltype?

I was using it as a placeholder pending any better suggestions; I have
no particular love for that term.  rdf:lexicalForm might be better.  Too
long, maybe?  Perhaps the primer editors have user-friendly suggestions?

> I still strongly feel that the introduction of a new term is avoidable
> and that the concerns about using rdf:type previously voiced were based
> on a misunderstanding about the treatment of empty data content taken
> as a null lexical form.

It is avoidable but adding a new term helps a lot here, I feel, from
the point of explaining it as well as implementing it.

<snip/>

> Dave, given the clarifications above about null literals, would you
> actually find it overly burdensome to support the use of rdf:type
> rather than some other term?

Yes, and I think I have Jeremy's support in this.


My summary:

rdf:type
* attribute ambiguous for empty lexical forms OR adds a special case
* increases grammar complexity in most complex part
* hard or tricky to implement
* adds two ways to do datatyped literals, one more than necessary
* difficult to explain when rdf:type sometimes means the type URI
  part of a literal and other times, a property of a resource.

rdf:lType or any other new attribute:
* easy to implement
* new attributes are easier for older apps to recognise and ignore
* not ambiguous
* easy to explain - it sets the datatype-URI of a datatyped literal

Dave
Received on Monday, 2 September 2002 09:44:59 UTC