Re: Proposal for abstract syntax representation of inline literals (was Re: weekly call for agenda items) from Patrick Stickler on 2002-09-12 (w3c-rdfcore-wg@w3.org from September 2002)

From: Patrick Stickler <patrick.stickler@nokia.com>
Date: Thu, 12 Sep 2002 21:29:01 +0300
To: "ext Jeremy Carroll" <jjc@hplb.hpl.hp.com>, "RDF Core" <w3c-rdfcore-wg@w3.org>, "ext Brian McBride" <bwm@hplb.hpl.hp.com>
Message-ID: <000e01c25a8a$44daeca0$9782720a@NOE.Nokia.com>
[Patrick Stickler, Nokia/Finland, (+358 50) 483 9453, patrick.stickler@nokia.com]


----- Original Message ----- 
From: "ext Brian McBride" <bwm@hplb.hpl.hp.com>
To: "Patrick Stickler" <patrick.stickler@nokia.com>; "ext Jeremy Carroll" <jjc@hplb.hpl.hp.com>; "RDF Core" <w3c-rdfcore-wg@w3.org>
Sent: 12 September, 2002 18:12
Subject: Re: Proposal for abstract syntax representation of inline literals (was Re: weekly call for agenda items)


> At 16:56 12/09/2002 +0300, Patrick Stickler wrote:
> 
> [...]
> 
> 
> >What about round-tripping? Does the application then just choose
> >any suitable datatype and lexical representation as it likes,
> >rather than the original pair it recieved? What if the RDF/XML
> >is being returned to the system it originated from, and a
> >datatype that is not recognized or supported by the originating
> >system is used?
> 
> This is a good point Patrick and seems like a pretty general question, 
> independent of our choice of abstract syntax.  If an application is 
> serializing a graph for transmission to another application, how does it 
> choose what datatype representation to use.  There are two assumptions in 
> Patrick's statement here that I'm not sure are valid:

Actually, I make neither of these assumptions... I guess I was (as usual)
woefully unclear...

>    o  that all information in a model will have been de-serialized from 
> some serialization with specific datatypes used

If the information hasn't come from RDF/XML, then it's not round-tripping,
is it?

If the values originate within the given system, then serializing
them will be the first time they are expressed in RDF/XML and
hence with a datatype and lexical form.

But this is missing the point I think. The point is that, although
a given application is free to do whatever it wants to the RDF
knowledge it recieves, including mapping terms used in the
original expression to other terms or representations, the 
abstract syntax itself should respect, preserve, and reflect
the original language of expression.

Having datatype+lexicalform labels mapped to actual values
in the abstract syntax (a) fails to reflect the original
language of expression, (b) presumes that round-tripping
is not important (by discarding the information necessary
to accomplish it).

Whether or not a given application does these things is
irrelevant. It might. It might not. If an application discards
the original terms of expression and thus cannot provide
round-tripping, well, cest la vie. But the abstract syntax
should not *encourage* such behavior from applications.

A given application need not operate based on the original language
of expression. It may choose to augment it in favor of interned
values. But the abstract graph, the *standard* model of 
representation for that knowledge, which that application
is intended to capture and operate on, should not casually
discard the original language terms if it ever intends to 
re-serialize them and return them to their source.

>    o  that an application receiving information represented as rdf/xml will 
> understand the same set of datatypes as the one that provided the 
> information in the first place.

I do not assume this. No more so than I assume that
all RDF applications will understand all terms of all vocabularies
used to express RDF statements.

We really need to be careful to keep separate the abstract graph
from the application specific representation, which I feel is
getting all mixed up here.

ASSERTION:  The abstract graph should capture the statements made 
            in the RDF/XML, in the original language of those 
            statements.

Whether some application, based on special knowledge, chooses
to equate various terms and merge nodes or do other modifications
to its own internal representation accordingly is completely
irrelevant to our specification of the abstract graph. An application
may very well choose to transpose the scheme and domain name
prefixes of all http: URIs so as to merge cases such as <HTTP://FOO.COM...>
and <http://foo.com...> into a single node, but the abstract
graph does nothing of the sort. Nor should the abstract graph
itself reflect any kind of semantics-based merging of e.g.
<xsd:integer>"10" and <xsd:integer>"010" etc. even *if* some
particular application chooses to do so in its own internal
representation.

Again, the abstract graph should capture the statements as
they were expressed, and not merge or modify those expressions
based on extra-RDF semantics (and merging equivalent datatype
values *is* based on extra-RDF semantics).

> We could help to make the second of these true with some strong words of 
> encouragement to use particular datatypes, e.g. xml/schema.
> 
> [[When serializing an RDF graph, the use of xml/schema datatypes is 
> recommended for representing datatype values where the datatype 
> capabilities of the receiving system are not known.]]
> 
> How would folks feel about such a strong endorsement of schema datatypes?
> 
> [...]

Sorry to seem to be nothing but a naysayer, but I would
not be comfortable with any such recommendation.

I believe that the language of original expression should
be respected and preserved by the abstract graph, and
unless applications have a very good reason to do otherwise,
should also be respected and preserved on re-serialization.

> 
> 
> >I believe that having values as labels in the abstact graph will
> >introduce portability and interoperability problems between
> >applications as well as misunderstandings and conflict between
> >application developers insofar as different applications/platforms
> >are able to natively interpret and represent different sets of
> >datatyped literals.
> 
> Would you like an action to analyze this issue and document those problems 
> in more concrete terms.

Is that a ploy to shut me up ;-)

If a significant body of the WG feels that the proposal to label 
typed literal nodes with actual values has sufficient merit and should be
seriously considered, then I would would be prepared to do this.

As to what you mean by "concrete terms", nothing I would present
would be anything but hypothetical.

Patrick
Received on Thursday, 12 September 2002 14:29:05 UTC