abstract syntax representation of inline literals

Well it's half past three in the morning, and I can't sleep, and Patrick's 
wrong !
I blame the macchiato that I drank yesterday afternoon.

Contents:
A: the choice of lexical form of datatype value is a *comment*
B: We really have a closed set of datatypes
C: infinite variability in representation should be syntactic
D: URI case analgoy is specious
E: round-tripping argument is specious

A: the choice of lexical form of datatype value is a *comment*

We all agree that 

<rdf:Description>
  <eg:prop>2</eg:prop>
</rdf:Description>

and

<rdf:Description>
<!-- I like leading zeros. -->
  <eg:prop>2</eg:prop>
</rdf:Description>

are the same.

There is no, a priori, reason why we should not see the choice of lexical 
representation of a data value as similarly only a superficial irrelevance, 
and see graph equality test cases as holding (with say
<xsd:int>"2" == <xsd:int>"02")

B: We really have a closed set of datatypes

XSD specifies a closed set of 19 base types, all others, including user 
defined types, derive from these. The only values you can have for a user 
define type is one of the base types (or a list thereof: at most another 19 
new different types). Derived types are subsets of these (at most 38) types.
If I choose to represent an xsd:decimal "2" as an xsd:int "2" I have only made 
a comment that can be automatically verified. There is nothing that is worth 
preserving.

C: Infinite variability

In XSD there are an infinite number of ways of writing the number 2.
Some of these cosnsit of leading and trailing zeros. Others consist of 
defining a new type that directly or indirectly derives from xsd:decimal.
In RDF/XML we already have infinite variability in the choice of how to 
serialize a graph (e.g. whitespace and XML comments).
However the model theory is finite in style, and is most easily understood by 
adding triples using closure rules.

Patricks position is that the infinite set of representations of 
<xsd:decimal>"2" all are interpreted as the number 2. This means that any 
graph involving one of these will entail an infinite number of other graphs.
We would also need a closure rule in the MT of the form:

If
aaa ppp <ddd>"lll" .
is in the graph,
and
<ddd>"lll" maps to the same value as <DDD>"LLL" under xsd rules then add
aaa ppp <DDD>"LLL" .
to the graph.
RDF closure would be transformed from a fairly easy computation to a merely 
theoretical device. Fine for OWL (where we do have to worry about infinity), 
unnecessary and a mistake for RDF.

D: URI case analgoy is specious
Patrick said:
> If we are going to do this, then let's be sure that
> http://foo.com/blarg and http://FOO.COM/blarg are
> both mapped to the same URIref node too, eh?
Nowhere in RDF do we suggest any relationship at all between these two URIs. 
(Other than they will retrieve the same document - which is implicit in our 
specs)
However, any account of datatyping does say that the datatypevalue nodes in 
the graph are interpreted as the values from the value space in the model 
theory. Moreover, we know that we use the XSD rules to work out which values.
Thus, two way entailment between the graphs in the test case is at least 
implicit in Part I.
Thus <xsd:int>"2" and <xsd:decimal>"02.0" are much more closely related in our 
specs than the two URIs above.


E: round-tripping argument is specious
We already choose that a lot of things are irrelevant to round tripping (e.g. 
whitespace, order, xml comments, use of which syntactic rules).
We are free to define another thing that is not included in round tripping.

Jeremy

Received on Thursday, 12 September 2002 22:09:41 UTC