URIS for Literals (was: Re: referendum on httpRange-14 (was RE: "information resource"))

Chris Lilley writes:

> With the proviso that I would prefer
> 
> data:text/plain;charset="utf-8",some%20percent%20escaped%20literal%20value

With the apology that I am not an RDF expert, I think this raises the 
question of whether literals are typed, or stated differently, whether a 
possible need for such typed literals justifies the headache of creating 
and maintaining the corresponding URIs. 

I think the above is a plausible way of carrying a literal which is a 
sequence of Unicode chars.  I wonder whether there is any need to have a 
URI that represents the member of the type xsd:integer that has the 
numeric value 10, for example?   You then have to ask whether you are 
giving a name to a specific lexical form, to just the value in the 
abstract (from which you can for most types infer a set of possible 
lexical forms, one of which may be canonical), or to the combination of 
all that put together. 
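
For concreteness, here is a minimal Python sketch of how a data: URI of 
the form quoted above could be built from a Unicode string.  The function 
name text_data_uri is made up for illustration; only the standard 
library's urllib.parse.quote is used.

```python
from urllib.parse import quote


def text_data_uri(literal: str) -> str:
    """Wrap a Unicode string in a text/plain data: URI (hypothetical helper)."""
    # quote() percent-escapes the UTF-8 bytes of the string; safe="" means
    # reserved characters such as "/" and "," are escaped as well.
    return 'data:text/plain;charset="utf-8",' + quote(literal, safe="")


print(text_data_uri("some percent escaped literal value"))
```

Note that nothing in such a URI says what type the literal has; it names 
a character sequence and nothing more.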

Background on Schema Simple Types
---------------------------------

As an informal tutorial for those not familiar with XML Schema types, the 
Recommendation [1] tells us roughly the following about the xsd:integer 
12:

* Value in value space:  the abstract integer that is >11 and <13.

* Legal lexical representations include (these are Unicode char 
sequences):  '12', '012', '0012', '00012'

* Canonical lexical representation:  '12'  (note that canonicals can be 
inferred from the value...if you know one you know the other)
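
The three bullets above can be sketched in a few lines of Python.  This is 
only an illustration under simplifying assumptions: the regular expression 
is an approximation of the xsd:integer lexical space (optional sign, then 
decimal digits), and the function names are invented here.

```python
import re

# Approximate xsd:integer lexical space: optional sign followed by digits.
_INTEGER_LEXICAL = re.compile(r"[+-]?[0-9]+")


def integer_value(lexical: str) -> int:
    """Map a lexical form (a character sequence) to a value in the value space."""
    if not _INTEGER_LEXICAL.fullmatch(lexical):
        raise ValueError(f"not an integer lexical form: {lexical!r}")
    return int(lexical)


def canonical_integer(value: int) -> str:
    """Map a value back to its canonical lexical form (no sign, no leading zeros)."""
    return str(value)


# Several lexical forms, one value, one canonical form.
forms = ["12", "012", "0012", "00012"]
values = {integer_value(f) for f in forms}
print(values)
print({canonical_integer(v) for v in values})
```

All four lexical forms collapse to the single value 12, whose canonical 
form is '12'.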

As it happens, you also know from the type name that the integer values 
are a subset of the values found in the value space of the xsd:decimal 
type.   That type has the number 12 as well as decimal numbers such as 
12.4.  Note that the Recommendation makes clear that, because xsd:integer 
is a restriction of xsd:decimal, the value space of integer is a subset of 
the value space of decimal;  any check that implicates identity rules for 
members of the type should find an integer 12 identical to a decimal 12. 
Conversely, the values denoted by the xsd:integer lexical form "12"  and 
the xsd:string lexical form "12" are by definition different.  Neither 
type derives from the other.
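
As a rough analogy (not a conforming implementation of the Schema identity 
rules), Python's own numeric types behave the same way: an int and a 
Decimal with the same numeric value compare equal, while neither is equal 
to a string.  The helper typed_value and its type names are assumptions 
made for this sketch.

```python
from decimal import Decimal


def typed_value(type_name: str, lexical: str):
    """Map a (type, lexical form) pair to a value (hypothetical helper)."""
    if type_name == "integer":
        return int(lexical)
    if type_name == "decimal":
        return Decimal(lexical)
    if type_name == "string":
        return lexical
    raise ValueError(f"unsupported type: {type_name}")


# integer 12 and decimal 12 are the same value.
print(typed_value("integer", "12") == typed_value("decimal", "12"))
# integer 12 and string "12" are different values.
print(typed_value("integer", "12") == typed_value("string", "12"))
```

The first comparison is true and the second is false, mirroring the 
integer/decimal versus integer/string cases described above.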

URIs for Simple Types?
----------------------

Without trying to get into the nuances of exactly how to structure these 
URIs, I've long wondered whether RDF would benefit from a suite of URIs 
along the lines of:

        http://www.w3.org/2004/SchemaSimpleTypes/Integer/value/12

You might or might not also want:

        http://www.w3.org/2004/SchemaSimpleTypes/Integer/lexical/012

If you don't want to bother with URIs for lexical forms, then the first of 
these might be more simply:

        http://www.w3.org/2004/SchemaSimpleTypes/Integer/12

Again, my point is not to propose particular forms for the URIs, but to 
raise the question of whether there should be URIs for typed values and/or 
typed lexical forms.  As a non-RDF expert I was always surprised that RDF 
did not adopt an approach like this, perhaps with some syntactic sugar if 
needed to make serializations more convenient.
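
To make the distinction between the two kinds of URI concrete, here is a 
small Python sketch that mints them.  The base URI is the illustrative one 
used above, not a real W3C namespace, and the function names are 
hypothetical.

```python
from urllib.parse import quote

# Illustrative base only; no such namespace is actually defined.
BASE = "http://www.w3.org/2004/SchemaSimpleTypes"


def value_uri(type_name: str, value: int) -> str:
    """Name the abstract value, using its canonical form in the path."""
    return f"{BASE}/{type_name}/value/{quote(str(value), safe='')}"


def lexical_uri(type_name: str, lexical: str) -> str:
    """Name one specific lexical form, kept exactly as written."""
    return f"{BASE}/{type_name}/lexical/{quote(lexical, safe='')}"


# Two lexical forms get distinct lexical URIs but share one value URI.
for form in ["12", "012"]:
    print(lexical_uri("Integer", form), value_uri("Integer", int(form)))
```

The design choice here is that a value URI is always derived from the 
canonical form, so every lexical form of 12 leads to the same name.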

With the above in hand, you could make RDF statements like:

        http://www.w3.org/2004/SchemaSimpleTypes/Integer/value/12

is greater than:

        http://www.w3.org/2004/SchemaSimpleTypes/Integer/value/11

I don't think you'd want to make quite that same assertion about:

        data:text/plain;charset="utf-8",12

and 

        data:text/plain;charset="utf-8",11
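
A short Python sketch of why the two cases differ, assuming the 
illustrative URI layout above (the prefix and both helper names are 
hypothetical): the typed URIs resolve to integers that can be ordered 
numerically, while the data: URIs resolve only to character strings.

```python
from urllib.parse import unquote

# Illustrative prefix only; no such namespace is actually defined.
PREFIX = "http://www.w3.org/2004/SchemaSimpleTypes/Integer/value/"


def integer_from_value_uri(uri: str) -> int:
    """Recover the integer named by a typed value URI."""
    if not uri.startswith(PREFIX):
        raise ValueError(f"not an Integer value URI: {uri}")
    return int(unquote(uri[len(PREFIX):]))


def text_from_data_uri(uri: str) -> str:
    """Recover the character string carried by a text/plain data: URI."""
    header, _, payload = uri.partition(",")
    if not header.startswith("data:text/plain"):
        raise ValueError(f"not a text/plain data URI: {uri}")
    return unquote(payload)


# Typed values: numeric ordering is meaningful.
print(integer_from_value_uri(PREFIX + "12") > integer_from_value_uri(PREFIX + "11"))

# Untyped strings: comparison is character by character, so "9" sorts after "12".
nine = text_from_data_uri('data:text/plain;charset="utf-8",9')
twelve = text_from_data_uri('data:text/plain;charset="utf-8",12')
print(nine > twelve)
```

Both comparisons print True, but only the first one says anything about 
numbers.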


I'm not pushing this 'URI for typed literals' idea, except to suggest that 
it's worth exploring.

Noah

[1] http://www.w3.org/TR/2004/REC-xmlschema-2-20041028/
[2] http://www.w3.org/TR/2004/REC-xmlschema-2-20041028/#value-space
[3] http://www.w3.org/TR/2004/REC-xmlschema-2-20041028/#lexical-space

--------------------------------------
Noah Mendelsohn 
IBM Corporation
One Rogers Street
Cambridge, MA 02142
1-617-693-4036
--------------------------------------

Received on Friday, 29 October 2004 16:41:58 UTC