Re: Reject change to rdf:value from Pat Hayes on 2001-11-06 (w3c-rdfcore-wg@w3.org from November 2001)

From: Pat Hayes <phayes@ai.uwf.edu>
Date: Mon, 5 Nov 2001 19:41:16 -0600
To: Patrick.Stickler@nokia.com
Cc: w3c-rdfcore-wg@w3.org
Message-Id: <p0510102fb80ce8df9a68@[65.212.118.166]>
>>  Well, if that were indicated by a decimal then the
>>  string "10" would do it, but if it were represented by an octal then
>>  you need "12" and if you use a binary then you need "1010". There is
>>  no way to say what THE value of rdf:value is for any particular
>>  integer, until you specify what datatyping scheme is being used.
>
>No. Data type does not define lexical representation,

In my language it does.

>  and base is
>just another issue of representation, not an inherent quality of
>the value itself.

Yes, that is precisely my point above.

>I.e. (taking the lexical representations defined
>for the Scheme programming language simply to illustrate):
>
>   [ rdf:value "10";     rdf:type xsd:integer ].  (decimal)
>   [ rdf:value "#o10";   rdf:type xsd:integer ].  (octal)
>   [ rdf:value "#xA";    rdf:type xsd:integer ].  (hexadecimal)
>   [ rdf:value "#b1010"; rdf:type xsd:integer ].  (binary)
>
>In each case, the data type itself is the same, as is the value.
>Only the lexical representation changes, and in each case, the
>interpretation of that lexical representation within the context
>of that data type is explicit and reliable.

We seem to be at cross purposes. This entire discussion is about 
datatyping of literals in RDF. The motivating idea for me is XML 
datatyping, which as I understand it is defined in terms of a lexical 
space and a value space and a mapping between them. So for example 
xsd:integer is defined as a mapping from a lexical space of numerals 
into a value space of numbers, xsd:string as a mapping from strings 
to strings which I have been thinking of simply as the identity 
mapping (perhaps naively) and made-up examples such as xxd:date as 
mappings from calendar formats such as '10-10-01' into some space of 
days or time-intervals.  The idea being that a given literal may 
occur in several lexical spaces, and hence not have a determinate 
meaning until the particular datatype mapping is somehow connected 
with it; and the debate is about various proposals for how to use 
some form of RDF syntax to establish that association. (In my 
proposal, these mappings are treated much like the denotation mapping 
in the model theory. Other proposals make these mappings explicit as 
RDF properties in one way or another.) Do you agree with this 
summary so far?
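
To make the lexical-space/value-space picture concrete, here is a minimal Python sketch. It treats each datatype as nothing more than a function from strings to numbers. The names eg:decimal, eg:octal and eg:binary are made up for illustration and are not defined by XML Schema or RDF.

```python
# Hypothetical illustration: a datatype as a mapping from a lexical space
# (strings) to a value space (here, Python ints).
from typing import Callable, Dict

DatatypeMapping = Callable[[str], int]

# Made-up datatype names; none of these are defined by XML Schema or RDF.
MAPPINGS: Dict[str, DatatypeMapping] = {
    "eg:decimal": lambda lexical: int(lexical, 10),
    "eg:octal": lambda lexical: int(lexical, 8),
    "eg:binary": lambda lexical: int(lexical, 2),
}


def value_of(literal: str, datatype: str) -> int:
    """Return the value the literal denotes under the named mapping."""
    return MAPPINGS[datatype](literal)


# One literal that occurs in three lexical spaces and denotes three values.
same_literal = {name: value_of("10", name) for name in MAPPINGS}

# Three different literals that all denote the number ten.
same_value = [
    value_of("10", "eg:decimal"),
    value_of("12", "eg:octal"),
    value_of("1010", "eg:binary"),
]
```

In this sketch the literal "10" has no determinate value until a mapping is chosen, while the three mappings share one value space.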

One of our communication problems has been that the bare term 
'datatype' is used in a variety of senses (sometimes for the value 
domain, sometimes for the mapping, etc.), so perhaps I had better try 
to avoid it. I have used examples like octal, decimal and so on as 
illustrative examples only to emphasize that two different datatype 
mappings may share the same value space.

>Thus, while it is likely true that one cannot interpret a literal
>value without knowledge of its data type, and a given VALUE (as
>opposed to rdf:value) may have multiple lexical representations,
>the data type itself has little if anything to do with determining
>whether any two rdf:value's correspond to the same VALUE, apart
>from expecting them to belong to equivalent or related data types.

I find this incomprehensible. What do you mean by 'two rdf:value's'? 
If the datatype has little to do with determining the value of the 
literal, what are we all talking about?

>The issue of whether "10" "#o10" and "#xA" are the same "thing"

Please don't say things like that. Scare quotes don't tell me what 
you mean, they just suggest that you don't really believe whatever 
you are saying. They are clearly not the same strings or pieces of 
syntax, and equally clearly they all denote the same number, so there 
is no "issue" here.

>is
>precisely the same issue as to whether "5" or "00005" are the
>same "thing" or further, if "5" "00005" "5.0" "0005" or "5.0000000"
>are the same "thing".

Whether they denote the same thing, it seems to me, depends on the 
datatyping mapping in use. I would certainly hope that any reasonable 
integer datatyping scheme would allow leading zeros, but I wouldn't 
want to build that assumption into RDF.
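
As a sketch of that dependence, the two Python mappings below differ only in whether their lexical space admits leading zeros. Both are hypothetical schemes invented for this example, and the helper treats a literal outside the lexical space as simply having no denotation.

```python
import re


def lenient_integer(lexical: str) -> int:
    """Hypothetical mapping whose lexical space allows leading zeros."""
    if not re.fullmatch(r"[0-9]+", lexical):
        raise ValueError(f"not in lexical space: {lexical!r}")
    return int(lexical)


def strict_integer(lexical: str) -> int:
    """Hypothetical mapping whose lexical space forbids leading zeros."""
    if not re.fullmatch(r"0|[1-9][0-9]*", lexical):
        raise ValueError(f"not in lexical space: {lexical!r}")
    return int(lexical)


def denotes_same(a: str, b: str, mapping) -> bool:
    """True only if both literals are in the lexical space and map to one value."""
    try:
        return mapping(a) == mapping(b)
    except ValueError:
        # At least one literal has no denotation under this mapping.
        return False
```

Under lenient_integer, "5" and "00005" denote the same number; under strict_integer, "00005" denotes nothing at all, and "5.0" is outside both lexical spaces.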

>It is IMO beyond the scope of RDF proper,
>and should be left to the interpretation of specific systems,
>based on the data type information that is provided

Right, I think we agree.

>  -- and it
>is imperative that regardless of typing and other qualifications
>associated with a given rdf:value in the graph, that the rdf:value
>itself be clearly identifiable,

?? Again I have no idea what you mean here. What do you take 'the 
rdf:value' to be, exactly? Obviously, the intended meaning of a 
literal *cannot* be determined until the relevant datatyping 
information is available; I take that to be axiomatic, or else we 
have all been arguing about nothing at all. I thought that the issue 
was how best to arrange that the datatyping information be encoded 
and associated with the literal.

>  and not hidden behind any
>specialized properties such as eg:hexidecimal or xsd:integer, etc.

These were intended to be datatyping schemes (possibly in a parallel 
universe, but we often make up examples.)

>
>ASSERTION/PROPOSAL:
>
>    In all cases of qualified anonymous node constructs, the value of
>    the property should IMO always and without exception be the
>    object of an rdf:value property defined for that anonymous node.

? Can you give an example? I can't make sense of this in RDF terms.

>    Otherwise, we turn the graph into a much more complex puzzle than
>    it need be for generalized operations based solely on RDF and
>    RDFS semantics.
>
>And as an aside, a "good" lexical representation specification will
>disallow semantically vacuous variants such as "00005" or "5.0000".
>It is interesting that, insofar as the prose provided by XML Schema goes,
>it defines a "good" lexical representation for data types such
>as integers, disallowing such vacuous representations; however,
>no actual regular expressions defining those representations are
>provided in the normative schema, and hence, no system using the
>"official" schema definitions can validate against any lexical
>representation, much less such vacuous representations, without
>a static, built-in knowledge of those base types -- which imparts
>a very heavy implementational burden for systems which are not
>XML Schema parsers/validators but wish to easily validate XML
>Schema simple data types according to their lexical forms. Pity...
>
>It is also important to note that XML Schema does *not* allow the
>expression of xsd:integer in anything but base 10. Hence the examples
>above such as [ rdf:value "#xA"; rdf:type xsd:integer ]. are in fact
>invalid and erroneous

Well then why in God's name did you introduce them? You seemed to be 
arguing against my point, then you turn around and make my own point 
back at me.

>, as "#xA" is not a valid lexical form as defined
>by XML Schema for the integer type -- presuming that the assigning of a
>type xsd:integer to a literal string "#xA" is to be considered equivalent
>to an XML serialized representation <xsd:integer>#xA</xsd:integer> such
>that all of the constraints and properties of the data type xsd:integer
>(including lexical constraints) apply. Whether or not the use of
>xsd:integer as an rdf:type can transcend those lexical constraints and
>represent only the abstract numerical type for 'integers',

I don't believe that anyone has suggested this. I do not even know 
what an 'abstract numerical type' is; do you mean simply a class?
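
The lexical constraint under discussion can be sketched in Python as follows. The pattern is an assumption transcribed from the XML Schema prose for integer (an optional leading sign followed by decimal digits); it is not taken from the normative schema, and parse_xsd_integer is a hypothetical helper name.

```python
import re

# Assumption: pattern based on the prose description of the integer
# lexical form (optional sign, then decimal digits only).
XSD_INTEGER_LEXICAL = re.compile(r"[+-]?[0-9]+")


def parse_xsd_integer(lexical: str) -> int:
    """Map a literal to a number, rejecting anything outside the lexical space."""
    if XSD_INTEGER_LEXICAL.fullmatch(lexical) is None:
        raise ValueError(f"{lexical!r} is not a base-10 integer lexical form")
    return int(lexical, 10)


def is_valid_xsd_integer(lexical: str) -> bool:
    """True if the literal is inside the assumed lexical space."""
    return XSD_INTEGER_LEXICAL.fullmatch(lexical) is not None
```

With this check, "10" is accepted and maps to ten, while the Scheme-style forms "#xA", "#o10" and "#b1010" are rejected before any value is computed.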

Pat Hayes
-- 
---------------------------------------------------------------------
IHMC					(850)434 8903   home
40 South Alcaniz St.			(850)202 4416   office
Pensacola,  FL 32501			(850)202 4440   fax
phayes@ai.uwf.edu 
http://www.coginst.uwf.edu/~phayes
Received on Monday, 5 November 2001 20:41:21 UTC