Re: Denotation of datatype values from Pat Hayes on 2002-04-20 (w3c-rdfcore-wg@w3.org from April 2002)

From: Pat Hayes <phayes@ai.uwf.edu>
Date: Fri, 19 Apr 2002 19:16:30 -0500
To: Patrick Stickler <patrick.stickler@nokia.com>
Cc: w3c-rdfcore-wg@w3.org
Message-Id: <p05101550b8e65b3a3a17@[65.217.30.94]>
>On 2002-04-19 2:40, "ext Pat Hayes" <phayes@ai.uwf.edu> wrote:
>
>
>>>  The purpose of *all* datatyping idioms is to identify datatype values.
>>
>>  No, really, it isn't. Don't believe me, check with Graham about DC
>>  uses of datatype checking. Not everyone is interested in values.
>
>With all due respect to Graham and anyone else, I think that anyone
>who says they are interested in constraining a literal to the lexical
>space of a datatype but not interested in datatype values has a screw
>loose (or several).
>
>>>  If an idiom doesn't have that purpose, then it is not a datatyping
>>>  idiom. If the inline idiom doesn't have that purpose, then it
>>>  should not be called a datatyping idiom.
>>
>>  That is a narrow view which not everyone shares. I started off with
>>  that position, but being on this WG kind of knocked it out of me.
>
>I never got the impression that the WG did not share that view. The
>only issue of contention I recall is whether users should be forced
>to give explicit denotation in the graph for all values, or whether
>one could express the value solely by its lexical form in the graph.
>
>In all cases, it is the datatype value that is central, otherwise,
>there's no point to datatyping.
>
>If those proponents of lexical checking weren't concerned about
>datatyping, then they shouldn't be concerned about lexical checking.

Why not?? The lexical domain is part of the definition of the 
datatype. Why should someone be forbidden from being interested in it?

>
>>>  And BTW, I consider an idiom that places a literal in the lexical
>>>  space of a datatype to have identified a datatype value. How could
>>>  it not?! So I consider the inline idiom to be a fully valid datatyping
>>>  idiom.
>>>
>>>  That *doesn't* mean that the literal node denotes the datatype value,
>>>  but that ultimately, some application should understand at some level
>>>  that the RDF graph is talking about the actual datatype value in relation
>>>  to that particular property
>>
>>  But look what you just said. The MT says A, and the MT is the RDF
>>  meaning spec, but we should say that an application 'at some level'
>>  should understand this to mean (not A). That's not a coherent position
>>  to take, even if you think A is wrong.
>
>Again, you think I'm saying we should change the meaning of the
>single, specific literal node itself. I'm not.
>
>Consider it as *extra* knowledge.

OK, but that extra knowledge is NOT in the content expressed by the 
RDF. We need to say what IS in the RDF. If I read an RDF graph, I 
have no access to anything 'extra'. I have to know what the RDF graph 
actually says, as accurately as possible.

>Such that, in addition to
>saying e.g. that Judy's ex:age is "10", it's also true that,
>based on the presence of an rdfd:datatype assertion and the
>datatyping MT, Judy's ex:age is understood to be
>[ xsd:integer "10" ].

OK, but that [xsd:integer "10"] could mean several things. It could 
mean "10", checked out as conforming to the lexical form of 
xsd:integer, for example.

>Now, we don't actually need to assert that additional
>knowledge in the graph in order to provide it to some
>application, do we?
>   
>>>  , even if that datatype value is not denoted
>>>  in the graph itself.
>>
>>  No, I insist, we should NOT say that, because it isn't true. An
>>  application MIGHT do that, but it is not OBLIGED to do that.
>
>The whole point of a datatyping standard for the global interchange
>of datatyped knowledge is to *OBLIGE* every application to
>interpret the RDF graph and agree on the specific datatype value
>identified therein.

NO, that is not what is in the 'stake in the ground' proposal, and 
it's not what I understood to be the consensus of the WG. That is why 
I modified the previous proposal (which seems closer to what you 
would like to have.) For example, Dan C was most particular that in a 
case like

<ex:foo> rdfd:datatype <xsd:integer>
a <ex:foo> "10" .
b <ex:foo> "010" .

that the values of <ex:foo> on a and b are *different*, because he 
wants to use string-matching to test for literal identity.
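
To make the difference concrete, here is a small sketch in Python. The 
function names are made up for illustration, and using int() as the 
lexical-to-value map for xsd:integer is an assumption of the sketch, 
not something the MT says.

```python
# Hypothetical illustration: the two literals from the example above.
a_value = "10"
b_value = "010"

def same_literal(x: str, y: str) -> bool:
    """Literal identity by string matching (the test Dan C wants)."""
    return x == y

def same_integer(x: str, y: str) -> bool:
    """Identity of datatype values, via an assumed lexical-to-value map (int)."""
    return int(x) == int(y)

literals_match = same_literal(a_value, b_value)  # the strings differ
values_match = same_integer(a_value, b_value)    # both map to the integer ten
```

Under string matching the two objects are different; only an 
application that chooses to apply the value mapping treats them as the 
same.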

>If the present MT doesn't do that, then it is IMO insufficient.

Well, that re-opens the datatyping debate, then. That is a different 
issue, however. I'm assuming that we are using the stake-in-the-ground 
MT.

>>  RDF
>>  allows an application to make up its own mind on that point. And, I
>>  might add, if it does do that *and then tries to draw RDF inferences
>>  on that basis*, it will likely get wrong results. In other words,
>>  once it has done that, it's on its own, it's out of our sphere, it's
>>  obeying its own rules, we wash our RDF hands of it. Not our problem.
>
>I both agree and disagree. I agree that we are not responsible for
>anything not asserted by the MT. But I disagree that the datatyping
>MT should not assert a specific datatype value for each idiom.

OK, then we should change the MT, if that is what people want. Feels 
a bit like picking at a healing wound at this point, though. I don't 
think there are many more places in the landscape of possibilities 
left to explore, and everywhere we have been has had some 
show-stopping property for someone.

>>>  I.e.
>>>
>>>     literal node all by itself          = literal
>>>     literal node combined with datatype = datatype value
>>>
>>>  the latter does *not* change the meaning of the literal
>>>  itself, no more than 2 + 1 = 3 changes the meaning of 2
>>>  just because if combined with 1 we get 3.
>>>
>>>  The datatyping idioms have a meaning that is the sum of their
>>>  parts -- the literal and the datatype -- and that total meaning
>>>  does not change the meaning of their component parts.
>>>
>>>  I tried to ask a question a week ago: does a datatype value have to
>>>  have an explicit denotation in the graph, by a single node, in order
>>>  for it to be expressed/identified/communicated by the graph?
>>>
>>>  I was told, by a combination of silence and limited comments, that
>>>  no, it does not.
>>>
>>>  You seem to be arguing that it does.
>>
>>  I am saying that what an RDF graph expresses/identifies/communicates,
>>  in some grander scheme of things, is none of our business. Our business
>>  is to be very clear what exactly it is that the various parts of an
>>  RDF graph actually *mean*, and to pin that down as accurately and
>>  unambiguously as we can. Then users can choose to conform to our spec
>>  or not, up to them. If they don't, it's not our fault if they get into
>>  a pickle.
>
>Well, that may be your business, but it's certainly not mine

Well, I would respectfully suggest that you shouldn't be writing the 
spec. Nothing personal.

>, and
>I don't think is that of the WG as suggested by the charter.

I'll let Brian rule on that one.

>
>We clearly have different scopes of focus.
>
>>>  That if an RDF graph is going
>>>  to express/identify/communicate a particular datatype value, that
>>>  value must have denotation in the graph, or at least explicit
>>>  definition in the MT.
>>
>>  Sure. And if the graph is NOT going to express/whatever that value,
>>  then it need not have any such denotation. And that we allow BOTH
>>  options, and that we should say so clearly.
>
>But I consider/expect that every complete idiom *does* express
>a specific datatype value, even if the inline idiom does not
>provide explicit denotation for it in the graph.
>
>Is the value 5 explicitly denoted in the equation 2 + 3?
>Is 5 identified/expressed by that equation?
>Could you sanely assume any other value is identified/expressed?
>
>I say no, yes, and no.

I'd say no, yes and yes. Particularly if you wrote the equation using 
literals, so it looked like this:

"2" + "3"
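
For instance, in Python (used here only as an illustration; nothing in 
RDF depends on it), the same two literals give different results 
depending on whether they are first mapped to values:

```python
# With literals (strings), "+" need not mean arithmetic addition at all.
as_strings = "2" + "3"            # concatenation of the two literals
as_numbers = int("2") + int("3")  # addition, only after mapping each literal to a value
```

Here as_strings is the literal "23" and as_numbers is the number 5, so 
the written form alone does not settle which value is identified.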

>
>It's the same with the inline idiom + rdfd:datatype.
>
>>>  I considered it to be an important and pivotal question then,
>>>  and I still do. And the answer to that question seems to be
>>>  at the root of all of the misunderstanding and disagreement
>>>  about what meaning the idioms capture and how to define the
>>>  MT to that end.
>>>
>>>  My answer to this question is no, it does not. In that a literal
>>>  node and a datatype associated with that literal node *together*
>>>  can represent/express/identify/communicate a specific datatype
>>>  value even if that value has no explicit denotation in the graph
>>>  by a single node. And I consider that to be the ultimate goal of RDF
>>>  Datatyping, to communicate datatype values.
>>
>>  Then I'm sorry, but you ought to be writing a different document, with
>>  a different title: something like "Guide to using RDF in the Stickler
>>  way", or something. But it shouldn't be the spec., because the spec
>>  does not mandate only that kind of usage or restrict users to that
>>  particular goal.
>
>I disagree. And would like the input of the WG to resolve this.
>
>>>
>>>  Here's an analogy:
>>>
>>>  Datatype Idiom:       value = foo("x");
>>>  Lexical Form Idiom:   function = foo; value = *function("x");
>>>  Inline Idiom:         function = foo; *function("x");
>>>
>>>  Now, in the first two cases, the result of executing the
>>>  function 'foo' with the argument "x" is stored in the variable
>>>  'value'. In the last case, the result is not stored, but it
>>>  is still expressed.
>>>
>>>  In all cases, the function is executed and the result is
>>>  obtained.
>>>
>>>  It is exactly the case with the datatyping idioms. It is
>>>  analogous to
>>>
>>>  Datatype Idiom:       bnode = xsd:integer("10");
>>>  Lexical Form Idiom:   datatype = xsd:integer; bnode = *datatype("10");
>>>  Inline Idiom:         datatype = xsd:integer; *datatype("10");
>>>
>>>  In all cases, the lexical to value mapping is defined and the
>>>  datatype value is identified. That's the whole point of
>>>  the datatyping idioms.
>>
>>  It's one point, but not the only point possible. I might only care
>>  that something is a numeral, and not be interested in its numerical
>>  value. Perhaps I want to check that it is a valid part of a date in a
>>  certain format. I'm not interested (right now) in the actual date,
>>  only that this piece of text could be a date (and then I'll send it
>>  on to the date-checker, and let him worry about the value). We will
>>  both use datatyping, but probably use different properties, or maybe
>>  he will use more information about my properties than I am using.
>
>FOR CRYING OUT LOUD! You *CAN'T* assert that a literal is a valid
>lexical form of a datatype without executing the lexical to value
>mapping to test if a value is actually represented by that lexical
>form!

Without executing it? Of course you can. See the BNF given in the 
previous message. Look, reasoning engines do this ALL THE TIME. It is 
routine. They do things like solve equations using algebra, and never 
compute a single value. Check out MathLab for example.
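
Here is a minimal sketch of such a check in Python, following the 
numeral grammar from the previous message (<digit> ::= 0|...|9, 
<numeral> ::= <digit>|<numeral><digit>). The function name is made up 
for illustration. Note that it only inspects characters and never 
computes a number.

```python
DIGITS = "0123456789"

def is_numeral(text: str) -> bool:
    """Recognize <numeral> ::= <digit> | <numeral><digit>.

    The check is purely lexical: it looks at each character in turn
    and never maps the text to a numeric value.
    """
    if text == "":
        return False  # the grammar requires at least one digit
    return all(ch in DIGITS for ch in text)
```

For example, is_numeral("010") holds and is_numeral("ten") does not, 
and at no point has anything been said about which number "010" might 
stand for.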

>And the *ONLY* reason why you would care about having a valid lexical
>representation of a date is so that some application could *USE*
>it as a date, obtain an actual date value from it.
>
>You are talking absolute blithering nonsense here! Forgive my
>being so blunt.

I know I'm not. I think you live in too small a world. I will refrain 
from making a joke about cell phones at this point. Forgive me, in 
turn.

>Who in their right mind would actually want/need such a twisted,
>ridiculous use of RDF datatyping?!

Somewhere between 10^3 and 10^5, I'm not sure exactly. Certainly 
MathLab has many users in the hard sciences (Two of my colleagues 
here use it regularly, and this isn't a very hard-science kind of 
place.)

>
>>>
>>>  You seem to say that only the bnode idioms actually identify
>>>  the datatype value, because the value has explicit denotation;
>>>  and that the inline idiom does not identify a datatype value
>>>  but only constrains the literal to the lexical space of a datatype.
>>
>>  Right.
>>
>>>  But in order to test if the literal is a member of the datatype,
>>>  you must -- and this is crucial -- attempt to map it to a datatype
>>>  value, and see if you get an actual value!
>>
>>  Not at all. How the testing is done depends entirely on code on the
>>  other side of the API for the datatype. Some might offer a lexical
>>  check that just does a string-parse, without ever computing a value.
>>  (For example, you don't need arithmetic to check that a numeral is a
>>  numeral.) But in any case, that is all external to RDF and irrelevant
>>  to the issue.
>
>To say that a literal is a valid lexical form of a datatype, by
>whatever means, is to say that a lexical to value mapping exists,

Yes, but not that you have to actually use it or execute it.

>and is to identify the value represented by that lexical form.
>
>And the only point in saying that a literal is a valid lexical
>form of a datatype is to say that a datatype value is obtainable
>via that lexical form.

No, that is not the ONLY point.

>
>>>  And thus, to say a
>>>  literal is a lexical form of a datatype is to identify the value
>>>  that it represents, presuming it is a valid lexical form.
>>
>>  That is just obviously wrong. The BNF
>>
>>  <digit> ::= 0|1|2|3|4|5|6|7|8|9
>>  <numeral> ::= <digit>|<numeral><digit>
>>
>>  identifies lexical forms without providing any way to even refer to
>>  their values.
>
>I am not wrong. To say that "10" is a valid lexical form
>of xsd:integer, such as by a successful parse according to
>the above BNF, and based on the defined N:1 mapping from
>lexical forms to values

No, it's not. It is based on the grammar of numerals. That doesn't 
mention numbers (look and see.)

>, we know that "10" represents a
>single specific value and have thus identified that
>value.
>
>We don't necessarily say how that value will be represented
>by a given system, or even if it is representable by a given
>system, but we have narrowed down the known universe to a
>single specific value.
>
>>>
>>>  Thus, I find your interpretation of the inline idiom as *only*
>>>  constraining literals to valid lexical forms to be both
>>>  useless insofar as the practical needs of datatyping are concerned
>>>  (which is to communicate datatype values
>>
>>  You keep saying that, and it sounds more and more like someone saying
>>  that Islam is the only true religion. Sorry, but many people disagree.
>
>Well, then let the WG call me a heretic.
>
>>>  The expression L2V(I(ddd))("LLL") identifies a specific
>>>  datatype value, no?
>>
>>  Yes, so what? The condition on the inline idiom is only that this
>>  expression is *defined*; it does not refer to its value. If you
>>  prefer, we could rephrase this differently: the condition is that
>>  "LLL" is in the lexical space of I(ddd). I didn't phrase it that way
>>  purely for the sake of mathematical elegance, since the MT only
>>  assumes the existence of the L2V map, but we could say that "LLL" is
>>  in the domain of L2V(I(ddd)).
>
>Again, the only practical, rational reason for saying such a thing
>is to identify the datatype value represented by the lexical form.

No, it could be to check that the form is in some set of lexical 
forms. Some applications live their entire lives working only on 
pieces of text, checking them for conformity to various lexical 
patterns, and so on, and never even see a value. They have a right to 
exist, seems to me.
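
A rough sketch of the two conditions, in Python. This is only an 
illustration under stated assumptions: the class and function names are 
hypothetical, and the 'integer' datatype here is a deliberately 
simplified stand-in (unsigned decimal numerals only), not the real 
xsd:integer.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass(frozen=True)
class Datatype:
    in_lexical_space: Callable[[str], bool]  # membership test for the lexical space
    l2v: Callable[[str], object]             # lexical-to-value map (L2V), defined on that space

def inline_condition(dt: Datatype, literal: str) -> bool:
    """Inline idiom: only requires that the literal is in the domain of L2V."""
    return dt.in_lexical_space(literal)

def bnode_condition(dt: Datatype, literal: str, bnode_value: object) -> bool:
    """bnode idioms: additionally require the bnode to denote L2V(literal)."""
    return dt.in_lexical_space(literal) and dt.l2v(literal) == bnode_value

# Simplified stand-in for an integer datatype (assumption of this sketch).
integer = Datatype(
    in_lexical_space=lambda s: s != "" and all(c in "0123456789" for c in s),
    l2v=int,
)
```

Checking inline_condition(integer, "010") never calls l2v at all, 
whereas bnode_condition(integer, "010", 10) has to, because it says 
something about the value.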

>>>
>>>  And all three idioms define L2V(I(ddd))("LLL"), no?
>>>
>>>  And the bnode idioms further make the assignment
>>>  I(ccc) = L2V(I(ddd))("LLL"), i.e. they fix the datatype
>>>  value to the bnode, no?
>>
>>  The bnode idioms do, yes.
>>
>>>
>>>  Thus when I say that the datatyping
>>>  MT provides a datatype value interpretation for all
>>>  three idioms, I just don't get why you say that's not
>>>  correct. If that's not what the MT is saying, then
>>>  that's what the MT *should* be saying, IMHO, and
>>>  that is certainly what I thought the MT was saying
>>>  when I voted in favor of the "stake in the ground".
>>
>>  Well then evidently there was some misunderstanding. Look, if your
>>  interpretation were correct, what use would the bnode idioms have?
>>  They would only say the same as the inline idiom, but using more
>>  nodes. The whole point of all these idioms is that each says a
>>  different thing, conveys a different piece of information, but they
>>  all refer in one way or another to datatypes. They all make use of
>>  datatyping information in various ways. They all work together
>>  smoothly and are mutually consistent, but they provide a variety of
>>  ways of using the information and a variety of degrees of commitment
>>  to what is being said about ranges and properties.
>
>The only reason for the different idioms is practicality,

No, they have different *meanings*, and provide for a variety of 
uses. I guess I feel a bit like a meat vendor talking to a sausage 
maker at this point. OK, all YOU want to do is to grind it all down 
into one kind of stuff. But really, some of my customers want to 
barbecue ribs.

>whether
>datatyping is asserted locally or globally, and whether one has to
>denote the value in the graph or not (and not having to being
>motivated simply by convenience).
>
>Your interpretation that not all of the idioms are intended to
>communicate datatype values seems contrary to the whole point
>of datatyping

Well, you know, you have made it pretty clear that as far as you are 
concerned there is only one point of datatyping. But couldn't you 
admit the possibility that others might have different points than 
yours?

>(and in a practical sense, bizarre, since if it
>is clear what datatype value the idiom identifies, then even
>if the graph doesn't explicitly denote it with a single node,
>you've still unambiguously communicated that datatype value).

The issue is what the RDF says ABOUT it. If all you do is communicate 
it in some vague sense, sure. But I want it to be clear that you 
aren't saying it is Judy's <ex:age>.

Pat
-- 
---------------------------------------------------------------------
IHMC					(850)434 8903   home
40 South Alcaniz St.			(850)202 4416   office
Pensacola,  FL 32501			(850)202 4440   fax
phayes@ai.uwf.edu 
http://www.coginst.uwf.edu/~phayes
Received on Friday, 19 April 2002 20:16:37 UTC