Literal subjects can be tidy (was: Re: Use case for tidy literal subjects (ignore if not interested))

>During the datatyping followup after Friday's telecon, I made
>the statement that if literals were subjects which took datatyping
>properties, they must be untidy -- and I was told this was incorrect.

Then we were at cross purposes.

There are two issues here: whether literals can be subjects, and 
whether a literal node can have a datatype-sensitive meaning. The 
first is a simple syntactic issue, and the answer is yes, literals 
can be subjects, and that does not in itself require untidy literal 
nodes; which is what I thought we were talking about in the telecon. 
The other question is different, and there the answer indeed is that 
if literal nodes can have multiple denotations which are sensitive to 
datatyping information, then they cannot be tidy, for the kind of 
reasons you give in your example below.  But this arises even if we 
do not allow literals as subjects, as in the old P idiom (updated 
here to use drange):

aaa age "10" .
age rdfs:drange decimalInt .

So, to repeat, the two issues are separate; and when separated, it is 
the latter, not the former, that requires untidy literals. I thought 
that we had decided the latter issue some time ago in favor of 
tidiness.

I suggest that we stick with the decision to have literals always 
denoting themselves, under all circumstances, as a rock-solid 
principle. Literals are *never* contextual in their meaning; all the 
datatyping cleverness applies to the bnode at the other end of the 
triple. Then we can allow literals to be subjects with equinamity, 
and still retain tidy literal nodes in the syntax. Your example then 
has a datatyping clash in the merged graph, in just the same way as 
if you had used bnodes rather than literals. However, the first graph 
is nonsensical on its face, since the triple

  "30" rdf:dtype decimalInt .

says that the string '30' is in the *value* space of decimalInt, 
which is presumably false, and would be immediately rejected by a 
datatype checker.

Pat

-------------

>
>Here is a use case which illustrates that I was, in fact, correct:
>
>I presume that literals can be subjects both in the graph and
>that this is supported by the serialization (RDF/XML, N3, etc.)
>
>Consider the following two graphs, defined separately, which are
>subsequently merged, and all literal nodes are tidy and hence
>the same.
>
>   Bob age "30" .
>   "30" rdf:dtype decimalInt .
>
>and separately, elsewhere
>
>   WidgetX gspec "30" .
>   "30" rdf:dtype octalInt .
>
>and then a query on the merge of the two graphs
>that asks what Bob's age is
>
>   Bob age ?y .
>
>where ?y is bound to "30".
>
>Hmmmm, says the app. I need a value, let's get the
>datatyping information for that literal:
>
>   ?y rdf:dtype ?x .
>
>where ?x is bound both to decimalInt and octalInt.
>
>Well, the app doesn't recognize the decimalInt datatype
>but does know about octalInt, so it can use that to get
>to a value, thus the pairing ("30", octalInt) = 36, so
>
>   Bob's age = 36
>
>Wrong!
>
>--
>
>Now, if literals were untidy, we'd get the following
>similar, though more useful scenario
>
>   Bob age _:1:"30" .
>   _:1:"30" rdf:dtype decimalInt .
>
>   WidgetX gspec _:2:"30" .
>   _:2:"30" rdf:dtype octalInt .
>
>and then a query that asks what Bob's age is
>
>   Bob age ?y .
>
>where ?y is bound to _:1"30".
>
>Hmmmm, says the app. I need a value, let's get the
>datatyping information for that literal:
>
>   _:1:"30" rdf:dtype ?x .
>
>where ?x is bound only to decimalInt.
>
>Well, again, the app doesn't recognize the decimalInt datatype
>so it exits with a message that it is unnable to determine the
>actual value of Bob's age, but it was at least able to determine
>that it is at least whatever value is denoted by the pairing
>("30", decimalInt).
>
>Now, that's both more useful *and* more reliable.
>
>And of course, if the application does know what a
>decimalInt datatype is, then it can get the value
>reliably -- the correct value -- with no risk of
>misinterpretation due to loss of datatyping context
>because literals are tidy.
>
>Eh?
>
>--
>
>Saying "this literal *might* be used to denote a decimalInt"
>is virtually useless. So what. How does that help any application
>that needs to know *what* it actually denotes in a given
>context? It doesn't. It's useless knowledge in nearly all
>cases.
>
>Thus, I assert again, if literals are to be subjects that
>bear datatyping -- or any other occurrence/context specific
>knowledge -- then they *must* be untidy. If they are tidy,
>then the *only* knowledge that can be attached to them is
>trivial knowledge about the nature of the string itself, and
>nothing about its meaning, use, significance, etc.
>
>That's not saying that tidy literals can't, technically, be
>subjects -- but if they are tidy, they have little to no
>utility as subjects. It becomes simply an academic exercise
>with little to no value for actual implementations, *and*
>if it is not clear that all properties defined for a literal
>are global, not specific to any context, logical errors
>such as the first case above can still arise, thus the
>anemic utility of tidy literal subjects must be made crystal
>clear in some primer, spec, etc.
>
>Hopefully the above clearly communicates what I was unnable
>to communicate during the concall.
>
>Cheers,
>
>Patrick
>
>--
>               
>Patrick Stickler              Phone: +358 50 483 9453
>Senior Research Scientist     Fax:   +358 7180 35409
>Nokia Research Center         Email: patrick.stickler@nokia.com


-- 
---------------------------------------------------------------------
IHMC					(850)434 8903   home
40 South Alcaniz St.			(850)202 4416   office
Pensacola,  FL 32501			(850)202 4440   fax
phayes@ai.uwf.edu 
http://www.coginst.uwf.edu/~phayes

Received on Monday, 18 February 2002 21:22:20 UTC