RE: Any use cases for untidy literals except long range datatyping? from Patrick.Stickler@nokia.com on 2002-08-12 (w3c-rdfcore-wg@w3.org from August 2002)

From: <Patrick.Stickler@nokia.com>
Date: Mon, 12 Aug 2002 16:06:23 +0300
To: <melnik@db.stanford.edu>
Cc: <w3c-rdfcore-wg@w3.org>, <phayes@ai.uwf.edu>
Message-ID: <A03E60B17132A84F9B4BB5EEDE57957B5FBA86@trebe006.NOE.Nokia.com>
> -----Original Message-----
> From: ext Sergey Melnik [mailto:melnik@db.stanford.edu]
> Sent: 12 August, 2002 15:02
> To: Stickler Patrick (NRC/Tampere)
> Cc: w3c-rdfcore-wg@w3.org; phayes@ai.uwf.edu
> Subject: Re: Any use cases for untidy literals except long range
> datatyping?
> 
> 
> Patrick,
> 
> thanks for a quick feedback! A general comment: it seems that the 
> examples that you give below deal with long range datatyping 
> only. Are 
> there in your opinion any *other* important use cases?

Yes. Defining "constraints" for particular properties
when local typing is used. I.e., saying that ex:age
takes integer values, so if someone has

   Jenny ex:age xsd:string"10" .

then the presence of

   ex:age rdfs:range xsd:integer .

results in a contradiction/clash between the datatypes,
since they are incompatable.

Being able to define the requirements/expectations of
a given system in this manner is very useful, and allows
for testing the integrity of incoming content before
adding it to the system's knowledge base.

> Further comments below.
> 
> Patrick.Stickler@nokia.com wrote:
> 
> > 
> >>-----Original Message-----
> >>From: ext Sergey Melnik [mailto:melnik@db.stanford.edu]
> >>Sent: 12 August, 2002 13:19
> >>To: RDF Core; pat hayes; Stickler Patrick (NRC/Tampere)
> >>Subject: Any use cases for untidy literals except long range 
> >>datatyping?
> >>
> >>
> >>I'd like to restate the questions, which Jan raised recently, more 
> >>explicitly.
> >>
> >>Much of the ongoing discussion about tidy/untidy literals 
> amounts to 
> >>arguing about different readings of a given piece of RDF/XML 
> >>or NTriples 
> >>syntax. From what I can tell, both tidy and untidy literals are 
> >>implementable, so we have to pick one and wrap up.
> >>
> >>To my knowledge, untidy literals have been first suggested in the 
> >>context of long range datatyping (aka implicit/global idiom). 
> >>Specifically, untidy literals provide a shortcut for using a 
> >>bNode with 
> >>a property (two triples are essentially merged into one).
> >>
> >>Is this shortcut so fundamental that there is value of making 
> >>it part of 
> >>the spec?
> >>
> > 
> > Yes. It mirrors common practice both in programming languages
> > as in XML Schema.
> > 
> > Just as one may say
> > 
> >    int age;
> >    age = 10;
> 
> 
> Please see Grahams clarification to that point below (I think 
> he replied 
> offline, his answer is appended).
> 
> 
> > ...
> >>Is there an appealing use case for untidy literals that is not long 
> >>range datatyping (aka implicit/global idiom)?
> >>
> > 
> > Even if datatyping is not asserted for inlined, non-local typed
> > literals at the RDF level, they still have untidy semantics
> > at *some* level, and to impose tidy semantics at the RDF level
> > is IMO misleading and will conflict with common sense when
> > looking at the big picture -- at the whole stack.
> > 
> > Having tidy semantics at the RDF level and untidy semantics
> > at the application level seems a fundamental contradiction
> > that will come back to bite us (and users) again and again.
> > 
> > Take CC/PP as an example. Datatyping is globally asserted for
> > properties (at present, by the application, since it can't
> > yet be done in RDF) and the interpretation of the same literal
> > string as the value of two properties with different datatyping
> > is untidy, depending on the property datatype.
> > 
> > Now, if such literals are tidy in the RDF MT, then the MT
> > is saying that those two statements "mean" one thing at
> > the RDF level, but something else at the CC/PP application
> > level. Now, though one might wriggle around that with all
> > kinds of smoke and mirrors, I'd prefer to see the same meaning
> > at both levels, with the property values being the datatype
> > values.
> 
> 
> The above example does not address the question raised in the 
> subject of 
> this email ;) 

Well, I thought it did. Because, as I said, though we 
could [1] say that inlined literals are self denoting 
expressions -- and even in the total absence of any 
long range datatyping end up with the above conflict
in interpretation between the RDF MT and the application
layer, where RDF would say "10" == "10" in all contexts,
but the application would (rightly) interpret "10" as
denoting different things in different datatype contexts.

  [1] that's hypothetical, not a recommendation ;-)

So, even without any long range datatyping at all, tidy
literal semantics at the RDF MT level is contrary to
many applications which (again, rightly) interpret
inlined literals as contextual names and not as global
constants.

> Still a comment: in the latest proposal, there are no 
> "levels" of interpretation etc. Typed literals have a fixed 
> interpretation with consistent meaning...

But we're talking here about untyped, inline literals, right?

Please, folks, can we stop calling things such as xsd:integer"10"
a "literal". I know that has been proposed, but it's *very*
confusing. I'd appreciate it if we could consistently distinguish
between literals (i.e. untyped, implicit) and datatyped literals,
which are some locally typed instance of a literal. OK? Thanks.

> 
> >>Are we closing off any important extensibility paths if we go 
> >>for tidy 
> >>literals?
> >>
> > 
> > If literals (non-datatyped literals) are tidy, then if later
> > we allow literals to be subjects, they are far, far less
> > interesting than if they are denoting the datatype value
> > which they (contextually) represent.
> > 
> > Patrick
> 
> 
> I don't understand. If, later on, a URI scheme (or a bNode-based 
> mechanism) for RDF datatyping is introduced, it would be 
> possible to use 
> resources/bNodes as subjects. Such resources/bNodes would denote 
> literals. Could you rephrase your concern?

Well, it should be clear if you don't call xsd:integer"10"
a literal. Only "10" is a literal. If you have a datatyped literal
node, that is something else. I consider untyped literal nodes
(only the string) as being contextual names, without globally
consistent meaning. Datatyped literals on the other hand, have
globally consistent meaning, and denote the datatype value for
which the string portion is the lexical representation.

So when I talk about 'literals', I do so using the traditional
RDF meaning, not the recent proposal to expand its meaning to
include other entities.

So, excluding datatyped literal nodes such as xsd:integer, if
"10" only denotes itself, then if "10" can be a subject, you
really couldn't say anything other than its properties as a
string, such as 

   "10" length xsd:integer"2" .

You couldn't say e.g.

   _:x"10" lessThan _:y"01000" .

since they would denote the strings, not the values. I'm of
course presuming that from other statements in the graph it
is clear that 

   _:x rdf:type xsd:integer .
   _:y rdf:type xsd:integer .

But things get really interesting when we start dealing with
enumerated datatypes. Consider xsd:lang and some labels.

   xsd:lang"en" rdfs:label "English" .

   MyBook xml:lang "en" .
   xml:lang rdfs:range xsd:lang .

Now, if "en" denotes just the string, then there's not much
that can be done. But if "en" denotes the value xsd:lang"en",
then we can fetch the label and display it to some user.
   
Patrick
Received on Monday, 12 August 2002 09:07:43 UTC