Re: Datatypes [Was: Minutes telecon 26th July 2002] from Patrick Stickler on 2002-07-31 (w3c-rdfcore-wg@w3.org from July 2002)

From: Patrick Stickler <patrick.stickler@nokia.com>
Date: Wed, 31 Jul 2002 11:58:14 +0300
To: Jeremy Carroll <jjc@hplb.hpl.hp.com>, RDF Core <w3c-rdfcore-wg@w3.org>
Message-ID: <B96D8156.19484%patrick.stickler@nokia.com>
On 2002-07-30 0:45, "ext Jeremy Carroll" <jjc@hplb.hpl.hp.com> wrote:

> 
> 
> Summary:
> A short piece in defence of Patrick's account of WG decisions;
> a longer piece in self-defence (against Patrick!) with a more detailed
> proposal for a "no global" (but untidy) datatyping/model theory solution.
> 
> In defense of Patrick ...
> 
> Patrick:
>>> This has already been rejected by the WG.
> DanC:
>> I don't recall any such decision.
> Patrick:
>>> The WG has already decided that datatyping should work by one of the
>>> two proposed options,
> DanC:
>> which decision are you referring to here?
> 
> I think
> http://www.w3.org/2001/sw/RDFCore/20020225-f2f/#d-2002-02-26-3
> covers both of these. Specifically committing us to a particular version of
> datatyping (supporting (A)) except in response to a particular problem with
> (D) being noted as a problem.

Yes, that does seem to cover it. But we also spent considerable time in
Bristol listing and refining a set of agreed characteristics for datatyping
and got down to consensus except for tidy vs. untidy and agreed to
submit the inquiry to the community to help resolve that last point of
contention.

Though some WG members would have been OK with or even preferred pairs
of datatyping properties, the majority was not in favor.

It's too bad that the list we produced was not reflected in the official
minutes from the f2f. Perhaps it is somewhere in the IRC logs.

But that does not mean the WG made no decision at the f2f: it did
decide not to go with pairs of properties. That decision was part
of the consensus to take the stake-in-the-ground proposal, and
decide whether it would be realized as-is with tidy literals or
modified to have untidy literals.

Of course, if anyone feels that I was hallucinating during the f2f
and remembers differently, feel free to speak up.

> ===
> 
> In self defense ...
> 
> Patrick:
>> Sorry Jeremy, but I don't see your proposal
>> as constituting "substantial progress". It's a step backwards.
> 
> Substantial progress towards consensus may be a step backwards on some other
> measure.

True, but I guess that depends on whether you want "substantial progress"
for the solution, or simply for consensus itself. I guess I understood
the goal as being progress towards the solution (or rather towards both
the solution and consensus).

I agree that there already is consensus regarding the syntax and semantics
for local datatyping. I think we've had that for a long time. But given
the fact that the lion's share of usage employs the inline idiom with
global datatyping, I don't see how we can punt on the global issue,
particularly since choosing tidy semantics in the MT precludes any later
untidy treatment.

On the other hand, untidy semantics in the MT does not preclude a tidy
interpretation based on string equality of literals, which could be
defined axiomatically as

IF
   _:x rdf:type rdfs:Literal .
   _:y rdf:type rdfs:Literal .
   _:x ex:stringEqualTo _:y .
THEN
   _:x ex:equivalentTo _:y .

Now, that might result in some contradictions, but hey, it still
can be done.
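
As a rough illustration only (not part of any WG proposal), the rule
above can be sketched as a single forward-chaining step over a set of
triples. The function name apply_tidy_rule and the use of plain
string tuples for triples are assumptions made for this sketch; the
ex: properties are the illustrative ones from the rule itself.

```python
def apply_tidy_rule(triples):
    """Return the triples plus every ex:equivalentTo statement the rule licenses."""
    # Nodes explicitly typed as rdfs:Literal.
    literals = {s for (s, p, o) in triples
                if p == "rdf:type" and o == "rdfs:Literal"}
    # IF both ends are literals and are string-equal THEN they are equivalent.
    inferred = {
        (s, "ex:equivalentTo", o)
        for (s, p, o) in triples
        if p == "ex:stringEqualTo" and s in literals and o in literals
    }
    return set(triples) | inferred


graph = {
    ("_:x", "rdf:type", "rdfs:Literal"),
    ("_:y", "rdf:type", "rdfs:Literal"),
    ("_:x", "ex:stringEqualTo", "_:y"),
}
closed = apply_tidy_rule(graph)
```

The rule only adds statements; it never removes the original ones, so
any datatyping information already in the graph is left in place.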

And the above treatment could also be implemented in an RDF API
directly, allowing folks to choose whether they employ the default
untidy datatyped equality test or the tidy stringEqual based
equality test (as has been suggested previously on the discussion
lists).
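
A minimal sketch of what such an API choice might look like, assuming
a hypothetical Literal class and a hypothetical table of
lexical-to-value mappings (neither comes from any existing RDF
toolkit). It also assumes that literals of different datatypes never
denote the same value, which is a simplification.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical lexical-to-value mappings, keyed by datatype name.
LEXICAL_TO_VALUE = {"xsd:integer": int, "xsd:string": str}


@dataclass(frozen=True)
class Literal:
    lexical: str
    datatype: Optional[str] = None  # asserted datatype, if any


def tidy_equal(a, b):
    """Tidy test: string equality of the lexical forms, datatyping ignored."""
    return a.lexical == b.lexical


def untidy_equal(a, b):
    """Untidy test: compare the values denoted under the asserted datatypes."""
    parse_a = LEXICAL_TO_VALUE.get(a.datatype)
    parse_b = LEXICAL_TO_VALUE.get(b.datatype)
    if parse_a is None or parse_b is None:
        # Assumption: with no known datatype, compare datatype and string as given.
        return (a.datatype, a.lexical) == (b.datatype, b.lexical)
    # Assumption: value spaces of different datatypes are treated as disjoint.
    return a.datatype == b.datatype and parse_a(a.lexical) == parse_b(b.lexical)


def literals_equal(a, b, tidy=False):
    """Untidy datatyped equality by default; tidy string equality on request."""
    return tidy_equal(a, b) if tidy else untidy_equal(a, b)
```

With this sketch, "10" typed as xsd:integer and "10" typed as
xsd:string are equal under the tidy test but not under the untidy
one, while "10" and "010" as integers are equal only under the untidy
test.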

But this flexibility is only possible if untidy semantics are the
default treatment for literals.

> Patrick:
>> Of those that chose to explicitly respond as requested to Brian's
>> inquiry, there is overwhelming preference for untidy semantics
>> (almost unanimous).
> 
> Not quite as overwhelming as all that, but I would hope sufficiently so to
> make the WG reluctant to continue with tidy literal semantics.

As I qualified, for those who chose to actually provide an answer to
the question A or D, the results are overwhelmingly in favor of D.

Those that responded instead to the form or content of the inquiry itself,
proposing alternate datatyping approaches, etc., either misunderstood
the details of the examples (which is to be expected, given the
constraints on how much explanation can be provided) or offered
proposals that were already considered and rejected by the WG --
i.e. nothing new.

But yes, certainly strong enough that the WG should be reluctant to
adopt tidy literal semantics. Quite so.

> Patrick:
>> Most importantly, the tidy/untidy issue is as much about the model theory
>> as it is about datatypes
> 
> that's true. To punt on this would require more care.
> 
> 
> It won't have escaped your attention that I have been a steadfast advocate
> of untidy literals. As a result, at Cannes, I was in a minority that got
> outvoted. After this, I tried hard to work out what were the minimal changes
> needed to the WG's position to make it possible for me to no longer object.
> One of the possible changes then, for me, was to drop the global idiom.
> 
> Now, you argue (and I agree) that the tables have turned. However, I still
> see a minority with strongly felt opinions. My understanding of the
> consensus process is that we should avoid steam-rollering minorities with
> strong opinions lest they be right. (Not that they are in this case :)) A
> minority (particularly a minority of more than one) should be accommodated as
> much as possible.

While I'm very sympathetic to that view, at some point, we need to make
a decision (if we are not going to punt on datatyping, and I don't see
any reason to do so) and it may very well be that there is not unanimous
consensus.

But the stake-in-the-ground proposal modified for untidy syntax/semantics
(i.e. the previous version of it) will be a good solution for RDF
datatyping, even if not everyone's preference, and there is always the
opportunity to fix or refine datatyping in conjunction with RDF 2.0.
The untidy solution based on rdfs:range semantics fits well with our
current charter by not introducing major extensions or changes to RDF.
The tidy solution simply punts on datatyping IMO since most usage employs
the inline idiom.
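
To make the rdfs:range idea concrete, here is a small sketch under
stated assumptions: triples are plain string tuples, and the
LEXICAL_TO_VALUE table and the literal_value function are hypothetical
names invented for this illustration. It shows only the general shape
of property-bound datatyping, not the exact semantics of the
stake-in-the-ground proposal.

```python
# Hypothetical lexical-to-value mappings, keyed by datatype name.
LEXICAL_TO_VALUE = {"xsd:integer": int, "xsd:string": str}


def literal_value(triples, prop, lexical):
    """Interpret an inline literal via the rdfs:range asserted for its property."""
    ranges = [o for (s, p, o) in triples if s == prop and p == "rdfs:range"]
    for datatype in ranges:
        if datatype in LEXICAL_TO_VALUE:
            return LEXICAL_TO_VALUE[datatype](lexical)
    # No known datatype asserted for this property: leave the string as-is.
    return lexical


graph = {
    ("ex:age", "rdfs:range", "xsd:integer"),
    ("ex:jenny", "ex:age", "10"),
}
age = literal_value(graph, "ex:age", "10")
```

Here the inline "10" is read as an integer because the range of
ex:age says so, and the same string under a property with no asserted
range stays an uninterpreted string.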

We agreed at the Bristol f2f that there was one final decision to make
for the last remaining open issue -- whether we'd have tidy or untidy
literals -- and chose to send an inquiry to the community for input.

The explicit input to the inquiry is clearly in favor of untidy literals.
The majority vote at the f2f was in favor of untidy literals.

I see no reason to not ratify the choice for untidy literals and close
the last remaining open issue.

> I tried to suggest that no global idiom would be more acceptable than a
> global idiom that was unacceptable to at least three members of the WG, and
> some other members of the community.

True, for some members of the WG and community, but I don't think for
the majority of the WG or community. No global idiom is just punting
on datatyping since most folks already use and want to use the inline
idiom with property-bound datatyping, but need datatyping explicit in
the RDF. 

> Pat's second RDF Model Theory document did have a solution to the meaning of
> literals that was vague enough to satisfy everyone a bit (and nobody
> completely?).
> 
> http://www.w3.org/TR/2002/WD-rdf-mt-20020214/
> 
> Key features were:
> 
> - literals were syntactically untidy
> - no constraints were put on the mapping from literals to their values
> - the range of the XL mapping was not necessarily a subset of the set IR
> "Note that no particular relationship is assumed between IR and LV."
> 
> Since in that MT the entailment from
> 
> <jenny> <age> "10" .
> 
> to
> 
> <jenny> <age> _:x .
> 
> does not hold, it has actually remained fairly neutral on the truth or
> falsity of the (A) questions. Moreover, the (D) question can't be asked
> without a global idiom.
> 
> 
> Thus to make a more concrete proposal for just having a local idiom:
> 
> + use the Valentine's day MT
> + use the local idiom
> + for each datatype the value space is a subset of IR (this ensures that
> bNodes can match datatyped values)
> 
> For someone who wishes to view this with tidy literal semantics, they are
> free to restrict their view to interpretations in which XL is so restricted.
> 
> Is anyone interested in this?

Well, those that work with closed systems might be happy with that, but
how does that work in an unconstrained global context?

If the interpretation can differ from system to system, how does that
achieve precision and consistency of meaning in the SW?

If my interpretation is that the inline literal denotes the datatype
value, I expect that everyone else that uses my knowledge will employ
the same interpretation -- or at least know what my interpretation is.

Even taking untidy literal semantics as default -- one can still
employ a tidy interpretation by simply using string comparison alone
to test equality and disregarding the datatyping information available.

If one always wants to treat "10" as equal to "10" regardless of
asserted datatyping, they can -- but it remains clear what the
asserted datatyping was, rather than remaining a mystery which
interpretation the system producing the statements was
employing -- tidy versus untidy.

Jeremy, we all want to see closure on this, but please let's not deviate
from what has already been agreed to date. We *are* making progress,
both towards a complete solution and to consensus. Let's stick with
the plan agreed to at the f2f and make that final decision regarding
tidy vs. untidy and get things wrapped up. Eh?

Cheers,

Patrick 

--
               
Patrick Stickler              Phone: +358 50 483 9453
Senior Research Scientist     Fax:   +358 7180 35409
Nokia Research Center         Email: patrick.stickler@nokia.com
Received on Wednesday, 31 July 2002 04:58:11 UTC