Re(buttal): Why I cannot live with S from Sergey Melnik on 2002-01-25 (w3c-rdfcore-wg@w3.org from January 2002)

From: Sergey Melnik <melnik@db.stanford.edu>
Date: Fri, 25 Jan 2002 11:39:57 -0800
To: Patrick Stickler <patrick.stickler@nokia.com>
CC: RDF Core <w3c-rdfcore-wg@w3.org>
Message-ID: <3C51B48D.64642600@db.stanford.edu>
Patrick Stickler wrote:
> 
> OK, here are the reasons why I cannot live with S, in no particular
> order:
> 
> 1. Local and global idioms are not compatable in the same knowledge base
> without having to resort to a duality of ontolgies, one for local idiom
> and one for global idiom. Cohabitatino of local and global idioms
> is IMO not just a desiderada, but a requirement.

Proposal S [1] suggests three idioms to choose from. In S-A and S-P,
global and local typing can merrily coexist, so I don't see why the
above counts as an cannot-live-with argument. S does not mandate that
all three idioms are required to be accepted for use simultaneously by
the same application.

> 2. Allows definition of alternate notations in lexical forms (e.g. octal)
> which are not supported by the actual datatype. I consider this to
> violate the precision of the datatype definition and a bug. It also
> means that there are two means to subclass a lexical datatype, either
> by reference to a new, subordinate datatype or by localized definition
> of (unsupported) lexical notations. TDL has only one single means of
> doing so, which respects the boundaries of definition of the datatype
> creator. With S, applications (and users) must know not only datatypes
> but ontologies of notational variant properties (octal, decimal, inKg,
> inOz, etc.) rather than, as with TDL, just datatypes.

Of course, each limitation can be twisted so that is looks like a
feature. I believe, it is much easier to arrive at a stable definition
of say integers (which has remained the same for centuries) than to find
an appropriate lexical representation (roman/arabic, decimal/octal
etc.). I see great value in being able to migrate to new encodings
gracefully, without breaking the existing applications. In TDL, you'd
have to use different formats for old and new systems. This is the same
as saving your text document with MS Word 2000 and not being able to
read it with MS Word 95. That's a horror scenario.

> 3. Requires four (4) URIs for each datatype rather than just one. How
> can we ask every authority that has already defined datatypes and given
> them URI identity to now go back and mint three more URIs so that RDF
> can use them?! TDL has already shown that one URI is sufficient.

S-A and S-P can live with a single URI. In [1], different URIs were
introduced for clarity, so that a precise distinction can be made
whether lexical spaces, value spaces, or mappings are used for typing in
a given idiom.

> 4. Requires users to understand the MT to understand which URI
> variant to use in which idiom (*.lex, *.map, *.val, or *) rather
> than addressing such distinctions in the MT alone, as TDL does.
> TDL allows you to talk about those different components, but it
> does not require you to do so simply to denote the type of a
> literal so that some application knows which value it corresponds
> to.

Same as reply for (3).
 
> 5. Requires additional clarification regarding semantics and use
> of rdfs:subPropertyOf relation between properties which convey
> datatyping.

This is untrue. Semantics of rdfs:subPropertyOf remains exactly as
defined in the model theory draft.

> 6. Use of rdfs:subPropertyOf relation between datatyping property
> and non-datatyping property is valid in the absence of domain
> constraints but not valid if domain constraints defined.
> 
> E.g.
> 
>    #Bob ex:age "30" .
>    foo:integer rdfs:range xsd:integer.lex .
>    ex:age rdfs:subPropertyOf foo:integer .
> 
> implies that "30" is a member of the lexical space of
> xsd:integer, and this is correct and (even arguably)
> intuitive and efficient, avoiding the need for
> an anonymous node idiom.
> 
> But with the additional constraint
> 
>    foo:integer rdfs:domain xsd:integer.val .
> 
> which is valid for idioms using anonymous nodes, e.g.
> 
>    #Sue ex:age _:1 .
>    _:1 foo:integer "30" .
> 
> where _:1 is then inferred to denote a member of
> the value space of xsd:integer, it also means that,
> when these two graphs are merged, #Bob, from the
> example above is *also* inferred to be a member of
> the value space of xsd:integer, and thus a perfectly
> valid use becomes an invalid use.

What a discovery! If you happen to believe that cats are dogs you can
hardly expect consistency. The use that you suggested above is simply
incorrect...

> 7. Provides no additional expressive power over TDL
> yet requires significantly more machinery and deeper
> understanding of the model by users, and is not as
> compatable with current idioms as is TDL. Per Occam's
> razor, TDL is the simpler and better choice.

That sounds more like a plug rather than an argument. I think it has
been made sufficiently clear over the past several month, which of the
proposals requires "significantly more machinery and deeper
understanding of the model by users" ;)

> 8. The S model, particularly literal labeled nodes
> participating in graph tidying, precludes any later
> adoption of a P+ treatment, which is a more ideal
> local idiom than D given the consistency of representation
> of local and global typing in the graph; I.e. with P+
> the literal object node is not shifted to an anonymous
> node as with D, it just stays where it is and the local
> type is specified for the literal node directly
> via rdf:type -- elegant, efficient, and worth keeping
> as a desirable future option. S would prevent this
> option forever.

Talking about esthetics, what about this: in TDL, if you write

  _1 rdf:value "bla"

then _1 and "bla" denote exactly the same thing. Do you consider this
elegant?
 
> In short, S is too complex, too different from present
> usage, able to result in conflicting knowledge on
> merge that is valid separately, and too burdensome
> from the practical point of URI management to be
> acceptable.

Another plug... What about this: "In short, TDL is too complex, breaks
most existing applications, APIs, querying, storage, ..."? ;)
 
> If S were the only option, I'd still not choose to
> use it. I'd look for some other KR solution that dealt
> with datatyping in a more economical and user friendly
> manner. But since there is another option, TDL, which
> has none of the above faults yet meets all of the
> specified desiderada, the choice for TDL is obvious.

As a developer, I don't have an option. With TDL-RDF, all of my own
applications in mediation, model management, backend storage etc. would
technically become non-RDF applications, or applications "formally known
as complying with the deprecated RDF 1.0" ;) I don't see any practical
benefit in migrating all existing code to TDL, and I might not be a
single implementor out there.

To reiterate, I have nothing against the rdf:type/rdf:value idiom used
by TDL. In fact, I find it not bad at all, and it works perfectly fine
within the framework of the current model theory as illustrated by idiom
P in [1]. However, I strongly object against adopting untidy literals.

-- Sergey

[1] http://www-db.stanford.edu/~melnik/rdf/datatyping/
Received on Friday, 25 January 2002 14:10:20 UTC