RE: ACTION 2001-11-02#02: Datatyping use-cases from CC/PP from Patrick.Stickler@nokia.com on 2001-11-07 (w3c-rdfcore-wg@w3.org from November 2001)

From: <Patrick.Stickler@nokia.com>
Date: Wed, 7 Nov 2001 15:23:45 +0200
To: bwm@hplb.hpl.hp.com
Cc: phayes@ai.uwf.edu, w3c-rdfcore-wg@w3.org
Message-ID: <2BF0AD29BC31FE46B788773211440431621704@trebe003.NOE.Nokia.com>
> -----Original Message-----
> From: ext Brian McBride [mailto:bwm@hplb.hpl.hp.com]
> Sent: 07 November, 2001 14:43
> To: Stickler Patrick (NRC/Tampere)
> Cc: phayes@ai.uwf.edu; w3c-rdfcore-wg@w3.org
> Subject: Re: ACTION 2001-11-02#02: Datatyping use-cases from CC/PP
> 
> Patrick.Stickler@nokia.com wrote:
> [...]
> 
> > Well, I'm presuming that if no rdfs:subClassOf statements
> > were encountered which define that relation prior to 
> > testing the range constraint, that one could not know for
> > sure that the type absolutely was not a subtype, ever,
> > but insofar as the knowledge present at hand, it would not 
> > be and the test of the constraint should fail.
> 
> 
> Or you could just infer that the value must be of that type as well.

Danger. Danger. 

One might, but not in any *commercial* system where data 
integrity is paramount. I want to *know* that the value
is a decimal encoding of an integer before I treat it as
such. If it might be hexadecimal, or even binary (e.g.
"10101010": is that decimal or binary?), I'm not going
to use it.

And if I don't have enough information to know what it is
or should be, then that's equivalent to being invalid. No?
Or at the very least "undefined".
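
To make the ambiguity concrete, here is a minimal sketch in Python
(chosen only for illustration; the variable names are mine and belong
to no proposal). It reads the same lexical form under two notations
and gets two different values, which is why the form alone cannot
tell you the type:

```python
# The same lexical form interpreted under two different notations.
lexical = "10101010"

as_decimal = int(lexical, 10)  # read the digits as decimal notation
as_binary = int(lexical, 2)    # read the digits as binary notation

print(as_decimal)  # 10101010
print(as_binary)   # 170
```

Nothing in the string itself says which reading is intended; that
has to come from type information associated with the literal.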

> Validation is kinda tricky in RDFS.  If I understand it 
> correctly, RDFS has no 
> means of expressing negation, so can't express a contradiction.
> 
> You can define validation to mean that there is specific 
> confirmation that all 
> domain and range constraints are met.  I guess that is what 
> you had in mind. 

Yes.

> RDFS does not currently define any concept of validation.

Perhaps not a formal one, no, but hopefully one can test,
at any given point in time, if all constraints for a given
subgraph are satisfied. Otherwise, what's the point of
defining constraints?

> > If I later load some schema that says that foo:bar is a 
> > subclass of e.g. xsd:decimal, great, now the constraint is
> > satisfied and I know both that the value is a valid xsd:integer 
> > (since xsd:decimal is a subClassOf xsd:integer) and I know that 
> > the lexical form follows decimal notation. OK, now I can use it.
> > 
> > It's all about having the information needed to interpret
> > the lexical form. Some of that information is local. Some is
> > in the schema. Both are needed.
> 
> 
> In the case of the P proposal that is true, because RDF/XML 
> does not allow 
> literals as subjects.  It is not true in the case of the S or 
> X proposals which 
> permit the information needed to represent a lexical form to be
> represented without 
> a schema.
> 
> It really helps understand these points if we are specific 
> about which proposal 
> we are talking about.

No, not at all. The same is true for all three proposals. You
must have range constraints defined in the schemata to know
what the value types must be, and you must have locally defined
types to know what the value types are, and you likely must have
schemata which define relations between the data type classes. 
And whether the local types belong to the type inheritance chain 
(or tree) defined by the schemata determines if the constraints
are satisfied.
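
As a rough sketch of the test I have in mind (Python, purely
illustrative; the function names, the dictionary representation of
rdfs:subClassOf statements, and the class foo:baz are all assumptions
of mine, not part of P, S, or X):

```python
def superclasses(cls, sub_class_of):
    """Return cls plus every class reachable through the
    rdfs:subClassOf statements known so far."""
    seen = {cls}
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in sub_class_of.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def range_satisfied(local_type, required_type, sub_class_of):
    """True only if present knowledge confirms the constraint.
    A missing local type, or no known path to the required type,
    counts as not satisfied (or at least undefined)."""
    if local_type is None:
        return False
    return required_type in superclasses(local_type, sub_class_of)


# No subclass statements loaded yet: the test fails.
known = {}
before = range_satisfied("foo:bar", "xsd:integer", known)

# Later, a schema is loaded that relates foo:bar to xsd:integer
# (foo:baz is a hypothetical intermediate class).
known["foo:bar"] = {"foo:baz"}
known["foo:baz"] = {"xsd:integer"}
after = range_satisfied("foo:bar", "xsd:integer", known)

# A literal with no locally associated type never passes.
untyped = range_satisfied(None, "xsd:integer", known)

print(before, after, untyped)  # False True False
```

The result depends only on the knowledge at hand when the test is
run, so the same value can fail before a schema is loaded and pass
afterwards.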

All three proposals only have to do with how type is associated
with literal values, not *whether* type should be associated.

My assertion is that, regardless of how type is associated with
literal values, if it is *not* associated locally, then range 
constraints are meaningless. It is not possible to use range 
constraints to determine the type of literals reliably, because 
the lexical form plays no part in that determination, and the 
result may be conflict and error.

So there are really two separate issues here:

1. How do we associate type with literals in a way that is
   sufficiently persistent that we don't lose that information
   during processing in RDF space (e.g. inferred bindings)
   and which minimizes graph real estate?

   There are (at least) three proposals: P, S, and X

2. Does rdfs:range have a descriptive purpose at all, or is
   it only prescriptive in the presence of locally defined
   types?

   This issue is the same for all three proposals.

Cheers,

Patrick
Received on Wednesday, 7 November 2001 08:24:16 UTC