RE: new datatyping proposal from Jan Grant on 2002-08-08 (w3c-rdfcore-wg@w3.org from August 2002)

From: Jan Grant <Jan.Grant@bristol.ac.uk>
Date: Thu, 8 Aug 2002 14:21:08 +0100 (BST)
To: Patrick.Stickler@nokia.com
cc: melnik <melnik@db.stanford.edu>, w3c-rdfcore-wg <w3c-rdfcore-wg@w3.org>
Message-ID: <Pine.GSO.4.44.0208081411170.3661-100000@mail.ilrt.bris.ac.uk>
On Thu, 8 Aug 2002 Patrick.Stickler@nokia.com wrote:

>
> > Jenny --ageYears--> int_5
> >
> > I(xsd:integer) = {I(int_0), I(int_1), ... }
>
> To have to define a lexical grammar for the constants of
> typed literals which further have a lexical grammar for
> the lexical form portion is not practical at all.
>
> The graph syntax representation of a typed literal should
> be the fusion of the URI denoting the datatype and the
> lexical form, the lexical grammar of which is defined
> by that datatype.
>
> While I consider this to be a tangent from what the WG
> should be focusing on regarding datatyping, I'm not opposed
> to the idea of having typed literals as an additional
> atomic element of the graph syntax -- *but* its representation
> should not require any further specification beyond the URI
> identity of the datatype and the lexical grammar of the
> datatype itself.

No; lexical grammar of datatypes belongs in parsers and serialisers. The
tokens - magically created by mathematical whim - belong in the graph
syntax, which shouldn't be too closely tied to lexical representation.
(IMHO)
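
As a rough illustration of that split (a sketch only, not part of either proposal): the parser applies the datatype's lexical grammar once and hands the graph an opaque token. The names Token, parse_typed_literal and serialise_token are made up for this example, and Python's int() is only a stand-in for the real xsd:integer lexical grammar.

```python
from dataclasses import dataclass

XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"


@dataclass(frozen=True)
class Token:
    """Opaque graph-level constant: a datatype and a value, no lexical form."""
    datatype: str
    value: object


def parse_typed_literal(datatype: str, lexical: str) -> Token:
    """Parser side: apply the lexical grammar, then discard the lexical form."""
    if datatype == XSD_INTEGER:
        # int() is a stand-in; it is more lenient than the xsd:integer grammar.
        return Token(datatype, int(lexical))
    raise ValueError(f"no lexical mapping known for {datatype}")


def serialise_token(token: Token) -> str:
    """Serialiser side: choose one lexical form for the token."""
    if token.datatype == XSD_INTEGER:
        return str(token.value)
    raise ValueError(f"no lexical mapping known for {token.datatype}")


# Different lexical forms collapse to the same graph token.
a = parse_typed_literal(XSD_INTEGER, "5")
b = parse_typed_literal(XSD_INTEGER, "05")
```

Here a and b compare equal, so the graph never needs to know that one of them was written "05".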

> > It is the intent that at least some of these data types (like
> > integers,
> > floats and strings) correspond to data types provided by programming
> > languages and storage systems, so as to allow for efficient
> > storage and
> > retrieval of RDF.
>
> Nothing prevents that with the previous proposals. This is a
> trivial implementational issue that has no relevance to the
> abstract graph syntax.
>
> It is to be expected that APIs and other RDF applications will
> provide abstractions of datatyped values based on their idiomatic
> expression in the graph. So what. That is true for all of the
> proposals that have been on the table thus far. The benefit
> asserted here is an illusion.

I disagree. There is (in my opinion, again) a distinct advantage with
this proposal, because the type of a literal is atomically tied to the
literal itself. This requires no idiomatic interpretation between a
triple store and the view presented to the user of an api. While having
types expressed as property arcs mitigates this somewhat, I've had a
stab at implementing the datatyping idiom from the previous proposal and
it's a hell of a fag (this is despite having support for hiding all
sorts of extra information on bnodes).

The best route I could see for the older proposal was to add a slew of
extra calls to an API. This isn't something I'm averse to, however,
because "grovelling around at the triple level" is not particularly
productive (again, in my opinion).
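
To make the difference concrete, here is a minimal sketch comparing the two shapes in a toy triple store (a set of tuples). The bnode idiom shown, with an rdf:value arc and an rdf:type arc, is an assumption made for illustration; the exact shape of the earlier idiom is not reproduced in this message. The ex: names and both helper functions are hypothetical.

```python
XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDF_VALUE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#value"

# Atomic typed literal: the object already carries its datatype.
atomic_graph = {("ex:Jenny", "ex:ageYears", (XSD_INTEGER, 5))}

# Multi-node idiom (assumed shape): the type and lexical form hang off a bnode.
idiom_graph = {
    ("ex:Jenny", "ex:ageYears", "_:b1"),
    ("_:b1", RDF_VALUE, "5"),
    ("_:b1", RDF_TYPE, XSD_INTEGER),
}


def age_atomic(graph, subject):
    """One lookup: the value comes straight off the arc."""
    for s, p, o in graph:
        if s == subject and p == "ex:ageYears":
            return o[1]
    return None


def age_idiom(graph, subject):
    """Three lookups: find the bnode, then its lexical form and its type."""
    for s, p, o in graph:
        if s == subject and p == "ex:ageYears":
            lexical = next((o2 for s2, p2, o2 in graph
                            if s2 == o and p2 == RDF_VALUE), None)
            dtype = next((o2 for s2, p2, o2 in graph
                          if s2 == o and p2 == RDF_TYPE), None)
            if dtype == XSD_INTEGER and lexical is not None:
                return int(lexical)
    return None
```

Both functions return 5 for ex:Jenny, but the second is the kind of triple-level grovelling an API would have to hide behind extra calls.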

> Whether the application interns
> based on typed literal constants, multi-node idioms, or any
> other representation, it's all the same, and no alternative
> is significantly better or worse than any other.

> > 2. Concrete syntaxes
> > --------------------
> >
> > In the RDF/XML syntax, non-string literals are encoded in accordance
> > with the XML Schema spec as
> >
> > <propName xsi:type="URI">XML content</propName>
>
> Again, I understand xsi:type to be tied to XML Schema datatypes,
> not to arbitrary datatypes, therefore adoption of this term
> constitutes treading on other folks' front lawn.
>
> Likewise, if it is the case that 'int_5' is of rdf:type xsd:integer,
> i.e. that it denotes the value five, then we can just as well
> use rdf:type instead of xsi:type.

Certainly if literals ever get to sit on the blunt end of arcs, there
will be something like

	int(5) rdf:type xsd:integer .
	xsd:integer rdfs:subClassOf rdfs:Literal .
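
A small sketch of what those two triples would buy, assuming the usual RDFS subclass rule (if x has type C and C is a subclass of D, then x has type D). The node names are plain strings and types_of is a hypothetical helper, not any existing API.

```python
RDF_TYPE = "rdf:type"
RDFS_SUBCLASS_OF = "rdfs:subClassOf"

triples = {
    ("int(5)", RDF_TYPE, "xsd:integer"),
    ("xsd:integer", RDFS_SUBCLASS_OF, "rdfs:Literal"),
}


def types_of(node, graph):
    """Return the asserted types of node plus all their superclasses."""
    found = {o for s, p, o in graph if s == node and p == RDF_TYPE}
    frontier = list(found)
    while frontier:
        cls = frontier.pop()
        for s, p, o in graph:
            if s == cls and p == RDFS_SUBCLASS_OF and o not in found:
                found.add(o)
                frontier.append(o)
    return found


inferred = types_of("int(5)", triples)
```

With the two triples above, inferred contains both xsd:integer and rdfs:Literal.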

> > RDF/XML parsers provide callbacks that allow generating a compact
> > internal representation of literals that correspond to data types
> > provided by programming languages and storage systems (e.g., integers,
> > floats and strings). Similarly, the serializers provide callbacks for
> > encoding such literals in RDF/XML.
>
> These are all simply implementational issues that can also be
> done for any of the proposed datatyping idioms. We are not tasked
> to say how RDF parsers and APIs are to be implemented. There
> are many ways to optimize the internal storage of RDF expressed
> knowledge.

I'd like to hear or see a proposal for the datatyping idiom that does
this, but this isn't really necessarily WG material. Maybe on
rdf-interest?


> This proposal is not offered as a solution to any fatal flaw
> in the local datatyping mechanisms outlined in the latest
> datatyping WD (the stake-in-the-ground proposal), and IMO
> is a less optimal solution than the present stake-in-the-ground
> proposal.

I find it conceptually simpler, and easier to sell to people.

> It also fails to address global typing and the relationship of
> datatyping to RDF types in general.

Agreed (I think); there's more to fill out in this regard.


-- 
jan grant, ILRT, University of Bristol. http://www.ilrt.bris.ac.uk/
Tel +44(0)117 9287088 Fax +44 (0)117 9287112 RFC822 jan.grant@bris.ac.uk
There's no convincing English-language argument that this sentence is true.
Received on Thursday, 8 August 2002 09:22:50 UTC