RE: Datatyping: moving away from "literal as 3-part thing" to "literal as dt+opaque bit"

> -----Original Message-----
> From: ext Jan Grant [mailto:Jan.Grant@bristol.ac.uk]
> Sent: 02 September, 2002 17:15
> To: RDFCore Working Group
> Subject: Datatyping: moving away from "literal as 3-part thing" to
> "literal as dt+opaque bit"
> 
> 
> 
> OK, I've already said this in the DT review, but at the last telecon I
> was asked to expand on this a bit.
> 
> The current DT proposal document (at my last review of it) takes the
> view that a literal (that is, a typed literal) is of the form:
> 
> 	(datatype, unicode string, language, xml? bit)
> 
> [Patrick said, at the telecon, "xml:lang infects everything" as an
> example of this view]
> 
> I would much rather take the following view: literals carry 
> their type;
> RDF "inherits" some literal types from its early incarnations (some of
> which may or may not be half-baked), and RDF/XML provides syntactic
> sugar to express literals of these types. There should be no 
> "infection"
> of new types by stuff like language properties, xml:base, etc.
> 
> That is,
> 
> 	<eg:property xml:lang="en">fiddle</eg:property>
> 
> is simply syntactic sugar for what is *effectively*
> 
> 	<eg:property 
> rdf:BLAHtype="xxx:langstring">("fiddle","en")</eg:property>
> 
> and typed literals should simply be considered to be the pair
> 
> 	(datatype, opaque)
> 
> where in this case, the opaque bit would be the pair 
> ("fiddle", "en) and
> the datatype would be URI of "language-tagged unicode strings".

This is the view taken by the present restructured document. I.e.

Section 1.5 defines literals (all literals, typed or untyped) as
3-tuples, having string, xmlbit, and lang.

Section 2.3 defines a typed literal as a pairing of datatype
and a literal, where the structure of the literal is opaque to
the datatyping machinery and semantics (even if available to
applications for other purposes).

I will try to make this clearer.

> That is, move away from a DT document that enshrines the cruft of
> previous specs, while having a legal and consistent way of thinking
> about it; and reject the notion that "xml:lang" values infect every
> datatype (which is not impossible - it's the realm of the 
> parser writer
> to sort that out); 

Sorry. That's not what I meant. My fault for using the overly
broad term "everything". I meant that xml:lang infects every 
*literal*.

xml:lang has no affect whatsoever on any datatype semantics and
any specified xml:lang values for a literal are irrelevant to
the datatyping machinery.

> similarly, reject the notion that non-xml typed
> literals *must* carry an xml:base.
>
> In fact, I'd say that xml:base might well be inherited by xml 
> literals,
> but by nothing else. We've support for those kinds of literals in the
> RDF/XML syntax already.

My understanding was that non-xml literals are not in any way
affected by xml:base because they are not XML, but err literal
strings.

And how xml:base is applied to XML literals is a matter of how
parseType="Literal" is handled by the parser and how that XML
literal is mapped to a canonical XML representation. Right?

> 
> In summary: I'd just like to see a cleaning up of the *conceptual*
> notion of a literal, so the older baggage (lang tagging, etc) doesn't
> become centrally written into the MT and so on. The notion of typed
> literals gives us an "out" in this regard: RDF/XML introduces some
> "built-in" types to RDF, together with an XML shorthand for 
> representing
> them.

I'm sorry, but I'm not entirely sure what you mean by "built-in" types
here.

I don't think it is useful to treat xml:lang attribution the same
as datatyping, since it exists in addition to datatyping. Rather,
we should just leave it as residual information from the RDF/XML
serialization which some applications may wish to use, but otherwise,
it has nothing to do with datatyping in any way and xml:lang
values do not assert some kind of "built-in" datatype.

Cheers,

Patrick

Received on Tuesday, 3 September 2002 04:04:08 UTC