ACTION 2001-08-02#12 : EASEL's datatypes and the literal issue

From: Jan Grant <Jan.Grant@bristol.ac.uk>
Date: Mon, 17 Sep 2001 21:53:35 +0100 (BST)
To: RDFCore Working Group <w3c-rdfcore-wg@w3.org>
Message-ID: <Pine.GSO.4.31.0109172130180.8724-100000@mail.ilrt.bris.ac.uk>

Preamble: EASEL (or the bit that I've done) approaches the problem of
cross-schema searching using the "obvious" method (ie, 20 years of
database practice): it defines precise "semantic units", or core schema
elements, and the various search targets declare their supported schemas
in terms of arrangements of those elements.

The problem we found was that we had a general classification of things
into two types: things which we could slap a resource (URI) on to
identify, and "Literals" for everything else. Thus, we needed an
approach to datatypes in RDF.

As I mentioned in a previous telecon, there are two attitudes you can
take towards literals:


1. They are Unicode strings (plus optional language tag, etc.)

If you take this view then you're left with some slightly odd constructs
like:

	<a> <foo:blah> _:a .
	_:a <rdf:type> <foo:Baz> .
	_:a <rdf:value> "bletch" .

(which I think might be labelled "the DAML approach" simply because
that's what DAML does; in fact, if you take this view, there's little
else you have available as an option).

This is, in my opinion, a perfectly defensible point of view.


2. The other view, which is the one I took for EASEL: Literals are
values, not their representations.

"Literal" is possibly a bad term, because what crops up _in the model_
as a "Literal" I consider to be a value, not the _representation_ of a
value**. What you see in the RDF/XML serialisation is the
_representation_ of that literal. In other words, I tend to draw a big
box around the

	_:a <rdf:type> <xsd:Date> .
	_:a <rdf:value> "2001-09" .

and call all of that a literal.
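
A minimal sketch of that "big box", assuming triples are held as plain
Python tuples of strings and blank nodes are named with a "_:" prefix.
The TypedValue and fold_proxy_nodes names are hypothetical, chosen for
illustration only; nothing here is EASEL's actual code. Each blank node
that carries both an rdf:type and an rdf:value is folded into a single
value, and the proxy node's own triples are dropped.

```python
from typing import NamedTuple


class TypedValue(NamedTuple):
    """Hypothetical stand-in for 'a literal as a value'."""
    datatype: str  # e.g. "xsd:Date"
    lexical: str   # e.g. "2001-09"


def fold_proxy_nodes(triples):
    """Fold each blank node that has rdf:type + rdf:value into one TypedValue."""
    types = {s: o for s, p, o in triples
             if p == "rdf:type" and s.startswith("_:")}
    values = {s: o for s, p, o in triples
              if p == "rdf:value" and s.startswith("_:")}
    # Only nodes that have both a type and a value are treated as literals.
    folded = {n: TypedValue(types[n], values[n])
              for n in types.keys() & values.keys()}

    result = []
    for s, p, o in triples:
        if s in folded:
            continue  # drop the triples hanging off the proxy node itself
        result.append((s, p, folded.get(o, o)))
    return result


triples = [
    ("a", "foo:blah", "_:a"),
    ("_:a", "rdf:type", "xsd:Date"),
    ("_:a", "rdf:value", "2001-09"),
]
folded_triples = fold_proxy_nodes(triples)
```

With the three triples above, the only statement left is the one about
<a>, and its object is the typed value rather than a proxy node.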

Frank Manola asked "is '2001-09' an arithmetic expression or a date?";
well, it could be the serialised form of either type of literal. How to
tell them apart?

In EASEL, we had sufficient schema declarations such as*

	<easel:Date> <rdfs:subClassOf> <rdfs:Literal> .

to be able to tell what kind of literal we were expecting during
parsing. I didn't include much of a mechanism for specifying what the
syntax of a conforming literal representation looked like (that's a job
for things like XML Schema); instead, these declarations just stated the
type relationships for literals.

Then, because I knew the range of <foo:blah> was <easel:Date>, I knew
that when I saw

	<rdf:Description rdf:about="...">
		<foo:blah>2001-09</foo:blah>
	</rdf:Description>

the value inside the <foo:blah> tags should be interpreted as (turned
into, used to create an object of that type, etc.) a date.
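
The range-driven interpretation described above can be sketched as
follows. This is an illustration under stated assumptions, not EASEL's
implementation: the foo namespace URI, the RANGES and CONSTRUCTORS
tables, the YearMonth type and the interpret function are all
hypothetical. The parser looks up the declared range of each property
and uses it to pick a constructor for the element content; properties
with no declared range fall back to plain strings.

```python
import xml.etree.ElementTree as ET
from typing import NamedTuple

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
FOO_NS = "http://example.org/foo#"  # assumed namespace for foo:


class YearMonth(NamedTuple):
    """Hypothetical value type for dates written as YYYY-MM."""
    year: int
    month: int


def parse_year_month(text):
    year, month = text.strip().split("-")
    return YearMonth(int(year), int(month))


# Stand-in for the schema: property -> declared range.
RANGES = {FOO_NS + "blah": "easel:Date"}
# Range class -> function that turns a representation into a value.
CONSTRUCTORS = {"easel:Date": parse_year_month}


def interpret(xml_text):
    """Yield (subject, property, value) with values built from the range."""
    root = ET.fromstring(xml_text)
    for desc in root.iter("{%s}Description" % RDF_NS):
        subject = desc.get("{%s}about" % RDF_NS)
        for prop in desc:
            prop_uri = prop.tag.replace("{", "").replace("}", "")
            # No declared range: keep the representation as a string.
            make = CONSTRUCTORS.get(RANGES.get(prop_uri), str)
            yield (subject, prop_uri, make(prop.text or ""))


DOC = """<rdf:RDF xmlns:rdf="%s" xmlns:foo="%s">
  <rdf:Description rdf:about="http://example.org/a">
    <foo:blah>2001-09</foo:blah>
  </rdf:Description>
</rdf:RDF>""" % (RDF_NS, FOO_NS)

statements = list(interpret(DOC))
```

Here the object of the single statement is a YearMonth value, not the
string "2001-09"; the same text under a property whose range was some
arithmetic type would be handed to a different constructor.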

Thus, a "literal" in RDF terms might be a Unicode string, a date, a
number, a Java object, a fragment of XML, etc.


Finally: If you don't have such range declarations to help you, I
suppose you could do something like

	<rdf:Description rdf:about="...">
		<foo:blah rdf:parseType="xsd:Date">2001-09</foo:blah>
	</rdf:Description>

... but I have my tongue half in my cheek here.


In summary: this is a slightly unorthodox approach but it lets me think
about and deal with real values, not their printstrings and proxy nodes.


jan

PS. Goes without saying: I _like_ the idea of "structured" literals.
(ie, the notion of a literal "having a language" isn't a broken one, I
think). That feeling is what (mis?)informed the approach I had with
EASEL.

* xsd:Date didn't exist when I did this originally

** not _necessarily_ the representation of a value, that is.

-- 
jan grant, ILRT, University of Bristol. http://www.ilrt.bris.ac.uk/
Tel +44(0)117 9287088 Fax +44 (0)117 9287112 RFC822 jan.grant@bris.ac.uk
"My army boots contain everything not in them." - Russell's pair o' Docs.
Received on Monday, 17 September 2001 16:56:38 EDT
