Proposal: Revising RDF literals and rdf:value from Dieter Köhler on 2003-04-05 (www-rdf-interest@w3.org from April 2003)

From: Dieter Köhler <dieter.koehler@philo.de>
Date: Sat, 05 Apr 2003 17:22:29 +0200
To: www-rdf-interest@w3.org
Message-Id: <5.1.1.6.0.20030405123543.01f97ec0@pop3.philo.de>
A Proposal for Revising RDF Literals and rdf:value


The proposal to revise the RDF syntax, which I am outlining here, is 
somewhat fundamental.  In the current stage of the process of completing 
the RDF specification it is perhaps too late to apply such a large 
modification.  Nevertheless I would appreciate it if somebody from the RDF 
working group could comment on it.


1. Motivation

First I want to focus on two details of RDF which are syntactically 
independent, but are related by the fact that they provide solutions for 
similar problems: The rdf:value property and RDF literals.  Both are 
intended to relate a value-type pair of a property to a subject.  In the
case of RDF literals the language identifier and the datatype URI can be
seen as a shortcut for a dc:Language and an rdf:type relation between the
lexical form and the language identifier or the datatype URI 
respectively.  In the case of rdf:value the [RDF Primer] specification says 
that it contains the "main value of a structured value" so one must refer 
to "additional properties to identify the classification scheme or other 
information that further describes the value". However it is also stated 
that "RDF does not associate any particular meaning with it. rdf:value is 
simply provided as a convenience for use in these commonly-occurring 
situations." ([RDF Primer], sec. 4.4).

Because of the lack of internal meaning of the rdf:value property, I am 
afraid that wide use of it will diminish RDF's ability to provide a
framework for interoperability between graphs from different origins and 
thus threaten the goal of building up a meaningful global semantic web. 
Evidence for this fear is provided by what is said in [RDF Semantics], sec. 
3.2.4: "Since the range of possible uses for rdf:value is so wide, it is 
impossible to give a precise model-theoretic statement which covers all the 
intended meanings or use cases. Users are cautioned, therefore, that the 
use of rdf:value is somewhat risky, and that it should be treated as a 
'blank' piece of RDF syntax whose meaning in any particular case should be 
defined by the user, and may vary from application to application. In 
practice, the intended meaning is often clear from the context, but may be 
lost when graphs are merged or when conclusions are inferred."

As far as I can see there are two basic reasons why rdf:value cannot have a 
particular meaning: a) There is no syntactic means to tell which property 
or which properties are essential for the classification scheme, and b) 
there is also no semantic means to assert that b.1) there exists a
corresponding classification and b.2) the corresponding classification 
remains intact when merging graphs or inferring conclusions. -- Or in other 
words: RDF was designed to be very flexible so that statements could be 
added or removed from existing graphs to construct an open semantic 
web.  But sometimes a property of a subject is essential and is used, 
philosophically speaking, as a differentia specifica to identify the 
subject, so that it is not advisable to alter it or remove it from a graph.

The question is whether it is possible to fix this problem on the level of 
RDF syntax without introducing overly complex rules.


2. Proposal

What I want to suggest is to modify the RDF syntax so that

  - RDF literals consist only of two components: a value and a data type,
  - the required data type of an RDF literal is always a URI reference, the 
optional value can be an RDF literal or a string,
  - RDF literals can appear as subjects, predicates or objects of a statement,
  - Ordinary RDF URI references used to characterize the subject, predicate 
or object of a triple are considered to be a special case of an RDF literal 
of the data type of the specified URI reference and an empty value (nil or 
null or an empty string).

In the following I will use the notation {<valueURI><dataTypeURI>} or 
{'valueString'<dataTypeURI>} to represent such an RDF literal.
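
As an illustration only, the structure proposed above could be sketched in
Python as follows.  The class name Lit, its field names and the notation()
method are hypothetical and not part of the proposal; the sketch also
assumes that an ordinary URI reference (empty value) is written without
braces.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class Lit:
    """Hypothetical model of the proposed RDF literal."""

    datatype: str                                  # required: a URI reference
    value: Optional[Union[str, "Lit"]] = None      # string, nested literal, or empty

    def notation(self) -> str:
        """Render the literal in the {value<datatype>} notation."""
        if self.value is None:
            # Empty value: the special case of an ordinary URI reference.
            return f"<{self.datatype}>"
        if isinstance(self.value, str):
            inner = f"'{self.value}'"
        else:
            inner = self.value.notation()
        return "{" + inner + f"<{self.datatype}>" + "}"
```

For instance, Lit("map:representedBy", Lit("language:de")).notation()
renders as {<language:de><map:representedBy>}, and
Lit("xsd:string", "wahr").notation() renders as {'wahr'<xsd:string>}.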

As a consequence the mapping table for datatype mappings can be described 
in RDF itself.  This makes it possible to provide additional information about the
mapping elements, for example:

<myConcept:True> {<language:de><map:representedBy>} {'wahr'<xsd:string>} .
<myConcept:True> {<language:en><map:representedBy>} {'true'<xsd:string>} .
<myConcept:True> {<language:it><map:representedBy>} {'vero'<xsd:string>} .
{'true'<xsd:boolean>} {<map:representedBy>} <myConcept:True> .

Since RDF literals fulfill exactly the requirement to provide a stable 
connection between a value and its type, at least one essential 
characteristic of a resource can be represented. This makes rdf:value 
superfluous; and since RDF literals can be the subject of a triple, 
language identifiers can be deprecated too and replaced by an additional
triple:

{xxx<rdf:XMLLiteral>} <dc:Language> <languageIdURI>

(xxx signifies a normalized XML document. It is of course possible to use 
the N-Triple notation of language strings from [RDF Test Cases] as a 
shortcut. Note also that language identifiers are not essential in the
sense introduced above, which is also acknowledged by [RDF Semantics], sec. 
4.3: "language tags play no role in the meaning of a typed literal".)
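
A rough sketch of how an N-Triples style language string might be turned
into a literal plus a separate dc:Language triple is given below.  The
function name detach_language, the use of xsd:string as the datatype of a
plain string, and the <language:...> form of the language URI are
assumptions made for illustration; escape sequences inside the string are
not handled.

```python
import re
from typing import Tuple

Triple = Tuple[str, str, str]

# N-Triples style language string, e.g. "wahr"@de (no escape handling).
_LANG_STRING = re.compile(r'^"(?P<text>.*)"@(?P<lang>[a-z]+(?:-[a-z0-9]+)*)$')


def detach_language(lang_string: str) -> Tuple[str, Triple]:
    """Split a language string into a literal and a dc:Language triple."""
    match = _LANG_STRING.match(lang_string)
    if match is None:
        raise ValueError(f"not a language string: {lang_string!r}")
    # Assumption: a plain string is typed as xsd:string.
    literal = "{'" + match.group("text") + "'<xsd:string>}"
    # Assumption: language identifiers are named by <language:...> URIs.
    triple = (literal, "<dc:Language>", "<language:" + match.group("lang") + ">")
    return literal, triple
```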


3. Example

Here is one more example of how the proposed RDF Literal can be used to
express facts about datatype-value pairs. Note that the value of an RDF 
Literal appearing as a predicate can be seen as an equivalent to an adverb 
in natural languages. So here is the example:

The claim that "100 Kg is very likely a heavy weight" can be expressed as:

{<numeral:100><myClass:Kg>} {<probability:90%><myRdfsClone:member>} 
{<fuzzy:heavy><myClass:Weight>} .
<probability:90%> <rdfs:member> {<numeral:.9><myClass:Probability>} .
{<numeral:.9><myClass:Probability>} <rdfs:member> 
{<fuzzy:high><myClass:Probability>} .

I do not claim that it is impossible to express this claim in ordinary RDF,
but I consider it much more complicated and leave the exercise to the
reader ...


4. Semantics

It is possible to map each instance of the proposed RDF Literal into an 
ordinary URI reference by a standardized method (for example by taking the
notation described above, escaping any reserved characters and adding the
prefix 'http://w3.org/').  So the proposed RDF Literal can be seen as a 
special notation for ordinary RDF URI references.  This shows that all the 
semantics for RDF and RDFS, corresponding entailments and inference rules 
as described in [RDF Semantics] remain valid.  (Special inference
rules, etc. for the proposed RDF Literals might of course be added.)
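
One possible reading of such a mapping is sketched below in Python.  The
function names are hypothetical, and the choice of percent-encoding every
reserved character is an assumption; only the prefix is taken from the
text above.

```python
from urllib.parse import quote, unquote

PREFIX = "http://w3.org/"  # the prefix suggested in the text


def literal_to_uri(notation: str) -> str:
    """Map a literal in {value<datatype>} notation to an ordinary URI."""
    # safe="" percent-encodes every character outside letters, digits and "_.-~".
    return PREFIX + quote(notation, safe="")


def uri_to_literal(uri: str) -> str:
    """Recover the literal notation from a URI produced by literal_to_uri."""
    if not uri.startswith(PREFIX):
        raise ValueError(f"not a mapped literal: {uri!r}")
    return unquote(uri[len(PREFIX):])
```

Because percent-encoding is reversible, two distinct literals never map to
the same URI under this sketch.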


5. Restrictions

The proposed syntax allows the declaration of not more than one essential 
property.  For some use cases it may be necessary to have more than one 
essential property available.  For this purpose the proposed syntax for RDF 
Literals could be extended by allowing an arbitrary number of data
types.  Whether and how this should be done is a subsequent question which 
needs only to be discussed if my proposal finds some support at all.


Dieter Köhler

Institute of Philosophy and
Center for Multimedia Studies
University of Karlsruhe
Germany
Received on Saturday, 5 April 2003 10:21:39 UTC