- From: Sergey Melnik <melnik@db.stanford.edu>
- Date: Tue, 21 May 2002 16:34:29 +0200
- To: RDF Core <w3c-rdfcore-wg@w3.org>
At the last telecon we briefly discussed the issue related to the semantics of literals. Per F2F decision, the literals have three components (unicode string, language tag, and a bit). This representation may not be the best. Here are several concerns: (1) Interpretation It is unclear what the literals represent. It seems that a literal can denote a) a character string b) a word in a natural language c) an XML tree d) an abstract structure that consists of a string, a tag, and a bit. Choice d) seems ugly if we think of RDF as a foundation for the SW. If we go for a)-c), then the literals become polymorphic... Furthermore, defining rules for comparing trees and words seems counterproductive. (2) Extensibility The language tags keep evolving. How do we accommodate new language encoding schemes gracefully? The current XML standard may be surpassed. How do we indicate what particular XML encoding or canonical form (or maybe a completely different graph-like structure) is used? In short, I think that we might be doing a bad job on literals. I'm afraid that additional difficulties may arise in datatyping (e.g., we might need to deal with XML trees in lexical spaces of datatypes). BTW, did TimBL and DanC, the original issue raisers, finally take a position to the F2F decision (comp. [1])? Unfortunately, I missed that F2F. A cleaner solution might be/have been to leave literals as strings and to use bNodes with special properties for representing words and XML structures. Sergey [1] http://lists.w3.org/Archives/Public/w3c-rdfcore-wg/2002Apr/0410.html
Received on Tuesday, 21 May 2002 10:34:31 UTC