- From: David Wood <david@3roundstones.com>
- Date: Wed, 7 Sep 2011 10:49:25 -0400
- To: Sandro Hawke <sandro@w3.org>
- Cc: Pat Hayes <phayes@ihmc.us>, RDF Working Group WG <public-rdf-wg@w3.org>, Ivan Herman <ivan@w3.org>
On Sep 7, 2011, at 10:40, Sandro Hawke wrote:

> [I changed the Subject to separate the thread from the thread about the
> voting process.]
>
> On Tue, 2011-09-06 at 23:10 -0500, Pat Hayes wrote:
>> OK, sorry this is late, but here is my best attempt to summarize the various options for how to handle datatyping of tagged literals. I have tried to be objective and up to date, but feel free to correct any mistakes y'all might still find here. Thanks to Pierre-Antoine and Richard for recent corrections.
>>
>> Throughout, I will illustrate with the literal "foo"@tag. In some cases it is necessary to distinguish this surface syntax from the abstract "real" syntax form. As SPARQL refers to the 'lexical form' of a literal, which has to be a string, to be returned by STR(), I will list what this is in each case.
>>
>> In all cases, the value is the pair <"foo", tag>.
>>
>> 1. Current state: tagged literals have no type.
>>
>> 2. Lexical form is "foo", datatype is rdf:TaggedLiteral. There are various ways to "fix" the spec to make this possible:
>>
>> 2a. Abstract syntax is a pair <"foo", str>, and we modify the RDF datatype definitions to allow an L2V mapping from pairs to pairs. (Pain: major change to specs, possible clash with OWL and XSD specs.)
>> 2b. There is no L2V mapping, and this datatype is anomalous but specified by the RDF semantics directly, and is a datatype by fiat. (Pain: this datatype is anomalous and must not be used with the ^^ syntax.)
>> 2c. The abstract syntax has no lexical form, the datatype is empty and the L2V is the empty mapping. Nevertheless, the value is linked to the present syntax by the RDF semantics directly and this is a datatype by fiat. (Pain: overly elaborate; the idea of an empty datatype is confusing, and having an L2V map which does not specify the actual value is even more confusing :-).) (Positive: the illegality of literals of the form "string"^^rdf:TaggedLiteral falls out automatically.)
>>
>> 3. Lexical form is "foo", datatype is unique to the tag, ie there is one datatype per tag. These are conventional datatypes with a well-defined L2V mapping. Again there are several (well, two) options based on this idea.
>>
>> 3a. We invent an IRI naming convention for these datatypes, eg rdf:taggedLiteral/tag. Then this is the type of the literal. (Pain: inventing this open-ended naming convention.)
>> 3b. These per-tag datatypes are all anonymous and have no IRI, but are sub-datatypes of rdf:TaggedLiteral, which is returned as the type for them all. (Pain: overly elaborate; potentially confusing; need to define a new notion of sub-datatype.)
>>
>> 4. Lexical form is "foo@tag", where tag is required to be nonempty and not contain '@' (just as in the rdf:PlainLiteral spec). This is a conventional datatype (it is rdf:PlainLiteral restricted to nonempty tags) with a conventional L2V mapping. (Pain: might be considered to be the wrong lexical form (??)) (Positive: conforms closely to existing specs; simple; extra tag information might be useful?)
>
> As I read this over, and as I discussed it with colleagues, 3a seems
> like the clear winner to me. The drawback is trivial, where all the
> others have serious drawbacks. Option 4 would be my second choice;
> it's ugly but works.
>
> Option 2 might be worse than Option 1; to put it simply, it seems to be
> making tagged literals be datatyped literals by making up a new,
> different, *non-XML-standard* sort of datatyped literal. That seems
> like a problem, and I'd expect objections from lots of folks to that,
> once they got a good look at it.
>
> Am I missing something?

Unless I am :) RDF has always had one extension mechanism; the XSD datatypes. Option 3 would give us a new space of datatypes, whose number is unbounded and whose form is orthogonal to XSD. That's actually fine with me, but I didn't see it listed as a detriment and perhaps it should be. Certainly it will cause substantial rewording of RDF Concepts.

Regards,
Dave

>
> -- Sandro
>
>> ------
>>
>> On balance, my own vote is for either 2b or 4, and the longer I think about it, the better 4 looks after all. If we choose one of the 2 family, I would plead editorial discretion to be allowed to choose among them depending on which one fits best with the semantics, when we get down to details. They differ only in theoretical issues. Well, OK, I give up on 2a.
>>
>> Pat
>>
>> ------------------------------------------------------------
>> IHMC (850)434 8903 or (650)494 3973
>> 40 South Alcaniz St. (850)202 4416 office
>> Pensacola (850)202 4440 fax
>> FL 32502 (850)291 0667 mobile
>> phayesAT-SIGNihmc.us http://www.ihmc.us/users/phayes
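
For readers comparing the two options that drew the most support in this thread, here is a rough Python sketch of how options 3a and 4 would represent "foo"@tag. It is only an illustration under stated assumptions, not anything the Working Group specified: the `Literal` class, the function names, the expansion of rdf:taggedLiteral/tag into a full IRI, and the datatype IRI used for option 4 (the thread does not name one) are all hypothetical.

```python
from typing import NamedTuple, Tuple

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
# Hypothetical expansion of the "rdf:taggedLiteral/tag" convention from option 3a.
TAGGED_PREFIX = RDF + "taggedLiteral/"
# Hypothetical IRI for option 4's datatype; the thread does not give it a name.
OPTION_4_DATATYPE = RDF + "TaggedLiteral"


class Literal(NamedTuple):
    lexical_form: str  # what SPARQL STR() would return
    datatype: str      # datatype IRI


def _check_tag(tag: str) -> None:
    # Option 4 requires a nonempty tag without '@'; applied to both options here.
    if not tag or "@" in tag:
        raise ValueError("tag must be nonempty and must not contain '@'")


def make_3a(text: str, tag: str) -> Literal:
    """Option 3a: lexical form is the bare string, one datatype per tag."""
    _check_tag(tag)
    return Literal(text, TAGGED_PREFIX + tag)


def value_3a(lit: Literal) -> Tuple[str, str]:
    """L2V for option 3a: the tag is recovered from the datatype IRI."""
    if not lit.datatype.startswith(TAGGED_PREFIX):
        raise ValueError("not a per-tag datatype")
    return (lit.lexical_form, lit.datatype[len(TAGGED_PREFIX):])


def make_4(text: str, tag: str) -> Literal:
    """Option 4: lexical form is 'text@tag', a single datatype for all tags."""
    _check_tag(tag)
    return Literal(f"{text}@{tag}", OPTION_4_DATATYPE)


def value_4(lit: Literal) -> Tuple[str, str]:
    """L2V for option 4: split at the last '@', since the tag cannot contain one."""
    text, sep, tag = lit.lexical_form.rpartition("@")
    if not sep or not tag:
        raise ValueError("lexical form must end in '@' followed by a nonempty tag")
    return (text, tag)
```

Under both sketches the value is the same pair ("foo", "tag"). What differs is the lexical form ("foo" versus "foo@tag") and whether the tag lives in the datatype IRI or in the lexical form, which is the trade-off Pat's summary describes.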
Received on Wednesday, 7 September 2011 14:49:56 UTC