Re: language-tagged literal datatypes from Richard Cyganiak on 2011-08-19 (public-rdf-wg@w3.org from August 2011)

From: Richard Cyganiak <richard@cyganiak.de>
Date: Fri, 19 Aug 2011 19:12:27 +0100
To: Pat Hayes <phayes@ihmc.us>
Cc: "public-rdf-wg@w3.org Group WG" <public-rdf-wg@w3.org>
Message-Id: <699E3715-578F-4D28-B770-A865A5EB8346@cyganiak.de>

On 19 Aug 2011, at 17:00, Pat Hayes wrote:
> So let me get this clear. This exceptional datatype associates the lexical form <string, tag> to the identical value <string, tag>,

Yes, although the <string, tag> pair should perhaps be called something other than a “lexical form”.

Strawman text:

“A language-tagged string consists of a lexical form that is a Unicode string, and a language tag. Its datatype IRI is rdf:LangString. Unlike in regular typed literals, no lexical-to-value mapping is associated with this datatype IRI. The value of a language-tagged string is a tuple consisting of the lexical form and the language tag: <lexicalForm, languageTag>.”
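
To make the strawman concrete, here is a rough sketch in Python. It is only an illustration, not proposed spec text: the class name LanguageTaggedString and its attribute names are made up for this example, and the datatype IRI is written as the prefixed name used above.

```python
from dataclasses import dataclass
from typing import Tuple

# Datatype IRI as named in the strawman text (prefixed form, for illustration).
RDF_LANG_STRING = "rdf:LangString"


@dataclass(frozen=True)
class LanguageTaggedString:
    """Hypothetical model of a language-tagged string."""
    lexical_form: str   # a Unicode string
    language_tag: str   # a language tag such as "en" or "fr"

    @property
    def datatype_iri(self) -> str:
        # Every language-tagged string has the same datatype IRI.
        return RDF_LANG_STRING

    @property
    def value(self) -> Tuple[str, str]:
        # No lexical-to-value mapping is consulted here:
        # the value is simply the pair <lexicalForm, languageTag>.
        return (self.lexical_form, self.language_tag)


chat = LanguageTaggedString("chat", "fr")
```

In this sketch, chat.value is the tuple ("chat", "fr"), so two language-tagged strings with the same lexical form but different tags have different values.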

> but its L2V mapping is not the identity map, because it doesn't have an L2V mapping?

Yes.

RDF Concepts has this to say about datatypes:

* The lexical space of a datatype is a set of Unicode [UNICODE] strings.
* The lexical-to-value mapping of a datatype is a set of pairs whose
  first element belongs to the lexical space of the datatype […].
* RDF may be used with any datatype definition that conforms to this
  abstraction […].

This definition would have to be changed to accommodate approach 2. The lexical space would have to be broadened to contain … what? Everything? Unicode strings and <Unicode string, language tag> pairs? No change is required for 2b.
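
As a rough sketch of why 2b leaves the datatype abstraction alone, consider the Python below. It is an illustration under assumptions: the L2V registry, the literal_value function, and the use of Python's int and str as stand-ins for the XSD mappings are all invented for this example. Every registered L2V mapping takes a plain Unicode string, and the language-tagged case is handled outside the registry.

```python
from typing import Callable, Dict, Optional

# Hypothetical registry: datatype IRI -> lexical-to-value mapping.
# Each mapping takes a Unicode string, as the abstraction quoted above requires.
# int() and str() are loose stand-ins; int() accepts more than the
# xsd:integer lexical space, which is fine for a sketch.
L2V: Dict[str, Callable[[str], object]] = {
    "xsd:integer": int,
    "xsd:string": str,
}


def literal_value(lexical_form: str,
                  datatype_iri: str,
                  language_tag: Optional[str] = None) -> object:
    """Return the value of a literal (sketch of approach 2b)."""
    if datatype_iri == "rdf:LangString":
        # 2b: no L2V mapping exists for this IRI; the value is the pair itself.
        if language_tag is None:
            raise ValueError("a language-tagged string needs a language tag")
        return (lexical_form, language_tag)
    # Regular typed literal: apply the datatype's L2V mapping to the string.
    return L2V[datatype_iri](lexical_form)
```

Under approach 2, by contrast, "rdf:LangString" would need its own entry in the registry, and that entry's domain would have to include <Unicode string, language tag> pairs rather than strings alone, which is the change to the definition discussed above.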

The spec also says:

“The datatype abstraction used in RDF is compatible with the abstraction used in XML Schema Part 2: Datatypes [XMLSCHEMA-2].”

XML Schema 1.1 Datatypes has this to say:

“In this specification, a datatype has […] a ·lexical space·, which is a set of character strings used to denote the values. […] The lexical space of a datatype is the prescribed domain of ·the lexical mapping· for that datatype.”

I would rather not deviate from the XSD spec if we can avoid it.

> This seems completely insane to me, and I don't quite know how I would justify it to a cynical reader, but whatever.

We can call it “rhubarb mapping” or anything else instead of “L2V mapping” if that seems less insane to you.

> There really is no practical difference for users between 2 and 2b,

Correct.

> so at this point we are arguing about theoretical elegance rather than about anything that actually matters. 

Well, my concern is to come up with words that get us the desired effect and that don't get us in trouble with other WGs.

Looks like you don't really mind whether language-tagged strings use the lexical-to-value mapping device or not. So let's see what Andy says.

Best,
Richard

Received on Friday, 19 August 2011 18:12:56 UTC