
Re: language-tagged literal datatypes

From: Richard Cyganiak <richard@cyganiak.de>
Date: Mon, 29 Aug 2011 17:03:01 +0100
Cc: public-rdf-wg@w3.org
Message-Id: <36421CAF-9BBF-4EC5-83A7-E7D8C82164B1@cyganiak.de>
To: antoine.zimmermann@insa-lyon.fr
On 29 Aug 2011, at 16:14, Antoine Zimmermann wrote:
> owl:real or rdf:LangString have an empty lexical space? All right, no problem, but we have to fix the definition of datatype then.

[[
The lexical space of a datatype is a set of Unicode [UNICODE] strings.
]]
http://www.w3.org/TR/rdf-concepts/#section-Datatypes

There is nothing that says it must be non-empty.

> Still, there is another problem if the lexical space is empty. If such is the case, then the interpretation of a lang-tagged literal is not defined according to the definition of interpretation of typed literal, since the interpretation of a typed literal must be given by the L2V mapping. So we end up with an exception for the interpretation of lang-tagged literal.

Quoting my proposal from further up in the thread:

>>>>> B1y: We keep the RDF 2004 definition of datatypes. rdf:LangString is
>>>>> a datatype with empty lexical space and empty L2V mapping. The value
>>>>> of a literal is given by the L2V mapping only if its datatype is not
>>>>> rdf:LangString. The value of an rdf:LangString literal is the
>>>>> tuple <lexicalform, langtag>.

It takes care of the issue you describe.
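
As a rough illustration of how B1y reads operationally, here is a minimal sketch. The names `literal_value`, `L2V` and `empty_l2v` are made up for this example and come from no spec or library. The rdf:LangString IRI is the working name used in this thread. Python's `int()` only approximates the xsd:integer L2V mapping.

```python
from typing import Callable, Dict, Optional

# Working name from this thread; assumed IRI, not taken from a published spec.
RDF_LANGSTRING = "http://www.w3.org/1999/02/22-rdf-syntax-ns#LangString"
XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"


def empty_l2v(lexical_form: str) -> object:
    """L2V mapping of a datatype whose lexical space is empty: nothing maps."""
    raise ValueError("lexical space is empty; no lexical form is valid")


# Hypothetical registry: datatype IRI -> L2V mapping.
L2V: Dict[str, Callable[[str], object]] = {
    XSD_INTEGER: int,           # approximation of the xsd:integer mapping
    RDF_LANGSTRING: empty_l2v,  # empty lexical space, empty L2V mapping
}


def literal_value(lexical_form: str, datatype: str,
                  langtag: Optional[str] = None) -> object:
    """Value of a literal under B1y."""
    if datatype == RDF_LANGSTRING:
        # The one special case: the value is the tuple <lexicalform, langtag>;
        # the (empty) L2V mapping is never consulted.
        if langtag is None:
            raise ValueError("rdf:LangString literal needs a language tag")
        return (lexical_form, langtag)
    # Every other datatype: the value is given by the L2V mapping.
    return L2V[datatype](lexical_form)


print(literal_value("foo", RDF_LANGSTRING, "en"))
print(literal_value("42", XSD_INTEGER))
```

The rdf:LangString branch is the only place where the L2V mapping is bypassed; every other datatype goes through its mapping as in 2004.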

> So in the end, we wanted to get rid of the exceptional case of "plain literals" but we actually keep it.

We wanted to get rid of the exception that there is no datatype IRI that one can use to refer to the set of language-tagged strings. The proposal removes that exception.

> So in the end, we are not improving the situation, we just give the illusion that it is more uniform by merely naming the thing "datatype", but in truth it is still a completely different kind of literal, both syntactically and semantically.

Syntactically it is a different kind of literal. Semantically it is the same as any other literal – a value that is in the value space of a datatype.

> For those reasons, I advocate either a solution that really sticks to the 2004 definition, or Pat's solution.

AFAICT it *does* really stick to the 2004 definition. What am I missing?

Richard



> (Sorry if this sounds a bit aggressive, I'm not actually that vehement about this issue).
> 
> 
> Regards,
> AZ
> 
> 
> 
> 
> On 29/08/2011 16:11, Pat Hayes wrote:
>> I agree with Andy. The key points about a datatyped literal are that (1) it displays the datatype IRI and (2) this datatype provides an unambiguous way to determine the syntactic correctness of the literal and, if it is correct, a way to compute its value. All the rest of the machinery is simply there to serve this purpose. Cleaving to the current mathematical/semantic rules is good to avoid unnecessary disruption, but they should not be considered inviolate. In this spirit, I now think Richard's 2b option to be the best way to proceed, and we will just swallow the oddity of rdf:LangString not having a conventional L2V mapping. In all other respects, this retains exactly what we all need and desire. It is the simplest and least disruptive option.
>> 
>> Pat
>> 
>> 
>> On Aug 29, 2011, at 8:57 AM, Andy Seaborne wrote:
>> 
>>> 
>>> 
>>> On 29/08/11 09:09, Antoine Zimmermann wrote:
>>>> Richard,
>>>> 
>>>> On 27/08/2011 18:29, Richard Cyganiak wrote:
>>>>> Hi Antoine,
>>>>> 
>>>>> Your summary omits the option that I've been advocating recently:
>>>>> 
>>>>> On 26 Aug 2011, at 16:22, Antoine Zimmermann wrote:
>>>>>> B. What if it is a datatype?
>>>>>> 
>>>>>> If it's a datatype, rdf:LangString must have exactly all the
>>>>>> characteristics that all datatypes have. It must follow the
>>>>>> definition. Now, the definition can either be kept as is (as in RDF
>>>>>> 2004) or modified. B1: If we keep the RDF 2004 definition, I think
>>>>>> we can't really do any better than what the rdf:PlainLiteral spec
>>>>>> is doing. The difference is that we do not need to deal with
>>>>>> untagged plain literals, so the lexical form is quite
>>>>>> straightforward: "foo@en"^^rdf:LangString would be the abstract
>>>>>> syntax of "foo"@en.
>>> 
>>> str("foo"@en) is already "foo"
>>> 
>>>>> 
>>>>> B1y: We keep the RDF 2004 definition of datatypes. rdf:LangString is
>>>>> a datatype with empty lexical space and empty L2V mapping. The value
>>>>> of a literal is given by the L2V mapping only if its datatype is not
>>>>> rdf:LangString. The value of an rdf:LangString literal is the
>>>>> tuple <lexicalform, langtag>.
>>>> 
>>>> This is inconsistent. You cannot give a formal definition of datatypes,
>>>> then have an instance of datatypes that does not follow the definition.
>>>> It's like saying that prime numbers are numbers that are only divisible
>>>> by 1 and by themselves and that 9 is a prime number divisible by 1, 3
>>>> and 9. It cannot be, it's broken.
>>>> 
>>>> If you really want to keep the definition as of 2004, then
>>>> rdf:LangString must /either/ have a non-empty lexical space and an L2V
>>>> mapping /or/ it is not a datatype. There is no other possibility.
>>> 
>>> owl:real is an existing counterexample to the 2004 definition. "The owl:real datatype does not directly provide any lexical forms."
>>> 
>>> The "non-empty lexical space" restriction seems to be artificial. Empty, and some other way to form values, works just as well albeit with specific rules for each datatype.
>>> 
>>> 	Andy
>> 
>> ------------------------------------------------------------
>> IHMC                                     (850)434 8903 or (650)494 3973
>> 40 South Alcaniz St.           (850)202 4416   office
>> Pensacola                            (850)202 4440   fax
>> FL 32502                              (850)291 0667   mobile
>> phayesAT-SIGNihmc.us       http://www.ihmc.us/users/phayes
>> 
>> 
>> 
>> 
>> 
>> 
> 
> 
> -- 
> Antoine Zimmermann
> Researcher at:
> Laboratoire d'InfoRmatique en Image et Systèmes d'information
> Database Group
> 7 Avenue Jean Capelle
> 69621 Villeurbanne Cedex
> France
> Tel: +33(0)4 72 43 61 74 - Fax: +33(0)4 72 43 87 13
> Lecturer at:
> Institut National des Sciences Appliquées de Lyon
> 20 Avenue Albert Einstein
> 69621 Villeurbanne Cedex
> France
> antoine.zimmermann@insa-lyon.fr
> http://zimmer.aprilfoolsreview.com/
> 
Received on Monday, 29 August 2011 16:03:30 GMT
