
Re: language-tagged literal datatypes

From: Pat Hayes <phayes@ihmc.us>
Date: Fri, 19 Aug 2011 11:00:05 -0500
Cc: "public-rdf-wg@w3.org Group WG" <public-rdf-wg@w3.org>
Message-Id: <6057830A-7E81-4E57-92B9-910986E95AA0@ihmc.us>
To: Richard Cyganiak <richard@cyganiak.de>
So let me get this clear. This exceptional datatype associates the lexical form <string, tag> to the identical value <string, tag>, but its L2V mapping is not the identity map, because it doesn't have an L2V mapping? This seems completely insane to me, and I don't quite know how I would justify it to a cynical reader, but whatever. There really is no practical difference for users between 2 and 2b, so at this point we are arguing about theoretical elegance rather than about anything that actually matters. 

On Aug 19, 2011, at 8:28 AM, Richard Cyganiak wrote:

> On 19 Aug 2011, at 00:11, Pat Hayes wrote:
>> Option 2. All literals have a type. rdf:LangString is a special datatype whose L2V mapping takes a pair of strings as input and returns a language-tagged pair as output. This mapping is the identity mapping on pairs <string, tag>, just as xsd:string is the identity mapping on single strings. DATATYPE("foo"@en) returns rdf:LangString, following the normal rules for datatyping.
> There's also 2b:
> All literals have a type. rdf:LangString is a special type, where the lexical form is <string,langtag> rather than just a string, and it doesn't have an L2V mapping. The value of an rdf:LangString literal is the same as the lexical form. DATATYPE("foo"@en) returns rdf:LangString, following the normal rules.
> (The advantage of 2b versus 2 is that the L2V mechanism can remain unchanged. It can remain defined as functions from string to value, rather than functions from anything to value as required by 2. In 2, the L2V of rdf:LangString is just the trivial identity mapping anyways, and resorting to the L2V mapping device just to explain a no-op mapping is overkill.)

Well, it's not exactly *hard* to say that mapping <a,b> to <a,b> is an identity mapping.
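
To make the comparison concrete, here is a minimal Python sketch of the two options. It is an illustration only: the names Datatype, LangString, value_of, LANGSTRING_OPT2 and LANGSTRING_OPT2B are hypothetical and come from no RDF spec or library, and language-tag case normalization is ignored.

```python
from typing import Callable, NamedTuple, Optional


class LangString(NamedTuple):
    """A <string, tag> pair, used both as lexical form and as value."""
    text: str
    tag: str


class Datatype(NamedTuple):
    """Hypothetical datatype record: an IRI plus an optional L2V mapping."""
    iri: str
    l2v: Optional[Callable[[object], object]]


def identity(lexical: object) -> object:
    """The trivial L2V mapping: the value is the lexical form itself."""
    return lexical


# xsd:string maps each single string to itself.
XSD_STRING = Datatype("xsd:string", identity)

# Option 2: rdf:LangString has an L2V mapping, the identity on pairs.
# This requires L2V mappings to accept inputs other than single strings.
LANGSTRING_OPT2 = Datatype("rdf:LangString", identity)

# Option 2b: rdf:LangString has no L2V mapping at all; the value is
# simply declared to be the same as the lexical form.
LANGSTRING_OPT2B = Datatype("rdf:LangString", None)


def value_of(datatype: Datatype, lexical: object) -> object:
    """Return the value denoted by a literal under either option."""
    if datatype.l2v is None:
        # Option 2b special case: no mapping, value equals lexical form.
        return lexical
    return datatype.l2v(lexical)


foo_en = LangString("foo", "en")
same = value_of(LANGSTRING_OPT2, foo_en) == value_of(LANGSTRING_OPT2B, foo_en)
```

Under these assumptions both variants yield the same value for every literal, which is the sense in which there is no practical difference for users. The only difference is whether value_of needs the special-case branch (2b) or the mapping's input type is widened (2).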

> (2b also makes it easy to re-write the rdf:PlainLiteral spec into a spec titled “An L2V mapping for rdf:LangString” that just defines an L2V mapping that takes "foo@en" to <"foo","en">, while keeping the current restrictions on use of such lexical forms. So I'd hope it would be an easier sell to the OWL/RIF WGs.)
>> option 2: + simplifies literal syntax + removes SPARQL errors + theoretically clean -- requires change to the datatyping model
> option 2b: + simplifies literal syntax + removes SPARQL errors + no changes to datatyping model -- introduces one exceptional datatype that works differently from all others
>> If we say that the L2V mapping takes as input all the syntactic 'components' of a literal, rather than forcing these to be all inside one string, then we allow such things as literals with latitude and longitude denoting positions, complex numbers with real and imaginary parts, etc., without forcing people to invent coding tricks (like the trailing '^' in rdf:PlainLiteral) to artificially map these into a single string. This might be a genuinely useful extension, in other words.
> Being able to express lat/long pairs and complex numbers in the abstract syntax isn't really useful if you have no way of writing them down in a concrete syntax. So you either still need to squish them into a single string, or extend your RDF syntax of choice with additional syntactic sugar for expressing that kind of literal.

Right, but after living through the rdf:PlainLiteral mess, I think that inventing a new literal syntax is easy compared to the problems which arise when you have to force everything into a single string. And I bet that others will agree, later. After all, RDF parsers right now manage to handle "foo"@"en" without barfing, so it is *possible* to have more than one string in a literal syntax. But, as I say, whatever...
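
As an illustration of what forcing a pair into a single string involves, here is a small Python sketch of a single-string encoding in the style discussed in this thread, where "foo@en" stands for <"foo","en">. The function name plain_literal_l2v is hypothetical. Splitting at the last '@' and treating an empty tag as a plain string are assumptions based on my reading of the rdf:PlainLiteral convention, not a normative implementation.

```python
from typing import Tuple, Union


def plain_literal_l2v(lexical: str) -> Union[str, Tuple[str, str]]:
    """Decode a single-string lexical form such as "foo@en".

    Assumption: the last '@' separates the text from the language tag,
    since the text itself may contain '@' but a language tag may not.
    """
    text, sep, tag = lexical.rpartition("@")
    if not sep:
        # No '@' anywhere: not a valid lexical form under this convention.
        raise ValueError("lexical form must contain '@': %r" % lexical)
    if tag == "":
        # Empty tag: the value is just the string, with no language.
        return text
    return (text, tag)


tagged = plain_literal_l2v("foo@en")
untagged = plain_literal_l2v("foo@")
tricky = plain_literal_l2v("phayes@ihmc.us@en")
```

Here tagged is ("foo", "en"), untagged is "foo", and tricky is ("phayes@ihmc.us", "en"). The last case shows the kind of care a single-string encoding demands: the separator can also occur inside the text, so the decoder has to know which occurrence counts.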

>> We can also quietly deprecate rdf:PlainLiteral along with 8-track tape players.
> A major motivation for rdf:PlainLiteral is the desire to stick <string,langtag> pairs into a single string, so I'm afraid it won't be quite as easy.

Sad, if true :-)


> Best,
> Richard

IHMC                                     (850)434 8903 or (650)494 3973   
40 South Alcaniz St.           (850)202 4416   office
Pensacola                            (850)202 4440   fax
FL 32502                              (850)291 0667   mobile
phayesAT-SIGNihmc.us       http://www.ihmc.us/users/phayes
Received on Friday, 19 August 2011 16:00:48 UTC
