Re: Datatyped tagged literals: a case for option 4 vs option 2d

On 26/09/11 09:50, Antoine Zimmermann wrote:
> All,
> I'd like to discuss here 2 options for lang tagged literals, viz.,
> option 2d (Richard's proposal) and option 4 (one datatype with non-empty
> lexical space).
> Richard's solution consists in adding a datatype with empty lexical
> space, and defining the abstract syntax and semantics of these literals in
> an ad hoc fashion.
> I don't have a problem with having an empty lexical space but I find
> issues in that proposal because I find that it does not make things more
> uniform than previously, with the only exception that tagged literals
> would have a datatype.
> - First, syntactically, tagged literals are an exception to the standard
> typed literals, since it would be inconsistent to write anything of the
> form "xxx"^^rdf:LangString.
> - Second, semantically, tagged literals are an exception too, since
> standard tagged literals are normally interpreted according to the L2V
> mapping, which is empty in this case.
> - Third, in SPARQL, the DATATYPE keyword would have to be redefined with
> an exception, since currently SPARQL says nothing about typed literals
> which cannot be written in the form "xxx"^^dt. I especially don't like
> when the RDF working group imposes a change to another WG's
> specification, especially at such a late stage.

In SPARQL, under option 2d,

DATATYPE("xyz"@en) would be rdf:LangString

because RDF says so.  There is no SPARQL exception.

> In contrast, I find that making rdf:LangString a "normal" datatype with
> a non-empty lexical space makes everything more uniform (option 4).
> - Syntactically, xxx@lll would simply be a shortcut for the abstract
> syntax "xxx@lll"^^rdf:LangString. The only exceptional feature would be
> that we recommend the concrete syntax xxx@lll, but we already made such
> an exception for "xxx"^^xsd:string, which we recommend to write "xxx".
> - Semantically, tagged literals would be interpreted as standard typed
> literals through the L2V mapping.
> - In terms of SPARQL specs, DATATYPE(xxx@lll) would be rdf:LangString,
> as required already by SPARQL without any change (since it is in fact
> the same as DATATYPE("xxx@lll"^^rdf:LangString)).

This is not accurate:

1/ Currently (SPARQL 1.0, SPARQL 1.1 LC) DATATYPE("xxx"@lll) is an error,
so there is a change.

2/ DATATYPE("xxx"@lll) would be rdf:LangString simply because RDF says
it is, whichever of option 2d or option 4 is chosen, just as
DATATYPE(123) is xsd:integer in the abstract model.

> Now, responding to a concern raised by Andy who said that no matter the
> option chosen, tagged literals are made "special" in some way [1]. I do
> not think so. Option 4 makes lang tags part of the lexical form, such
> that language must be accessed by parsing the literal. That's how
> information from a literal should always be accessed. For instance, time
> zone, year, hour, date in xsd:dateTimeStamp are obtained by parsing the
> lexical form. Same for exponent in xsd:float, same for any component of
> any typed literals. Why should it be different for lang tagged strings?
> In terms of pure specs, I think option 4 is much more elegant and easy.
> However, I understand that there are practical issues with option 4: in
> SPARQL, STR(xxx@lll) should send back "xxx@lll" instead of "xxx", unless
> an exception is added to SPARQL.

So there is a special exception in SPARQL for option 4.  This makes it 
very disruptive.

And how would you get the lexical form of a literal in SPARQL if you did 
want it?
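
To make the concern concrete, here is a rough Python sketch (not SPARQL) of what option 4 seems to imply. The names TypedLiteral, option4_literal, sparql_str and split_option4 are made up for illustration, and the datatype is kept as the plain string "rdf:LangString". It assumes the lexical form is the text, then "@", then the tag, as in the quoted mail.

```python
from typing import NamedTuple, Tuple


class TypedLiteral(NamedTuple):
    """Hypothetical model of a typed literal: lexical form plus datatype."""
    lexical_form: str
    datatype: str


def option4_literal(text: str, lang: str) -> TypedLiteral:
    # Assumption: under option 4 the language tag is part of the lexical form.
    return TypedLiteral(f"{text}@{lang}", "rdf:LangString")


def sparql_str(lit: TypedLiteral) -> str:
    # STR returns the lexical form unchanged, so the tag comes along with it.
    return lit.lexical_form


def split_option4(lexical_form: str) -> Tuple[str, str]:
    # Language tags cannot contain '@', so split on the last '@'.
    text, _, lang = lexical_form.rpartition("@")
    return text, lang


lit = option4_literal("xyz", "en")
print(sparql_str(lit))                 # xyz@en
print(split_option4(sparql_str(lit)))  # ('xyz', 'en')
```

So every application that wants "xyz" back has to do its own parsing step, or SPARQL has to special-case STR.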

SPARQL treats literals in the abstract syntax: there are three aspects 
(they don't have to be independent) each with an accessor:

lexical form : STR
lang tag : LANG
datatype : DATATYPE

SPARQL 1.1 adds constructors for terms: STRDT, STRLANG.
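
As an illustration only, the three aspects and their accessors can be sketched in Python as below. The Literal class is a hypothetical model, not any real API; datatypes are kept as prefixed-name strings, and DATATYPE follows option 2d by answering rdf:LangString for a language-tagged literal.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Literal:
    """Hypothetical abstract-syntax literal with three aspects."""
    lexical_form: str
    lang: Optional[str] = None
    datatype: Optional[str] = None


def STR(lit: Literal) -> str:
    # Accessor for the lexical form.
    return lit.lexical_form


def LANG(lit: Literal) -> str:
    # Accessor for the language tag; empty string when there is none.
    return lit.lang or ""


def DATATYPE(lit: Literal) -> str:
    # Assumption (option 2d): a language-tagged literal reports rdf:LangString.
    if lit.lang:
        return "rdf:LangString"
    # A plain literal without a tag is treated as xsd:string.
    return lit.datatype or "xsd:string"


def STRDT(lexical_form: str, datatype: str) -> Literal:
    # Constructor: literal with a datatype.
    return Literal(lexical_form, datatype=datatype)


def STRLANG(lexical_form: str, lang: str) -> Literal:
    # Constructor: literal with a language tag.
    return Literal(lexical_form, lang=lang)


tagged = STRLANG("xyz", "en")
print(STR(tagged), LANG(tagged), DATATYPE(tagged))  # xyz en rdf:LangString
```

Here STR gives back "xyz" with no parsing, and the tag and datatype are available separately.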

> I would not mind getting this, but I
> understand that this can be unpleasant to some people. Other problems
> exist wrt APIs, and this may have consequences on existing
> implementations. I am not sure to what extent this is causing trouble.

This would be a huge amount of trouble.

Are there any RDF systems that would not be affected?

> All in all, I would not be against Richard's "minimal" proposal since it
> does not imply dramatic changes and could be integrated quite smoothly,
> but I still have a strong preference for option 4.
> [1]


Received on Monday, 26 September 2011 10:18:33 UTC