Re: Proposal for ISSUE-12 language-tagged literals from Andy Seaborne on 2011-07-20 (public-rdf-wg@w3.org from July 2011)

From: Andy Seaborne <andy.seaborne@epimorphics.com>
Date: Wed, 20 Jul 2011 23:07:40 +0100
To: public-rdf-wg@w3.org
Message-ID: <4E2751AC.6060206@epimorphics.com>
On 20/07/11 20:53, Richard Cyganiak wrote:
> On 20 Jul 2011, at 19:54, Andy Seaborne wrote:
>> On 20/07/11 19:36, Richard Cyganiak wrote:
>>> On 16 Jul 2011, at 16:52, Andy Seaborne wrote:
>>>> I'd rather make DATATYPE("foo"@en) be honest and say that it
>>>> returns datatype rdf:LangString.
>>>
>>> You cannot do so without a hack.
>>
>> You've lost me.  It puts literals in the RDF graph (old speak:
>> abstract syntax) and they really do have a datatype.
>
> Datatypes currently are the part of a typed literal that determines
> how to get from a lexical form to a value.
>
> What you propose is that datatypes should also be used for certain
> *non-typed* literals, and *without* mapping lexical forms to values.

This would make them typed.  In fact, I don't see where RDF Concepts [1] 
defines plain literals to have a value at all.  It seems to mention 
values only in regard to typed literals.  That makes it wrong to talk 
about the value space of rdf:LangString being <lex, lang>.  Something 
to clean up whatever we decide.

A datatype, in the abstract syntax, is an annotation on literals.  A 
literal is a triple (!pun!) of (lexical form, datatype, lang tag) and 
has a value.  Literals identify values (RDF Concepts).

"""
A typed literal is a string combined with a datatype URI. It denotes the 
member of the identified datatype's value space obtained by applying the 
lexical-to-value mapping to the literal string.
"""

What changes is that a literal denotes a value, without the requirement 
that the value is obtained by L2V (the lexical-to-value mapping), which 
is a parser concept.

So all literals would have a datatype.  No exceptions.
It makes code libraries simpler, and it makes applications that process 
RDF at the abstract syntax level simpler.
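
As a rough illustration of the "no exceptions" rule, here is a minimal 
sketch of a literal as the triple (lexical form, datatype, lang tag).  
All names (Literal, make_literal, datatype_of) are hypothetical, and the 
rdf:LangString IRI simply uses the spelling from this thread; none of 
this is taken from an existing library.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed IRIs: rdf:LangString is spelled as in this thread (a proposal,
# not a settled name); xsd:string is the usual XML Schema IRI.
RDF_LANGSTRING = "http://www.w3.org/1999/02/22-rdf-syntax-ns#LangString"
XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"


@dataclass(frozen=True)
class Literal:
    """A literal as (lexical form, datatype, lang tag); datatype is never None."""
    lexical_form: str
    datatype: str
    lang: Optional[str] = None


def make_literal(lexical_form: str,
                 datatype: Optional[str] = None,
                 lang: Optional[str] = None) -> Literal:
    """Build a literal, filling in the datatype so that every literal has one."""
    if lang is not None:
        # A language tag implies rdf:LangString; any other datatype is rejected.
        if datatype not in (None, RDF_LANGSTRING):
            raise ValueError("a language-tagged literal cannot carry another datatype")
        return Literal(lexical_form, RDF_LANGSTRING, lang)
    # No language tag: default to xsd:string when no datatype is given.
    return Literal(lexical_form, datatype or XSD_STRING, None)


def datatype_of(literal: Literal) -> str:
    """Accessor in the spirit of SPARQL's DATATYPE: no special cases needed."""
    return literal.datatype
```

With this shape, datatype_of(make_literal("foo", lang="en")) yields the 
rdf:LangString IRI, and code that dispatches on the datatype needs no 
separate branch for plain literals.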

If that change is something that will bring grief to apps and system 
providers, let's not do it, but we're already changing it for xsd:string.

[1] http://www.w3.org/TR/rdf-concepts/#dfn-plain-literal

> That's why I call it a hack.

I asked:
 >> What breaks if it is a datatype?

So far, nothing has been shown to break.

The datatype concept as it exists seems to mix mapping and value.

> I'm not saying that this makes it a no-go. But if the hack exists
> only to make DATATYPE("foo"@en) behave more consistently in SPARQL,
> then I'd rather see the hack in SPARQL.

If it were only SPARQL, I'd agree, but this seems to make RDF more 
regular (note - not perfectly regular).

>
>> It then works for RIF and anything else built to work with RDF.
>
> No, unfortunately it doesn't, at least as far as I can tell. They
> actually want to have lexical forms for language-tagged literals, so
> that they can stuff the <string,langtag> pairs into legacy systems
> that don't support language tags. (Or, perhaps closer to the truth,
> so that they can be compatible with RDF's data model in their specs
> without actually supporting language tags in their literal design.)

Actually they can do that: if the lexical forms of rdf:PlainLiteral are 
a superset of the lexical forms of rdf:LangString, it can be defined so 
that rdf:LangString is a derived type (the inverse term to "derived" 
does not seem to be defined in /TR/xpath-datamodel/).
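
To make the subset relationship concrete, here is a small sketch.  It 
assumes the rdf:PlainLiteral convention that a lexical form is the 
string followed by "@" and a possibly empty language tag, split at the 
last "@".  Treating the forms with a non-empty tag as the lexical forms 
of rdf:LangString is an assumption for illustration only; the function 
names are hypothetical.

```python
from typing import Tuple


def split_plain_literal(lexical_form: str) -> Tuple[str, str]:
    """Split an rdf:PlainLiteral-style lexical form at its last '@'.

    Returns (string, lang); lang is "" when there is no language tag.
    """
    text, sep, lang = lexical_form.rpartition("@")
    if not sep:
        # Under the assumed convention every lexical form contains an '@'.
        raise ValueError("not an rdf:PlainLiteral lexical form: no '@' found")
    return text, lang


def is_langstring_form(lexical_form: str) -> bool:
    """True when the form carries a non-empty language tag.

    Under the assumption above, these forms are a strict subset of the
    rdf:PlainLiteral forms, which is what a derived type requires.
    """
    _, lang = split_plain_literal(lexical_form)
    return lang != ""
```

Here "foo@en" passes is_langstring_form while "foo@" does not, although 
both are accepted by split_plain_literal.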

It's making rdf:PlainLiteral a super-datatype of xsd:string that does 
not work.

> Thought experiment: If DATATYPE in SPARQL was called something else
> instead, say, “TYPE” (and it would return some magic constant for
> IRIs and blank nodes), would you still advocate making rdf:LangString
> a datatype instead of a class? If yes, then why?

Yes, if it can be made to work.  DATATYPE is an accessor to that part of 
the literal triple.  All literals would have a datatype in the abstract 
graph.

>
> Best, Richard

	Andy
Received on Wednesday, 20 July 2011 22:08:10 UTC