Re: Proposal for ISSUE-12 language-tagged literals from Richard Cyganiak on 2011-07-21 (public-rdf-wg@w3.org from July 2011)

From: Richard Cyganiak <richard@cyganiak.de>
Date: Thu, 21 Jul 2011 13:24:01 +0100
To: Andy Seaborne <andy.seaborne@epimorphics.com>
Cc: public-rdf-wg@w3.org
Message-Id: <774EAAD9-F2C0-4773-B49D-E24B31D54FC6@cyganiak.de>
On 20 Jul 2011, at 23:07, Andy Seaborne wrote:
>> What you propose is that datatypes should also be used for certain
>> *non-typed* literals, and *without* mapping lexical forms to values.
> 
> This would make them typed.  

That's a rather deep change. Language-tagged literals and typed literals are disjoint in RDF Concepts.

> In fact, I don't see where RDF Concepts defines plain literals to have a value at all. It seems to only mention values in regard to typed literals.  That makes talking about the value space of rdf:LangString being <lex, lang> wrong.

Well, the spec as written uses “denotes XYZ” and “identifies a value XYZ” interchangeably.

> A datatype, in the abstract syntax, is an annotation on literals.  A literal is a triple (!pun!) of (lexical form, datatype, lang tag) and has a value.

This would imply that a literal can have a datatype *and* a language tag. This design was proposed earlier, but drew lots of objections, including from you.

> what changes is that literals denote a value, without the requirement that it is obtained by L2V

There is no such requirement. Language-tagged literals are defined as being “self-denoting”, in other words, their value is a lexical form and a language tag.

> If that change is something that will bring grief to apps and system providers, let's not do it but we're already changing it for xsd:string.

The xsd:string change *removes* something from the abstract syntax, which I imagine is a lot less troublesome for implementations.

> I asked:
> >> What breaks if it is a datatype?
> 
> So far, nothing has been shown to break.

*You* pointed out what breaks:
http://lists.w3.org/Archives/Public/public-rdf-wg/2011May/0343.html

To paraphrase your own words:

if (this.datatype() != null) {
  // typed
} else if (this.languageTag() != null) {
  // plain literal w/ language tag
} else {
  // plain literal w/o language tag
}

I found this objection quite compelling and gave up on the idea at that point.
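
A minimal sketch of why that dispatch breaks, assuming a hypothetical Literal class whose accessors mirror the snippet above. The class, the kind() method, and the use of the string "rdf:LangString" as a stand-in for the proposed datatype are assumptions for illustration, not code from any real RDF library.

```java
// Hypothetical literal model; accessor names follow the snippet in the message.
final class Literal {
    private final String lexicalForm;
    private final String datatype;     // datatype name, or null if none
    private final String languageTag;  // language tag, or null if none

    Literal(String lexicalForm, String datatype, String languageTag) {
        this.lexicalForm = lexicalForm;
        this.datatype = datatype;
        this.languageTag = languageTag;
    }

    String lexicalForm() { return lexicalForm; }
    String datatype() { return datatype; }
    String languageTag() { return languageTag; }

    // The three-way dispatch from the message, returning a label per branch.
    String kind() {
        if (this.datatype() != null) {
            return "typed";
        } else if (this.languageTag() != null) {
            return "plain literal w/ language tag";
        } else {
            return "plain literal w/o language tag";
        }
    }
}

final class DispatchDemo {
    public static void main(String[] args) {
        // Current model: a language-tagged literal has no datatype.
        Literal current = new Literal("foo", null, "en");
        // Proposed model (assumption): the same literal also carries rdf:LangString.
        Literal proposed = new Literal("foo", "rdf:LangString", "en");

        System.out.println(current.kind());
        System.out.println(proposed.kind());
    }
}
```

Under these assumptions the second literal falls into the first branch, so code written against the current model would stop seeing it as a language-tagged plain literal unless every such check is reordered or rewritten.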

> The datatype concept as it exists seems to mix mapping and value.

RDF Concepts is sloppy in that regard. RDF Semantics says that literals *denote*. Plain literals denote themselves. Typed literals denote the result of applying L2V to the lexical form.
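
As a rough illustration of that reading of RDF Semantics, the sketch below computes what a literal denotes. The Denotation class, the denote and l2v names, and the single supported datatype (xsd:integer, mapped with BigInteger) are assumptions made for the example, not part of any specification or library.

```java
import java.math.BigInteger;
import java.util.Arrays;

final class Denotation {
    // Hypothetical lexical-to-value mapping; only xsd:integer is covered here.
    static Object l2v(String datatype, String lexicalForm) {
        if ("xsd:integer".equals(datatype)) {
            return new BigInteger(lexicalForm);
        }
        throw new IllegalArgumentException("unsupported datatype: " + datatype);
    }

    // Typed literals denote L2V(lexical form); plain literals denote themselves.
    static Object denote(String lexicalForm, String datatype, String languageTag) {
        if (datatype != null) {
            return l2v(datatype, lexicalForm);
        }
        if (languageTag != null) {
            // Self-denoting: the value is the <lexical form, language tag> pair.
            return Arrays.asList(lexicalForm, languageTag);
        }
        // Self-denoting: the value is the lexical form itself.
        return lexicalForm;
    }

    public static void main(String[] args) {
        System.out.println(denote("042", "xsd:integer", null));
        System.out.println(denote("foo", null, "en"));
        System.out.println(denote("foo", null, null));
    }
}
```

Only the first branch consults a datatype; the two plain-literal branches produce a value without any L2V step.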

Richard




>> I'm not saying that this makes it a no-go. But if the hack exists
>> only to make DATATYPE("foo"@en) behave more consistently in SPARQL,
>> then I'd rather see the hack in SPARQL.
> 
> If it were only SPARQL, I'd agree, but this seems to make RDF more regular (note - not perfectly regular).
> 
>> 
>>> It then works for RIF and anything else built to work with RDF.
>> 
>> No, unfortunately it doesn't, at least as far as I can tell. They
>> actually want to have lexical forms for language-tagged literals, so
>> that they can stuff the <string,langtag> pairs into legacy systems
>> that don't support language tags. (Or, perhaps closer to the truth,
>> so that they can be compatible with RDF's data model in their specs
>> without actually supporting language tags in their literal design.)
> 
> Actually they can do that because if the lexical forms of rdf:PlainLiteral are a superset of the lexical forms of rdf:LangString, it can be defined so that rdf:LangString is a derived type (the inverse term to "derived" does not seem to be defined in /TR/xpath-datamodel/).
> 
> It's making rdf:PlainLiteral a super-datatype of xsd:string that does not work.
> 
>> Thought experiment: If DATATYPE in SPARQL was called something else
>> instead, say, “TYPE” (and it would return some magic constant for
>> IRIs and blank nodes), would you still advocate making rdf:LangString
>> a datatype instead of a class? If yes, then why?
> 
> Yes, if it can be made to work.  DATATYPE is an accessor to that part of the literal triple.  All literals would have a datatype in the abstract graph.
> 
>> 
>> Best, Richard
> 
> 	Andy
>
Received on Thursday, 21 July 2011 12:24:41 UTC