Re: Dave Reynolds: rdf:text - clarification requested from Dave Reynolds on 2008-12-10 (public-rdf-text@w3.org from October to December 2008)

From: Dave Reynolds <der@hplb.hpl.hp.com>
Date: Wed, 10 Dec 2008 16:22:34 +0000
To: Axel Polleres <axel.polleres@deri.org>
CC: Sandro Hawke <sandro@w3.org>, public-rdf-text@w3.org, "Seaborne, Andy" <andy.seaborne@hp.com>
Message-ID: <493FECCA.2000508@hplb.hpl.hp.com>

Axel Polleres wrote:
> Sandro Hawke wrote:
>> Dave:
>>>>> Please could I get some clarification on the following line from 
>>>>> the rdf:text document:
>>>>>
>>>>> """In addition to the RIF and OWL specifications, this datatype is 
>>>>> expected to supersede RDF's plain literals with language tags, cf. 
>>>>> [5], which is why this datatype has been added into the rdf: 
>>>>> namespace. """
>>>>>
>>>>> I don't recall any discussion on this notion of "supersede". What 
>>>>> exactly is being proposed here? I have been regarding rdf:text as a 
>>>>> formalization of RDF plain literals with language tags which 
>>>>> simplifies OWL2/RIF's job but is not a change to RDF.  Clearly any 
>>>>> change to the RDF specs would have implications for tools 
>>>>> developers, especially if there are any round-tripping 
>>>>> requirements, and wouldn't be something to undertake lightly. I don't
>>>>> think it appropriate to hint at such a change in the rdf:text 
>>>>> document without more details.
>>>>>
>>>>> Apologies for not having noticed this line earlier.
>>
>> Axel:
>>>> My personal opinion: It is not the intention to change/affect the
>>>> existing RDF specs, but the datatype is indeed intended to fix the
>>>> mismatch between plain literals and language-tagged literals for
>>>> implementations which adopt it.
>>
>> Dave:
>>> Is that something that needs "fixing"?
>>>
>>> In i18n terms, internationalized text and strings are quite
>>> different things. The differences between the two in RDF have not 
>>> been a problem in implementations or practice that I'm aware of. Is 
>>> there any evidence to suggest otherwise?
>>>
>>> I thought rif:text, now rdf:text, was invented to simplify including 
>>> internationalized strings in RIF not to fix some problem with RDF.
>>>
>>>> Any suggestion for a rewording that would rather convey this message?
>>> We need clarity on what is being proposed before thinking of a wording.
>>>
>>> Are you intending or expecting that RDF implementations should 
>>> explicitly support rdf:text as a datatype?
>>>
>>> So that they would regard:
>>>
>>> (a)    eg:a eg:p  "foo"@en .
>>>
>>> and
>>>
>>> (b)    eg:a eg:p  "foo@en"^^rdf:text .
>>>
>>> as equivalent graphs?
> 
> IMO yes, implementations that are rdf:text-aware should treat these 
> equivalently.

So an rdf:text-aware RDF processor would be different from currently 
deployed RDF processors.

>> I'm not totally sure which kind of equivalence is right here, but I'm
>> inclined to make it as close to identity as possible.  That is, (a) and
>> (b) are two different ways to serialize the same triple.
> +1
> 
>> So it's a lot like these two Turtle documents
>>
>> (c)    @prefix foo: <http://example.org/abc#> .
>>        foo:a foo:p  "hello" .
>>
>> and
>>
>> (d)    @prefix bar: <http://example.org/abc#> .
>>        bar:a bar:p  "hello" .
>>
>> which are different text but serialize the same RDF graph.
>>
>> For simplicity of implementation, I think RDF serializations should
>> mandate use of one style of language tagging or the other.  In order to
>> handle legacy syntaxes which were created before rdf:text and so could
>> not pick, I think we should probably say rdf:text SHOULD NOT be used in
>> any RDF syntax which has built-in support for language tagging (in order
>> to avoid all the problems you name, below).
>>
>> That is, in RDF/XML, N-Triples, N3, and Turtle, 
> 
> note: the latter three are "only" member- or team submissions. 

Trivial correction, but N-Triples is part of the RDF Core specs; however,
since it is not recommended for interchange, it is not so relevant.

> a 
> standard emerging from these could fix that, and likewise could the 
> upcoming re-launch of DAWG (in case that there is support to add 
> datatype support to SPARQL in that round).

"Fix" what? N3/Turtle/SPARQL have a perfectly good syntax for
internationalized text which maps to RDF language-tagged literals.

>> one SHOULD NOT use
>> rdf:text.  (Happily, this aligns with rdf-syntax saying "Any other names
>> are not defined and SHOULD generate a warning when encountered, but
>> should otherwise behave normally.")  Meanwhile, the various RIF syntaxes
>> and the newer OWL syntaxes do not directly support language tagging, so
>> one has to use rdf:text.  Perhaps a Turtle 1.1 would remove type-a
>> language tagging and mandate rdf:text instead. Similarly, APIs are free
>> to pick one or the other (or some other, equivalent) approach, but
>> should probably just provide one, and certainly not distinguish between
>> the two.
>>
>>> Would there be any expectation on round tripping so that an RDF 
>>> processor receiving a graph in form (b) would be expected to return 
>>> it in the same form and not normalize it to form (a)?
>>
>> With the above formulation, this problem doesn't come up.
>>
>>> When we originally proposed rif:text I was expecting to translate RDF 
>>> lang-tagged literals to rif:text as part of a translator and rif:text 
>>> would not appear as a datatype in the RDF.
>>>
>>> Implementing rif:text as a RDF datatype is clearly possible but the 
>>> discontinuity introduced by changing RDF would be a serious concern. 
>>> We could end up in a state where some RDF producers thought form (b) 
>>> was a legal way to exchange an internationalized text fragment in RDF 
>>> while a fraction of deployed RDF consumers would not consume it (at 
>>> least not with the required semantics).
>>
>> Agreed, this is a problem we should avoid.
>>
>>> If only RIF and RDF were involved then my preferred phrasing would be 
>>> something like:
>>>
>>> "Note that the rdf:text datatype is purely intended for use within 
>>> RIF and it is not intended that RDF processors should support this as 
>>> an RDF datatype. Consumers of RDF-RIF combinations are expected to 
>>> map between RDF language-tagged literals and rdf:text literals as 
>>> part of the RIF translation process."
>>>
>>> Indeed it might be better still to substitute "SHOULD NOT" (in the 
>>> RFC2119 sense) for "not intended".
>>>
>>> However, that doesn't cover OWL2. I don't understand enough of OWL2's 
>>> requirements here, and how interoperation with deployed RDF is 
>>> envisaged, to be able to suggest anything specific.
> 
> I see the concerns, however, supporting it in OWL2 and not in RDF sounds 
> kind of weird, assuming that OWL2 will have an RDF serialization... no?

Surely OWL2 serialization will map literals which it treats as of type 
rdf:text to language-tagged literals in the RDF serialization. No?
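
As an illustration only, here is a minimal Python sketch of that kind of mapping: an rdf:text typed literal such as form (b) is rewritten to the language-tagged form (a) before serialization. The Literal class and the to_plain function are hypothetical names invented for this sketch, not part of any RDF toolkit. It also assumes the draft's lexical convention that the language tag follows the last "@" in the lexical form, and that an empty tag means a plain literal with no language.

```python
from dataclasses import dataclass
from typing import Optional

# rdf:text expanded against the standard rdf: namespace (the datatype name
# is taken from the draft under discussion).
RDF_TEXT = "http://www.w3.org/1999/02/22-rdf-syntax-ns#text"


@dataclass(frozen=True)
class Literal:
    """Hypothetical stand-in for an RDF literal term."""
    lexical: str
    lang: Optional[str] = None
    datatype: Optional[str] = None


def to_plain(lit: Literal) -> Literal:
    """Rewrite an rdf:text typed literal as a plain or language-tagged one.

    Literals of any other kind are returned unchanged.
    """
    if lit.datatype != RDF_TEXT:
        return lit
    # Split on the LAST "@" so that text containing "@" is preserved
    # (assumed lexical convention: "<text>@<langtag>", tag may be empty).
    text, sep, lang = lit.lexical.rpartition("@")
    if not sep:
        raise ValueError("rdf:text lexical form must contain '@'")
    return Literal(text, lang=lang or None)


# Form (b) from the discussion above, and its form (a) counterpart.
form_b = Literal("foo@en", datatype=RDF_TEXT)
form_a = Literal("foo", lang="en")
```

Under plain term equality form_a and form_b are different literals, which is how a currently deployed processor would see them; after to_plain(form_b) they compare equal, which is the behaviour a serializer would need if form (b) is never to appear in the RDF output.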

Dave
-- 
Hewlett-Packard Limited
Registered Office: Cain Road, Bracknell, Berks RG12 1HN
Registered No: 690597 England
Received on Wednesday, 10 December 2008 16:24:02 UTC