Re: Dave Reynolds: rdf:text - clarification requested from Sandro Hawke on 2008-12-10 (public-rdf-text@w3.org from October to December 2008)

From: Sandro Hawke <sandro@w3.org>
Date: Wed, 10 Dec 2008 09:21:49 -0500
To: Dave Reynolds <der@hplb.hpl.hp.com>
cc: Axel Polleres <axel.polleres@deri.org>, public-rdf-text@w3.org, "Seaborne, Andy" <andy.seaborne@hp.com>
Message-ID: <12139.1228918909@ubuhebe>

Dave:
> >> Please could I get some clarification on the following line from the 
> >> rdf:text document:
> >>
> >> """In addition to the RIF and OWL specifications, this datatype is 
> >> expected to supersede RDF's plain literals with language tags, cf. 
> >> [5], which is why this datatype has been added into the rdf: 
> >> namespace. """
> >>
> >> I don't recall any discussion on this notion of "supersede". What 
> >> exactly is being proposed here? I have been regarding rdf:text as a 
> >> formalization of RDF plain literals with language tags which 
> >> simplifies OWL2/RIF's job but is not a change to RDF.  Clearly any 
> >> change to the RDF specs would have implications for tools developers, 
> >> especially if there are any round-tripping requirements, and wouldn't 
> >> be something to make lightly. I don't think it appropriate to hint at 
> >> such a change in the rdf:text document without more details.
> >>
> >> Apologies for not having noticed this line earlier.

Axel:
> > My personal opinion: It is not the intention to change/affect the 
> > existing RDF specs, but the datatype is indeed intended to fix the 
> > mismatch between plain literals and language tagged literals for 
> > implementations which adopt it.

Dave:
> Is that something that needs "fixing"?
> 
> In i18n terms then internationalized text and strings are quite 
> different things. The differences between the two in RDF have not been a 
> problem in implementations or practice that I'm aware of. Is there any 
> evidence to suggest otherwise?
> 
> I thought rif:text, now rdf:text, was invented to simplify including 
> internationalized strings in RIF not to fix some problem with RDF.
> 
> > Any suggestion for a rewording that would rather convey this message?
> 
> We need clarity on what is being proposed before thinking of a wording.
> 
> Are you intending or expecting that RDF implementations should 
> explicitly support rdf:text as a datatype?
> 
> So that they would regard:
> 
> (a)    eg:a eg:p  "foo"@en .
> 
> and
> 
> (b)    eg:a eg:p  "foo@en"^^rdf:text .
> 
> as equivalent graphs?

I'm not totally sure which kind of equivalence is right here, but I'm
inclined to make it as close to identity as possible.  That is, (a) and
(b) are two different ways to serialize the same triple.
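
As a rough illustration of that reading (a sketch only, not something
taken from the rdf:text document): the Python below assumes the language
tag is whatever follows the last "@" in the rdf:text lexical form, and
that tags compare case-insensitively.  The helper name from_rdf_text and
the tuple representation of triples are made up for the example.

```python
def from_rdf_text(lexical):
    """Split an rdf:text lexical form into (string, language tag).

    Assumption: the tag is everything after the LAST "@", so the
    string part may itself contain "@" characters.
    """
    text, sep, tag = lexical.rpartition("@")
    if not sep:
        raise ValueError("rdf:text lexical form must contain '@'")
    # An empty tag is treated here as "no language".
    return (text, tag.lower() or None)


# Form (a): the parser already hands us the string and the tag.
triple_a = ("eg:a", "eg:p", ("foo", "en"))
# Form (b): the same information packed into an rdf:text lexical form.
triple_b = ("eg:a", "eg:p", from_rdf_text("foo@en"))

print(triple_a == triple_b)  # True
```

Under those assumptions the two forms end up as the same in-memory
triple, so nothing downstream can tell which one was parsed.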

So it's a lot like these two Turtle documents

(c)    @prefix foo: <http://example.org/abc#> .
       foo:a foo:p  "hello" .

and

(d)    @prefix bar: <http://example.org/abc#> .
       bar:a bar:p  "hello" .
        
which are different text but serialize the same RDF graph.
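
To make the analogy concrete, here is a small Python sketch (not a real
Turtle parser; the expand helper and the tuple representation are
hypothetical) that expands prefixed names to full IRIs and compares the
resulting graphs:

```python
def expand(prefixes, triple):
    """Expand prefixed names in a triple to full IRIs; literals pass through."""
    def full(term):
        if term.startswith('"'):
            return term  # a quoted literal, left untouched
        prefix, _, local = term.partition(":")
        return "<" + prefixes[prefix] + local + ">"
    return tuple(full(term) for term in triple)


# Document (c) binds the namespace to "foo", document (d) to "bar".
graph_c = {expand({"foo": "http://example.org/abc#"},
                  ("foo:a", "foo:p", '"hello"'))}
graph_d = {expand({"bar": "http://example.org/abc#"},
                  ("bar:a", "bar:p", '"hello"'))}

print(graph_c == graph_d)  # True
```

The prefix label disappears once names are expanded, in the same way
that the choice between forms (a) and (b) would disappear once the
literal is parsed.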

For simplicity of implementation, I think RDF serializations should
mandate use of one style of language tagging or the other.  In order to
handle legacy syntaxes which were created before rdf:text and so could
not pick, I think we should probably say rdf:text SHOULD NOT be used in
any RDF syntax which has built-in support for language tagging (in order
to avoid all the problems you name, below).

That is, in RDF/XML, N-Triples, N3, and Turtle, one SHOULD NOT use
rdf:text.  (Happily, this aligns with rdf-syntax saying "Any other names
are not defined and SHOULD generate a warning when encountered, but
should otherwise behave normally.")  Meanwhile, the various RIF syntaxes
and the newer OWL syntaxes do not directly support language tagging, so
one has to use rdf:text.  Perhaps a Turtle 1.1 would remove form-(a)
language tagging and mandate rdf:text instead.  Similarly, APIs are free
to pick one or the other (or some other, equivalent) approach, but
should probably just provide one, and certainly not distinguish between
the two.

> Would there be any expectation on round tripping so that an RDF 
> processor receiving a graph in form (b) would be expected to return it 
> in the same form and not normalize it to form (a)?

With the above formulation, this problem doesn't come up.  

> When we originally proposed rif:text I was expecting to translate RDF 
> lang-tagged literals to rif:text as part of a translator and rif:text 
> would not appear as a datatype in the RDF.
> 
> Implementing rif:text as a RDF datatype is clearly possible but the 
> discontinuity introduced by changing RDF would be a serious concern. We 
> could end up in a state where some RDF producers thought form (b) was a 
> legal way to exchange an internationalized text fragment in RDF while a 
> fraction of deployed RDF consumers would not consume it (at least not 
> with the required semantics).

Agreed, this is a problem we should avoid.

> If only RIF and RDF were involved then my preferred phrasing would be 
> something like:
> 
> "Note that the rdf:text datatype is purely intended for use within RIF 
> and it is not intended that RDF processors should support this as an RDF 
> datatype. Consumers of RDF-RIF combinations are expected to map between 
> RDF language-tagged literals and rdf:text literals as part of the RIF 
> translation process."
> 
> Indeed it might be better still to substitute "SHOULD NOT" (in the 
> RFC2119 sense) for "not intended".
> 
> However, that doesn't cover OWL2. I don't understand enough of OWL2's 
> requirements here, and how interoperation with deployed RDF is 
> envisaged, to be able to suggest anything specific.

      -- Sandro
Received on Wednesday, 10 December 2008 14:22:28 UTC