RE: Dave Reynolds: rdf:text - clarification requested from Seaborne, Andy on 2008-12-10 (public-rdf-text@w3.org from October to December 2008)

From: Seaborne, Andy <andy.seaborne@hp.com>
Date: Wed, 10 Dec 2008 16:41:43 +0000
To: Axel Polleres <axel.polleres@deri.org>, Sandro Hawke <sandro@w3.org>
CC: Dave Reynolds <der@hplb.hpl.hp.com>, "public-rdf-text@w3.org" <public-rdf-text@w3.org>
Message-ID: <B6CF1054FDC8B845BF93A6645D19BEA3561F150930@GVW1118EXC.americas.hpqcorp.net>


> -----Original Message-----
> From: Axel Polleres [mailto:axel.polleres@deri.org]
> Sent: 10 December 2008 15:43
> To: Sandro Hawke
> Cc: Dave Reynolds; public-rdf-text@w3.org; Seaborne, Andy
> Subject: Re: Dave Reynolds: rdf:text - clarification requested
>
> Sandro Hawke wrote:
> > Dave:
> >>>> Please could I get some clarification on the following line from
> the
> >>>> rdf:text document:
> >>>>
> >>>> """In addition to the RIF and OWL specifications, this datatype is
> >>>> expected to supersede RDF's plain literals with language tags, cf.
> >>>> [5], which is why this datatype has been added into the rdf:
> >>>> namespace. """
> >>>>
> >>>> I don't recall any discussion on this notion of "supersede". What
> >>>> exactly is being proposed here? I have been regarding rdf:text as a
> >>>> formalization of RDF plain literals with language tags which
> >>>> simplifies OWL2/RIF's job but is not a change to RDF.  Clearly any
> >>>> change to the RDF specs would have implications for tools
> developers,
> >>>> especially if there are any round-tripping requirements, and
> wouldn't
> >>>> be something to make lightly. I don't think it appropriate to hint
> at
> >>>> such a change in the rdf:text document without more details.
> >>>>
> >>>> Apologies for not having noticed this line earlier.
> >
> > Axel:
> >>> My personal opinion: It is not the intention to change/affect the
> >>> existing RDF specs, but the datatype is indeed intended to fix the
> >>> mismatch between plain literals and language tagged literals for
> >>> implementations which adopt it.
> >
> > Dave:
> >> Is that something that needs "fixing"?
> >>
> >> In i18n terms then internationalized text and strings are quite
> >> different things. The differences between the two in RDF have not
> been a
> >> problem in implementations or practice that I'm aware of. Is there
> any
> >> evidence to suggest otherwise?
> >>
> >> I thought rif:text, now rdf:text, was invented to simplify including
> >> internationalized strings in RIF not to fix some problem with RDF.
> >>
> >>> Any suggestion for a rewording that would rather convey this
> message?
> >> We need clarity on what is being proposed before thinking of a
> wording.
> >>
> >> Are you intending or expecting that RDF implementations should
> >> explicitly support rdf:text as a datatype?
> >>
> >> So that they would regard:
> >>
> >> (a)    eg:a eg:p  "foo"@en .
> >>
> >> and
> >>
> >> (b)    eg:a eg:p  "foo@en"^^rdf:text .
> >>
> >> as equivalent graphs?
>
> IMO yes, implementations that are rdf:text-aware should treat these
> equivalently.
>
> > I'm not totally sure which kind of equivalence is right here, but I'm
> > inclined to make it as close to identity as possible.  That is, (a)
> and
> > (b) are two different ways to serialize the same triple.
> +1
>
> > So it's a lot like these two Turtle documents
> >
> > (c)    @prefix foo: <http://example.org/abc#>
> >        foo:a foo:p  "hello"
> >
> > and
> >
> > (d)    @prefix bar: <http://example.org/abc#>
> >        bar:a bar:p  "hello"
> >
> > which are different text but serialize the same RDF graph.

Let's see what the specs say for observable differences: I'm not arguing (much) for or against rdf:text but I do think the consequences need to be made much clearer.

Yes, (c) and (d) are the same graph because they are the same set of triples.  Prefixes are merely a syntactic device and have no significance in the RDF abstract syntax, so the equivalence holds in the abstract syntax: an RDF graph is a set of triples.  Simple entailment therefore holds between them as well.

http://www.w3.org/TR/rdf-concepts/#section-Graph-syntax


But (a) and (b) are not equivalent in the abstract syntax, nor is there a simple-entailment relationship between them under the current RDF specs.  There could be an entailment relationship (a D-entailment, not simple entailment) but it's not one mentioned in RDF Semantics.  The analogy of (a)/(b) to (c)/(d) is crossing levels in the RDF spec.
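
As a rough illustration of the level distinction, here is a small Python sketch. It is not any RDF library's API: the tuple encoding of literals, the `expand` and `triple` helpers, and the `http://example.org/` namespace for `eg:` are assumptions made for the example. It models the current abstract syntax, where a literal carries a language tag or a datatype but not both.

```python
# Hypothetical model: a literal is (lexical form, language tag, datatype IRI).
# At most one of language tag / datatype is set, as in the current RDF
# abstract syntax. None of this is a real library API.
RDF_TEXT = "http://www.w3.org/1999/02/22-rdf-syntax-ns#text"

def expand(qname, prefixes):
    """Expand a prefixed name such as 'foo:a' to a full IRI."""
    prefix, local = qname.split(":", 1)
    return prefixes[prefix] + local

def triple(s, p, o, prefixes):
    """Build an abstract-syntax triple; prefixes disappear at this point."""
    return (expand(s, prefixes), expand(p, prefixes), o)

ns = "http://example.org/abc#"
graph_c = {triple("foo:a", "foo:p", ("hello", None, None), {"foo": ns})}
graph_d = {triple("bar:a", "bar:p", ("hello", None, None), {"bar": ns})}

eg = {"eg": "http://example.org/"}  # assumed namespace for eg:
graph_a = {triple("eg:a", "eg:p", ("foo", "en", None), eg)}
graph_b = {triple("eg:a", "eg:p", ("foo@en", None, RDF_TEXT), eg)}

same_cd = graph_c == graph_d  # prefixes differ, triples do not
same_ab = graph_a == graph_b  # the object terms are different literals
```

In this model (c) and (d) compare equal as sets of triples, while (a) and (b) do not; relating (a) and (b) needs extra datatype-level machinery on top of the abstract syntax.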

This occurs in other places:

The rdf:text document says:
""" [sec 3.2]
"text"^^xs:string can be abbreviated as "text".
"""

But in fact it's based on "Datatype Entailment Rules" xsd 1a and xsd 1b so "abbreviated", which to me implies a syntactic relationship, is not right.


SPARQL examples:

Datatype("Padre de familia@es"^^rdf:text) ==> rdf:text
Datatype("Padre de familia"@es) ==> error
Datatype("Padre de familia") ==> xs:string
Lang("Padre de familia@es"^^rdf:text) ==> ""
Lang("Padre de familia"@es) ==> "es"

Whether you think that's the right design is not the issue here - it's what the spec says and is observable.
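
The same observations can be written as a small Python model. This is only a sketch of the behaviour listed above, not an implementation of a SPARQL engine; the tuple encoding of literals and the function names `datatype` and `lang` are assumptions for illustration.

```python
# Hypothetical model of the Datatype()/Lang() results listed above.
# A literal is (lexical form, language tag, datatype IRI).
XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"
RDF_TEXT = "http://www.w3.org/1999/02/22-rdf-syntax-ns#text"

def datatype(literal):
    """Datatype(): error for a language-tagged literal, xs:string for a simple one."""
    _lex, tag, dt = literal
    if tag is not None:
        raise ValueError("type error: language-tagged literal")
    return dt if dt is not None else XSD_STRING

def lang(literal):
    """Lang(): the language tag, or "" when the literal has none."""
    _lex, tag, _dt = literal
    return tag if tag is not None else ""

typed = ("Padre de familia@es", None, RDF_TEXT)   # "..."^^rdf:text
tagged = ("Padre de familia", "es", None)         # "..."@es
simple = ("Padre de familia", None, None)         # "..."

results = {
    "datatype(typed)": datatype(typed),
    "datatype(simple)": datatype(simple),
    "lang(typed)": lang(typed),
    "lang(tagged)": lang(tagged),
}
```

A query that filters on Lang() or Datatype() therefore gives different answers depending on which of the two forms the data happens to use.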

My conclusion:

The rdf:text document is proposing a change to RDF that will cause observable differences between RDF processors depending on how they handle rdf:text.  (And SPARQL, but that's not the core issue - SPARQL follows RDF - although the rechartering of DAWG might say that the semantics of SPARQL-2008 can't be changed.)

Interoperability is being compromised at some level; whether that is good or bad is a judgement call.

Currently, a literal can have a language tag or a datatype, not both.  I wonder how many systems or applications have relied on that assumption in some way?

(one short point below)

> >
> > For simplicity of implementation, I think RDF serializations should
> > mandate use of one style of language tagging or the other.  In order
> to
> > handle legacy syntaxes which were created before rdf:text and so could
> > not pick, I think we should probably say rdf:text SHOULD NOT be used
> in
> > any RDF syntax which has built-in support for language tagging (in
> order
> > to avoid all the problems you name, below).
> >
> > That is, in RDF/XML, N-Triples, N3, and Turtle,
>
> note: the latter three are "only" member- or team submissions. a
> standard emerging from these could fix that, and likewise could the
> upcoming re-launch of DAWG (in case that there is support to add
> datatype support to SPARQL in that round).
>
> > one SHOULD NOT use
> > rdf:text.  (Happily, this aligns with rdf-syntax saying "Any other
> names
> > are not defined and SHOULD generate a warning when encountered, but
> > should otherwise behave normally.")  Meanwhile, the various RIF
> syntaxes
> > and the newer OWL syntaxes do not directly support language tagging,
> so
> > one has to use rdf:text.  Perhaps a Turtle 1.1 would remove type-a
> > language tagging and mandate rdf:text instead. Similarly, APIs are
> free
> > to pick one or the other (or some other, equivalent) approach, but
> > should probably just provide one, and certainly not distinguish
> between
> > the two.
> >
> >> Would there be any expectation on round tripping so that an RDF
> >> processor receiving a graph in form (b) would be expected to return
> it
> >> in the same form and not normalize it to form (a)?
> >
> > With the above formulation, this problem doesn't come up.
> >
> >> When we originally proposed rif:text I was expecting to translate RDF
> >> lang-tagged literals to rif:text as part of a translator and rif:text
> >> would not appear as a datatype in the RDF.
> >>
> >> Implementing rif:text as a RDF datatype is clearly possible but the
> >> discontinuity introduced by changing RDF would be a serious concern.
> We
> >> could end up in a state where some RDF producers thought form (b) was
> a
> >> legal way to exchange an internationalized text fragment in RDF while
> a
> >> fraction of deployed RDF consumers would not consume it (at least not
> >> with the required semantics).
> >
> > Agreed, this is a problem we should avoid.
> >
> >> If only RIF and RDF were involved then my preferred phrasing would be
> >> something like:
> >>
> >> "Note that the rdf:text datatype is purely intended for use within
> RIF
> >> and it is not intended that RDF processors should support this as an
> RDF
> >> datatype. Consumers of RDF-RIF combinations are expected to map
> between
> >> RDF language-tagged literals and rdf:text literals as part of the RIF
> >> translation process."
> >>
> >> Indeed it might be better still to substitute "SHOULD NOT" (in the
> >> RFC2119 sense) for "not intended".
> >>
> >> However, that doesn't cover OWL2. I don't understand enough of OWL2's
> >> requirements here, and how interoperation with deployed RDF is
> >> envisaged, to be able to suggest anything specific.
>
> I see the concerns, however, supporting it in OWL2 and not in RDF sounds
> kind of weird, assuming that OWL2 will have an RDF serialization... no?

Presumably the RDF serialization would be "foo"@en and the OWL2 processor internally handles it as "foo@en"^^rdf:text by rdf:text section 3.2.
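
A minimal sketch of that internal mapping, in Python. It assumes the rdf:text lexical form is the text, then "@", then the language tag, with the last "@" acting as the separator; the function names are hypothetical and only the language-tagged case is covered.

```python
# Hypothetical helpers: map between a language-tagged plain literal and the
# rdf:text form. Assumes the lexical form is <text>@<lang>, split on the
# last "@" (language tags themselves contain no "@").
RDF_TEXT = "http://www.w3.org/1999/02/22-rdf-syntax-ns#text"

def to_rdf_text(text, lang_tag):
    """("foo", "en") -> ("foo@en", rdf:text), the processor-internal form."""
    return (f"{text}@{lang_tag}", RDF_TEXT)

def from_rdf_text(lexical):
    """"foo@en" -> ("foo", "en"), the form written out as "foo"@en."""
    text, _sep, lang_tag = lexical.rpartition("@")
    return (text, lang_tag)

internal = to_rdf_text("foo", "en")
external = from_rdf_text(internal[0])
```

With a mapping of this kind at the boundary, "foo@en"^^rdf:text never has to appear in the exchanged RDF.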

        Andy

>
> Axel
>
> --
> Dr. Axel Polleres
> Digital Enterprise Research Institute, National University of Ireland,
> Galway
> email: axel.polleres@deri.org  url: http://www.polleres.net/
Received on Wednesday, 10 December 2008 16:43:00 UTC