Re: rdf:text "MUST replace in the graph"

Hmmm.  A nice blind-men-and-the-elephant moment.  Let's see if I can
frame this in a way we're both happy with, and maybe come up with some
new text in the process.  [Yes, I did, and it's a minuscule change to
the wiki text; skip down to "proposal #3" if you don't care about the
argument.]

We have two subsets of the set of all RDF Graphs:

 * a "plain graph" is one which contains zero or more plain literals and
   zero literals with datatype rdf:text

 * an "RT graph" contains zero or more literals with datatype rdf:text
   and zero plain literals.

Let's also define a "mixed graph" to be one which contains at least one
plain literal and at least one literal with datatype rdf:text.

For those of us who think graphically:


                 does not                 contains
                 contain                  plain
                 plain                    literal                  
                 literal


   does not       |
   contain      --|--------- plain graph ---------
   rdf:text       |
                RT graph
                  |
   contains       |                        mixed
   rdf:text       |                        graph
    

(I don't think I need to name the intersection, those graphs which
contain neither plain literals nor rdf:text literals.)
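
Here is a rough sketch of these definitions in Python, in case that reads
more easily than the diagram.  The Literal type, the classify function,
and the idea of treating a graph as just its collection of literals are
all made up for illustration; none of this is a real RDF API.

```python
from typing import Iterable, NamedTuple, Optional, Set

# rdf:text expanded with the usual rdf: namespace.
RDF_TEXT = "http://www.w3.org/1999/02/22-rdf-syntax-ns#text"


class Literal(NamedTuple):
    """Hypothetical literal: plain literals have no datatype."""
    lexical: str
    lang: Optional[str] = None      # only meaningful on plain literals
    datatype: Optional[str] = None  # None means "plain literal"


def classify(literals: Iterable[Literal]) -> Set[str]:
    """Return the names (from the definitions above) that apply to a graph.

    The graph is represented only by the literals it contains, since
    nothing else matters for the plain/RT/mixed distinction.
    """
    lits = list(literals)
    has_plain = any(l.datatype is None for l in lits)
    has_rdf_text = any(l.datatype == RDF_TEXT for l in lits)

    if has_plain and has_rdf_text:
        return {"mixed"}
    kinds = set()
    if not has_rdf_text:
        kinds.add("plain")   # zero or more plain literals, no rdf:text
    if not has_plain:
        kinds.add("RT")      # zero or more rdf:text literals, no plain
    return kinds
```

A graph with no plain literals and no rdf:text literals (the unnamed
intersection) comes back as both "plain" and "RT".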

Before rdf:text came along, all RDF graphs were plain graphs.  To my
knowledge, all existing RDF software works with plain graphs and does
not work with RT graphs.  Support for plain literals is mandated by the
RDF specs, and (as you point out) RDF parsers "SHOULD generate a
warning" [1] if they encounter an rdf:text literal.

Meanwhile, RIF and OWL 2 basically work with RT graphs.  They don't
directly deal with plain literals; instead they rely on the fact that
there's a one-to-one mapping between RT graphs and plain graphs (modulo
mixed-case language tags).  This way they can work with RT graphs (which
are basically simpler) and (using the isomorphism) still interoperate
with plain-graph systems.
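
To make the mapping concrete, here is a minimal sketch of both directions
for a single literal.  It assumes the rdf:text lexical form is the string,
then "@", then the (possibly empty) language tag, as in the RIF example
further down.  The function names are mine, and the sketch deliberately
leaves language-tag case alone rather than normalizing it.

```python
from typing import Optional, Tuple


def plain_to_rt(lexical: str, lang: Optional[str] = None) -> str:
    """Lexical form of the rdf:text literal for a plain literal.

    A plain literal without a language tag gets an empty tag after '@'.
    """
    return f"{lexical}@{lang or ''}"


def rt_to_plain(rt_lexical: str) -> Tuple[str, Optional[str]]:
    """Split an rdf:text lexical form back into (string, language tag).

    The split is on the LAST '@', so strings that themselves contain
    '@' (e-mail addresses, say) survive the round trip.
    """
    text, sep, lang = rt_lexical.rpartition("@")
    if not sep:
        raise ValueError("rdf:text lexical form must contain '@'")
    return text, (lang or None)
```

Round-tripping rt_to_plain(plain_to_rt(s, lang)) gives back (s, lang) for
any string and any non-empty tag, which is all the isomorphism needs at
the level of individual literals.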

The issue here is what to mandate about rdf:text such that RT-graph
systems are allowed to exist and flourish without causing any problems
for plain-graph systems, letting the two kinds of systems interoperate
smoothly.

I read the current wiki draft [2] to say that RT graphs can only exist
in private, that the only graphs one is allowed to transmit are plain
graphs.  That seems too strong to me; I'd rather say that it's okay to
transmit RT graphs as long as you're using a format designed for
transmitting RT graphs to RT systems.

In particular, it seems to me that this RIF PS excerpt is an RT graph:

  <http://www.w3.org/>[dc:title -> "World Wide Web Consortium@en-US"^^rdf:text]

... and that it would be forbidden by the rule that RT graphs can't be
transmitted.

I guess you could counter that the excerpt encodes a RIF Group which
happens to correspond to an RT graph (and also happens to correspond to
a plain graph), but that fundamentally it is a Group or Frame, not an
RDF Graph at all.  (Ceci n'est pas une pipe.)

And while it's not a high priority to define new RDF serializations, I do
think that it makes sense to allow them to use RT graphs instead of plain
graphs, when/if they come along.

> Sandro Hawke wrote:
> 
> > Yeah, I'm concerned about the case of people making an "RDF Syntax" that
> > you might not consider an "RDF Syntax" since it doesn't support plain
> > literals.  I think the important thing is to say that every RDF graph
> > serialization format should mandate using EITHER a special syntax for
> > plain literals OR rdf:text, but not both.
> 
> Sorry, I can't follow this.
> 
> An RDF serialization needs to have some way to represent 
> lang-tagged-plain-literals period. I don't see this as an either/or 
> situation.

As I hope I've made clear, I think it's reasonable to map from plain
graphs to RT graphs, and then work with (store/serialize/reason-with) RT
graphs.  I see RIF and OWL 2 as doing that.

You could claim "that's no longer RDF", and I'd grant you that it is
kind of cheating, but I think it's in a good way and that we can avoid
it causing any harm.

> The thing is that any alternative serialization has to preserve the RDF 
> data model and in that model lang-tagged-plain-literals do not have a 
> datatype and cannot be queried for one in SPARQL.

I could define R-Triples as being N-Triples, but with these
productions [3] removed:

       literal          ::=     langString | datatypeString     
       langString       ::=     '"' string '"' ( '@' language )?        
       datatypeString   ::=     '"' string '"' '^^' uriref

and replaced with this:

       literal          ::=     datatypeString

and a declaration that plain->RT mapping has to be done in serialization
and RT->plain mapping has to be done in parsing, if the systems are
plain-graph systems.
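
As a sketch of what that declaration would mean for a plain-graph system,
here is the literal-writing and literal-reading side of such a
serializer.  R-Triples is hypothetical, and so are write_literal and
read_literal; the escaping covers only backslash, double quote and
newline, which is less than a real N-Triples writer would need.

```python
from typing import Optional, Tuple

RDF_TEXT = "http://www.w3.org/1999/02/22-rdf-syntax-ns#text"


def escape(s: str) -> str:
    """Minimal string escaping (backslash first, then quote and newline)."""
    return s.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def write_literal(lexical: str, lang: Optional[str] = None,
                  datatype: Optional[str] = None) -> str:
    """Emit a literal using only the datatypeString production.

    Plain literals (datatype is None) are mapped to rdf:text on the way
    out; typed literals are written unchanged.
    """
    if datatype is None:
        lexical = f"{lexical}@{lang or ''}"
        datatype = RDF_TEXT
    return f'"{escape(lexical)}"^^<{datatype}>'


def read_literal(lexical: str,
                 datatype: str) -> Tuple[str, Optional[str], Optional[str]]:
    """Map an already-parsed (lexical, datatype) pair back for a
    plain-graph system, returning (lexical, lang, datatype)."""
    if datatype == RDF_TEXT:
        text, sep, lang = lexical.rpartition("@")
        if sep:
            return text, (lang or None), None
    return lexical, None, datatype
```

An RT-graph system would skip both mappings and simply store what it
reads.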

I believe R-Triples would then be a serialization that preserves the RDF
data model.  I also think for RT-systems and RT-users it would make a
lot of sense.

(Of course the cost of having Yet Another RDF Syntax would outweigh the
value of this particular syntax.  I'm just saying that future RDF
syntaxes should be allowed to take this road if they want.  My latest RDF
API does, and I rather like it.)

> If some future serialization of RDF used rdf:text in its serialization 
> of plain literals but preserved the RDF model then so be it but unless 
> it's going to change RDF then it wouldn't be serializing a typed literal 
> in that case which would seem distinctly odd.  I guess I don't see the 
> value of keeping that option open.

I take the fact that RIF and OWL 2 follow this route as some evidence of
its value.

> > And since all existing
> > formats don't give their answer to this choice (and some future ones
> > won't either), we just declare that if the format provides plain literals
> > and/or doesn't say you have to use rdf:text, then you shouldn't use
> > rdf:text in it.
> > 
> > How's this:
> > 
> >       Despite the semantic equivalence between typed rdf:text RDF
> >       literals and plain RDF literals, the presence of rdf:text literals
> >       in an RDF graph might cause interoperability problems.  For
> >       example, if an RDF graph containing rdf:text literals is
> >       serialized in RDF/XML, a system may receive it which does not
> >       implement rdf:text handling.  Such a system will typically treat
> >       these literals opaquely, storing them without processing and not
> >       matching them to plain literals.
> 
> It's not just "not matching them" it is (a) required by the RDF spec to 
> issue a warning and (b) would treat them as typed literals and thus be 
> queriable as such in SPARQL.

Noted, certainly.  I was just trying to get the idea across.  (And I'd
forgotten about the warning.)

> >       To avoid this problematic behavior, tools which support rdf:text
> >       MUST replace in the graph each rdf:text literal with the
> >       corresponding plain RDF literal before transmitting the graph to
> >       any system which is not required to implement rdf:text handling.
> >       In practice this means the replacement SHOULD be done before
> >       serializing in any RDF format which supports plain literals, since
> >       systems reading such a format will typically omit rdf:text
> >       handling.  In contrast, some formats for RDF may provide only
> >       typed literals and thus require that each plain literal be
> >       replaced by the corresponding rdf:text typed literal before
> >       transmission.
> 
> That seems to fudge the existing distinction between plain and typed 
> literals. No case has been made for changing the RDF model to only have 
> typed literals. I personally wouldn't object but only if done in the 
> context of a more extensive, community supported, RDF overhaul. 

It seems to me that RIF and OWL 2 bring us RT graphs, and that they can
co-exist (with a little care) with plain graphs.  Perhaps plain graphs
will be deprecated some day, perhaps not -- it doesn't matter as long as
interoperation is painless.

> Improving the serialization of lang-tagged-literals is certainly a long 
> way down my own list of pieces of RDF that need working on.

Mine, too, but the RIF and OWL 2 semantics folks seemed quite intent on
having this, and since we've given it to them, we might as well make it
available to the rest of the RDF community, assuming there are
sufficient usage rules to prevent interoperation problems.  (I'm sure
neither of us wants to be spending this time on it, but it's so close to
useful....)

So....  New proposed text.  Maybe this third time will do it.  It's just
the wiki text plus a very small six words I probably could have slipped
in without anyone noticing.....       :-)

PROPOSAL #3:

     Despite the semantic equivalence between typed rdf:text RDF
     literals and plain RDF literals, the presence of typed rdf:text RDF
     literals in an RDF graph might cause interoperability problems
     between RDF tools, as not all RDF tools will support
     rdf:text. Therefore, before exchanging an RDF graph with other RDF
     tools that do not necessarily support rdf:text, an RDF tool that
     supports rdf:text MUST replace in the graph each typed rdf:text RDF
     literal with the corresponding plain RDF literal.

Having gone to such length to analyze the motivations in e-mail, I find
I no longer have any taste for doing it in the spec, and I would leave
it at that.  (I would drop the "The notion of graph exchange..."
sentence currently in the wiki.)

I'm being a little tricky with two different meanings of the word
"necessarily".  We could switch to a more mundane phrasing, like
"Therefore, before exchanging an RDF graph with other RDF tools (unless
those tools are required to support rdf:text by the exchange
protocol/format being used), an RDF tool that supports rdf:text MUST..."
(call this proposal #4)
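
In code, the rule in proposal #4 might look roughly like this.  The
Literal type, the triple representation, and the
receiver_must_support_rdf_text flag are all invented for the sketch; how
a tool actually knows what the exchange format requires is left open.

```python
from typing import Iterable, List, NamedTuple, Optional, Tuple

RDF_TEXT = "http://www.w3.org/1999/02/22-rdf-syntax-ns#text"


class Literal(NamedTuple):
    """Hypothetical literal: plain literals have no datatype."""
    lexical: str
    lang: Optional[str] = None
    datatype: Optional[str] = None


Triple = Tuple[object, object, object]


def prepare_for_exchange(triples: Iterable[Triple],
                         receiver_must_support_rdf_text: bool) -> List[Triple]:
    """Apply the MUST-replace rule before handing a graph to another tool.

    If the exchange protocol/format requires receivers to support
    rdf:text, the graph is passed through untouched.  Otherwise every
    rdf:text literal in object position becomes the corresponding plain
    literal.
    """
    if receiver_must_support_rdf_text:
        return list(triples)

    out = []
    for s, p, o in triples:
        if isinstance(o, Literal) and o.datatype == RDF_TEXT:
            text, sep, lang = o.lexical.rpartition("@")
            if sep:  # leave ill-formed rdf:text literals alone
                o = Literal(text, lang=lang or None)
        out.append((s, p, o))
    return out
```

Serializing to RDF/XML would call this with the flag false; a format
defined over RT graphs would call it with the flag true.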

      -- Sandro


> >>> -----Original Message-----
> >>> From: public-rdf-text-request@w3.org [mailto:public-rdf-text-request@w3.org]
> >>> On Behalf Of Sandro Hawke
> >>> Sent: 06 April 2009 18:02
> >>> To: public-rdf-text@w3.org
> >>> Subject: rdf:text "MUST replace in the graph"
> >>>
> >>>
> >>> (This is my second substantive comment in my rdf:text review; so far all
> >>> my other comments are editorial, and I'll send them along separately.)
> >>>
> >>> I don't think the text at the end of section 4 is quite right.  It
> >>> currently says:
> >>>
> >>>      Despite the semantic equivalence between typed rdf:text RDF
> >>>      literals and plain RDF literals, the presence of typed rdf:text RDF
> >>>      literals in an RDF graph might cause interoperability problems
> >>>      between RDF tools, as not all RDF tools will support
> >>>      rdf:text. Therefore, before exchanging an RDF graph with other RDF
> >>>      tools, an RDF tool that supports rdf:text MUST replace in the graph
> >>>      each typed rdf:text RDF literal with the corresponding plain RDF
> >>>      literal. The notion of graph exchange includes, but is not limited
> >>>      to, the process of serializing an RDF graph using any (normative or
> >>>      nonnormative) RDF syntax.
> >>>
> >>> The problem with this is that it forbids use of rdf:text in interchange
> >>> in the future.  In fact, RIF can be used to interchange RDF Graphs (by
> >>> stating ground frame facts), but it's forbidden by this text from
> >>> including internationalized strings!  More seriously, I expect new
> >>> machine formats for RDF would use rdf:text, but this forbids it.
> >>>
> >>> I think this can be fixed by adding a little phrase, which I've put in
> >>> all-caps below, just to show the change:
> >>>
> >>>      Despite the semantic equivalence between typed rdf:text RDF
> >>>      literals and plain RDF literals, the presence of typed rdf:text RDF
> >>>      literals in an RDF graph might cause interoperability problems
> >>>      between RDF tools, as not all RDF tools will support
> >>>      rdf:text. Therefore, before exchanging an RDF graph with other RDF
> >>>      tools, an RDF tool that supports rdf:text MUST replace in the graph
> >>>      each typed rdf:text RDF literal with the corresponding plain RDF
> >>> ->   literal, UNLESS THE EXCHANGE FORMAT BEING USED MANDATES THAT
> >>> ->   RECEIVERS SUPPORT RDF:TEXT.  The notion of graph exchange includes,
> >>>      but is not limited to, the process of serializing an RDF graph
> >>>      using any (normative or nonnormative) RDF syntax.
> >>>
> >>> The paragraph could be re-written to be smoother, but I think that's the
> >>> minimal change we need here.
> >>>


[1] http://www.w3.org/TR/2004/REC-rdf-syntax-grammar-20040210/#section-Namespace
[2] http://www.w3.org/2007/OWL/wiki/index.php?title=InternationalizedStringSpec&oldid=21373
[3] http://www.w3.org/TR/2004/REC-rdf-syntax-grammar-20040210/

Received on Tuesday, 7 April 2009 00:24:02 UTC