RE: A summary of the proposal for resolving the issues with rdf:text --> Could you please check it one more time? from Boris Motik on 2009-05-20 (public-rdf-text@w3.org from April to June 2009)

From: Boris Motik <boris.motik@comlab.ox.ac.uk>
Date: Wed, 20 May 2009 17:52:45 +0200
To: "'Eric Prud'hommeaux'" <eric@w3.org>
Cc: "'Alan Ruttenberg'" <alanruttenberg@gmail.com>, "'Seaborne, Andy'" <andy.seaborne@hp.com>, <public-rdf-text@w3.org>, "'Sandro Hawke'" <sandro@w3.org>, "'Axel Polleres'" <axel.polleres@deri.org>
Message-ID: <2785F15A6EE7419AB8F917325AE65D86@wolf>
To me, all this seems like hacking. We are now trying to make some assumptions
about how various implementations might/will/should treat rdf:text literals in
order to achieve certain goals. This goes completely against the grain of the
idea of declarativeness and as such it mixes various concerns without a clear
separation boundary.

Also, even if we were to do some hacking, I don't see what it is that we'd need
to do there. The document currently contains a requirement to normalize these
literals during graph exchange. I'm fine with leaving it in. Again, if that's
what is needed, then we're done, aren't we?
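
For concreteness, the normalization mentioned above might be sketched roughly
as follows. This is only an illustration under stated assumptions, not text
from the rdf:text document: the Literal class and the normalize function are
hypothetical names, and the sketch assumes that an rdf:text lexical form is
"text@lang", with the last "@" acting as the separator and an empty tag
meaning "no language".

```python
from dataclasses import dataclass
from typing import Optional

# Assumed IRI for rdf:text (the rdf: namespace followed by "text").
RDF_TEXT = "http://www.w3.org/1999/02/22-rdf-syntax-ns#text"


@dataclass(frozen=True)
class Literal:
    """Hypothetical literal record: lexical form, datatype IRI, language tag."""
    lexical: str
    datatype: Optional[str] = None
    lang: Optional[str] = None


def normalize(lit: Literal) -> Literal:
    """Rewrite an rdf:text typed literal as the corresponding plain literal.

    Literals of any other datatype, and plain literals, are returned unchanged.
    """
    if lit.datatype != RDF_TEXT:
        return lit
    # Split on the LAST "@", so the text part may itself contain "@".
    text, sep, lang = lit.lexical.rpartition("@")
    if not sep:
        # No "@" at all: not a well-formed rdf:text lexical form; leave as is.
        return lit
    return Literal(text, None, lang or None)


if __name__ == "__main__":
    print(normalize(Literal("bob@en", RDF_TEXT)))
```

A graph exporter could apply normalize to every object literal before
serialization, so that consumers unaware of rdf:text only ever see "bob"@en.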

Boris

> -----Original Message-----
> From: Eric Prud'hommeaux [mailto:eric@w3.org]
> Sent: 20 May 2009 17:47
> To: Boris Motik
> Cc: 'Alan Ruttenberg'; 'Seaborne, Andy'; public-rdf-text@w3.org; 'Sandro
> Hawke'; 'Axel Polleres'
> Subject: Re: A summary of the proposal for resolving the issues with rdf:text
> --> Could you please check it one more time?
> 
> On Wed, May 20, 2009 at 05:23:01PM +0200, Boris Motik wrote:
> > I honestly don't see how this comment is relevant to the present discussion.
> 
> I think it brought us from "it's a SPARQL problem" to "it's a problem
> for any app which doesn't support certain entailments".
> 
> 
> > Each device that does something with RDF needs to decide what it wants to do
> > with it. Here are its options:
> >
> > 1. It can choose to use D-entailment. Well, if this is what the device
> > wants to do, then I'm afraid it needs to implement D-entailment. rdf:text
> > is no different from xsd:integer and xsd:decimal in that respect.
> >
> > 2. It can choose to use simple entailment. Well, if this is what the device
> > wants to do, then it should just do it. rdf:text is no different from
> > xsd:integer and xsd:decimal in that respect either.
> >
> > 3. It can choose to do whatever it wants. Well, the precise definitions of
> > rdf:text don't matter anyway then.
> >
> > Nothing in our specification requires an implementation to select 1, 2,
> > or 3. Hence, this issue is completely orthogonal to rdf:text.
> 
> At issue here is whether you want the inferences allowed by the
> rdf:text specification to be available to tools which do not
> implement rdf:text-entailment. I argue that this is an achievable
> goal as the code required to implement rdf:text entailment (either
> "bob"@en => "bob@en"^^rdf:text or "bob@en"^^rdf:text => "bob"@en)
> can be pushed into the machinery which performs these inferences.
> 
> You can still use existing machinery to infer literals of type
> rdf:text so long as that graph never interacts with any other
> graph (it's not detectably RDF from the outside world). The likely
> implementation for e.g. OWLIM or Oracle's Jena Adaptor would be
> to trigger restrictions of type rdf:text on plain literals, and
> record rdf:text inferences as plain literals.
> 
> 
> > Regards,
> >
> > 	Boris
> >
> >
> > > -----Original Message-----
> > > From: Eric Prud'hommeaux [mailto:eric@w3.org]
> > > Sent: 20 May 2009 17:19
> > > To: Boris Motik
> > > Cc: 'Alan Ruttenberg'; 'Seaborne, Andy'; public-rdf-text@w3.org; 'Sandro
> > > Hawke'; 'Axel Polleres'
> > > Subject: Re: A summary of the proposal for resolving the issues with
> rdf:text
> > > --> Could you please check it one more time?
> > >
> > > On Wed, May 20, 2009 at 03:39:29PM +0200, Boris Motik wrote:
> > > > Hello,
> > > >
> > > > This is a purely SPARQL problem: SPARQL should specify precisely
> > > > what the semantics of BGPs under the D-entailment regime is.
> > >
> > > Why would SPARQL be the only device ever to be used to access an RDF graph?
> > >
> > > Do you favor that every device used to access an RDF graph implement
> > > all of D-entailment and rdf:text and anything like it that comes along?
> > >
> > > In that case, aren't you demanding any deployed code which may be used
> > > to access an RDF graph (Oracle, Jena, Sesame, Virtuoso, Allegrograph,
> > > ...), through query or graph API or parsed directly into an application
> > > data structure, be updated to support various entailments?
> > >
> > >
> > > > I am just going to briefly speculate as to how this might be done.
> > > > I strongly believe this should be done declaratively -- that is,
> > > > without taking into account implementations. Hence, one might use the
> > > > following definition:
> > > >
> > > >     Given an RDF graph G and a BGP Q, a substitution s for variables
> > > >     in Q is an answer to G and Q iff G D-entails s(Q).
> > > >
> > > > Take the following example:
> > > >
> > > > G = { <a, b, "01"^^xsd:integer> }
> > > > Q = { <a, b, ?x> }
> > > >
> > > > Then, the following substitutions are answers to Q over G:
> > > >
> > > > s1 = { ?x --> "1"^^xsd:integer }
> > > > s2 = { ?x --> "01"^^xsd:integer }
> > > > s3 = { ?x --> "1"^^xsd:decimal }
> > > > s4 = { ?x --> "001.000"^^xsd:decimal }
> > > > etc.
> > > >
> > > > Clearly, such a definition is not practical, so the SPARQL WG should
> > > > think of a solution. One possible solution would be to say that each
> > > > literal needs to be normalized; in this case, one would return only s1
> > > > as a result. There are clearly other possibilities as well, so I will
> > > > stop speculating here.
> > > >
> > > >
> > > > This is a purely SPARQL problem, and not that of RDF, XML Schema, or
> > > > rdf:text. SPARQL can define answers to such queries however it wants;
> > > > probably the only constraint is that the answers should be sound
> > > > w.r.t. D-entailment. Furthermore, we should clearly separate concerns
> > > > here. I see the stack of specifications as follows:
> > > >
> > > > 1. RDF defines the notion of D-entailment from an RDF graph. For this,
> > > > you need to have a lexical space, a value space, and a
> > > > lexical-to-value mapping for each datatype you are using.
> > > >
> > > > 2. Various datatypes provide these, and thus define the D-consequences
> > > > of an RDF graph. As the above example shows, there can be many
> > > > consequences, but that's already a problem with basic XML Schema
> > > > datatypes.
> > > >
> > > > 3. By relying on the definitions of D-entailment in RDF and the
> > > > datatypes, SPARQL has to find a way to make some sense of the examples
> > > > such as the one given above. This definition should probably be
> > > > independent from the actual set of datatypes (because people may and
> > > > will add their own datatypes).
> > > >
> > > >
> > > > rdf:text resides at level 2 of this stack and is therefore completely
> > > > independent of the SPARQL questions. Furthermore, as the above example
> > > > shows, these questions exist even without rdf:text. Therefore, I
> > > > believe the rdf:text WG has done its job.
> > > >
> > > > Regards,
> > > >
> > > > 	Boris
> > > >
> > > > > -----Original Message-----
> > > > > From: Alan Ruttenberg [mailto:alanruttenberg@gmail.com]
> > > > > Sent: 20 May 2009 14:05
> > > > > To: Boris Motik
> > > > > Cc: Eric Prud'hommeaux; Seaborne, Andy; public-rdf-text@w3.org; Sandro
> > > Hawke;
> > > > > Axel Polleres
> > > > > Subject: Re: A summary of the proposal for resolving the issues with
> > > rdf:text
> > > > > --> Could you please check it one more time?
> > > > >
> > > > > Hello Boris,
> > > > >
> > > > > In what forum do you suggest this be addressed?
> > > > >
> > > > > -Alan
> > > > >
> > > > > On Wed, May 20, 2009 at 7:38 AM, Boris Motik
> > > > > <boris.motik@comlab.ox.ac.uk> wrote:
> > > > > > Hello,
> > > > > >
> > > > > > > I fully appreciate the use case and I agree with your observation:
> > > > > > > this is something that has to be addressed. I don't think, however,
> > > > > > > that solving this problem is in the domain of rdf:text. The rdf:text
> > > > > > > specification merely defines yet another datatype by specifying it
> > > > > > > in exactly the same way as this is done in XML Schema. This datatype
> > > > > > > is just like any other XML Schema datatype; hence, the job from
> > > > > > > rdf:text's point of view is done.
> > > > > >
> > > > > > > Furthermore, the addition of rdf:text to the mix of the supported
> > > > > > > datatypes adds no new conceptual problems to SPARQL: the situation
> > > > > > > with rdf:text is no different than with, say, xsd:integer (there are
> > > > > > > other examples as well). For example, assume that you have an RDF
> > > > > > > graph
> > > > > > >
> > > > > > > G = { <a, b, "1"^^xsd:integer> }
> > > > > >
> > > > > > but you ask the query
> > > > > >
> > > > > > Q = { <a, b, "1.0"^^xsd:decimal> }.
> > > > > >
> > > > > > > Clearly, G D-entails Q, so Q should be answered as TRUE in G. It is
> > > > > > > not the business of XML Schema to specify how this is to be
> > > > > > > achieved: XML Schema merely specifies what the correct answer to the
> > > > > > > above question is. It is a SPARQL implementation such as OWLIM that
> > > > > > > should think of how to support such a definition.
> > > > > >
> > > > > > > I don't know whether a solution to the above problem (with
> > > > > > > xsd:integer and xsd:decimal) exists. If not, I agree that one should
> > > > > > > be developed; however, we would not go to the XML Schema WG and ask
> > > > > > > them to specify how SPARQL should handle this case, would we?
> > > > > >
> > > > > > > The problem with rdf:text is *precisely* the same as the one that I
> > > > > > > outlined above. At an abstract level, it can be stated as "Several
> > > > > > > syntactic forms of literals get mapped to the semantically identical
> > > > > > > data values". As demonstrated above, this problem exists without
> > > > > > > rdf:text, so I don't see how rdf:text brings anything new into the
> > > > > > > whole picture. Thus, you can apply to the rdf:text case exactly the
> > > > > > > same solution that you would apply to xsd:integer and xsd:decimal.
> > > > > > > If such a solution doesn't exist yet, then the SPARQL WG should
> > > > > > > address these issues, and it should do so in general for all
> > > > > > > datatypes (xsd:integer, xsd:decimal, and so on), not just for
> > > > > > > rdf:text.
> > > > > >
> > > > > > > To summarize, I think that the work from the point of view of the
> > > > > > > rdf:text WG is *done* and that we should not do anything else in
> > > > > > > this forum.
> > > > > >
> > > > > > Regards,
> > > > > >
> > > > > >        Boris
> > > > > >
> > > > > >> -----Original Message-----
> > > > > >> From: Eric Prud'hommeaux [mailto:eric@w3.org]
> > > > > >> Sent: 20 May 2009 13:18
> > > > > >> To: Boris Motik
> > > > > >> Cc: 'Seaborne, Andy'; 'Alan Ruttenberg'; public-rdf-text@w3.org;
> > > 'Sandro
> > > > > >> Hawke'; 'Axel Polleres'
> > > > > >> Subject: Re: A summary of the proposal for resolving the issues
> with
> > > > > rdf:text
> > > > > >> --> Could you please check it one more time?
> > > > > >>
> > > > > >> On Wed, May 20, 2009 at 09:29:00AM +0200, Boris Motik wrote:
> > > > > >> > Hello,
> > > > > >> >
> > > > > > >> > I don't see the benefit of option 1, as it makes things
> > > > > > >> > unnecessarily complex. The fewer exceptions we have, the easier it
> > > > > > >> > will be to actually implement a conformant system. The dichotomy
> > > > > > >> > between plain and typed literals is just an example of an
> > > > > > >> > exception that just makes implementation difficult. Instead of
> > > > > > >> > introducing more special cases, I think we should unify these
> > > > > > >> > whenever possible.
> > > > > >> >
> > > > > > >> > Furthermore, I'm not sure whether sorting out things such as the
> > > > > > >> > ones pointed out below is necessary to finalize the rdf:text
> > > > > > >> > specification. Please note that rdf:text already has a
> > > > > > >> > well-defined lexical and value space, and this is *the only* thing
> > > > > > >> > that we need to be able to plug rdf:text into the model theory of
> > > > > > >> > RDF. That is, given RDF graphs G1 and G2 possibly containing
> > > > > > >> > rdf:text literals and/or plain literals, using the definitions
> > > > > > >> > from the present rdf:text specification one can unambiguously
> > > > > > >> > answer the question whether G1 D-entails G2. For example, if G1 is
> > > > > >> >
> > > > > >> > <a, b, "abc@en"^^rdf:text>
> > > > > >> >
> > > > > >> > and G2 is
> > > > > >> >
> > > > > >> > <a, b, "abc"@en>
> > > > > >> >
> > > > > > >> > then, according to the existing RDF model theory document, G1
> > > > > > >> > D-entails G2 and vice versa. I don't see what else is there for
> > > > > > >> > the rdf:text specification to do: I really think that the
> > > > > > >> > specification is complete. If SPARQL or other specifications want
> > > > > > >> > to apply rdf:text in a different way and create special cases,
> > > > > > >> > they are free to do so; however, I don't think it is in scope of
> > > > > > >> > the rdf:text specification to solve all such problems.
> > > > > >>
> > > > > > >> (Hesitantly re-stating use case), consider the use case of the
> > > > > > >> OWLIM plugin for Sesame. If OWLIM forward chains some triples into
> > > > > > >> the Sesame repository with objects like "bob"@en, existing SPARQL
> > > > > > >> queries on the existing Sesame engine will match them as expected.
> > > > > > >> RIF rules can consume those triples and know that any rules
> > > > > > >> applying to a domain of rdf:text apply.
> > > > > >>
> > > > > > >> Contrast that with an OWLIM which emits triples with objects like
> > > > > > >> "bob@en"^^rdf:text . These triples will not match conventional
> > > > > > >> queries intended to discover e.g. all the folks named "Bob". The
> > > > > > >> Sesame SPARQL implementation can be extended, but then we are in
> > > > > > >> Pat's scenario of fixing RDF by visiting all the deployed code.
> > > > > >>
> > > > > > >> I expect that any design of rdf:text would have it reacting to
> > > > > > >> plain literals as if they had a datatype of rdf:text and the
> > > > > > >> appropriate lexical transformation. I propose that the simplest
> > > > > > >> complete design is one where the inference of rdf:text objects
> > > > > > >> results in their expression as plain literals, avoiding a dualism
> > > > > > >> between "bob@en"^^rdf:text and "bob"@en which would lose
> > > > > > >> interoperability with existing queries, graph APIs, XPaths
> > > > > > >> operating on SPARQL Results, non-OWL inferencing systems, ...
> > > > > >>
> > > > > >>
> > > > > >> > Regards,
> > > > > >> >
> > > > > >> >     Boris
> > > > > >> >
> > > > > >> > > -----Original Message-----
> > > > > >> > > From: public-rdf-text-request@w3.org [mailto:public-rdf-text-
> > > > > >> request@w3.org]
> > > > > >> > > On Behalf Of Eric Prud'hommeaux
> > > > > >> > > Sent: 20 May 2009 03:18
> > > > > >> > > To: Seaborne, Andy
> > > > > >> > > Cc: Alan Ruttenberg; public-rdf-text@w3.org; Boris Motik;
> Sandro
> > > Hawke;
> > > > > >> Axel
> > > > > >> > > Polleres
> > > > > >> > > Subject: Re: A summary of the proposal for resolving the issues
> > > with
> > > > > >> rdf:text
> > > > > >> > > --> Could you please check it one more time?
> > > > > >> > >
> > > > > >> > > On Tue, May 19, 2009 at 03:57:11PM +0000, Seaborne, Andy wrote:
> > > > > >> > > > Apologies:
> > > > > >> > > >
> > > > > >> > > > > On Fri, May 15, 2009 at 11:50 AM, Seaborne, Andy
> > > > > >> <andy.seaborne@hp.com>
> > > > > >> > > wrote:
> > > > > >> > > > >> Monday PM end before 18:00 (GMT+1)
> > > > > >> > > > >> Thursday PM.
> > > > > >> > > > >> Tuesday @17:00 (GMT+1) for a short call; end before 17:30.
> > > > > >> > > >
> > > > > >> > > > I can't make the slot.
> > > > > >> > > >
> > > > > > >> > > > Input: please consider interoperability of data between OWL
> > > > > > >> > > > and RDF. Option 1 is better for that than option 2 as Eric
> > > > > > >> > > > points out.
> > > > > >> > > >
> > > > > > >> > > > This is also the least change to LC and IMHO is not a
> > > > > > >> > > > substantive change (it follows on from the current graph
> > > > > > >> > > > exchange intent) to add the text needed for SPARQL. Roughly:
> > > > > > >> > > > the scoping graph of an rdf-text aware D-entailment for BGP
> > > > > > >> > > > matching includes the RDF forms and does not include
> > > > > > >> > > > ^^rdf:text. (Non-aware entailment regimes would merely treat
> > > > > > >> > > > as a datatype form.)
> > > > > >> > >
> > > > > > >> > > does anyone oppose option 1 (plain literals are considered to
> > > > > > >> > > satisfy entailments constrained to type rdf:text and entailments
> > > > > > >> > > of type rdf:text are expressed as plain literals in the RDF
> > > > > > >> > > graph)? (i'm wondering if we can work this out before we work
> > > > > > >> > > out scheduling this phone call.)
> > > > > >> > >
> > > > > >> > >
> > > > > >> > > >         Andy
> > > > > >> > > >
> > > > > >> > > > > -----Original Message-----
> > > > > >> > > > > From: Alan Ruttenberg [mailto:alanruttenberg@gmail.com]
> > > > > >> > > > > Sent: 19 May 2009 16:01
> > > > > >> > > > > To: Axel Polleres
> > > > > >> > > > > Cc: Seaborne, Andy; public-rdf-text@w3.org; Boris Motik;
> Sandro
> > > > > Hawke;
> > > > > >> > > > > eric@w3.orf
> > > > > >> > > > > Subject: Re: A summary of the proposal for resolving the
> issues
> > > > > with
> > > > > >> > > > > rdf:text --> Could you please check it one more time?
> > > > > >> > > > >
> > > > > >> > > > > On Mon, May 18, 2009 at 10:03 AM, Axel Polleres
> > > > > >> <axel.polleres@deri.org>
> > > > > >> > > > > wrote:
> > > > > >> > > > > > Alan, since you were calling for the TC, is that fixed
> now?
> > > > > >> > > > > > Otherwise, I am afraid it is not possible before Friday.
> > > > > >> > > > >
> > > > > >> > > > > Yes, let's have whoever can make it meet at 5:30 BST =
> 12:30
> > > Boston
> > > > > >> > > > > time.
> > > > > >> > > > > Zakim, meet on irc #rdftext for the code. I will send a
> code
> > > > > earlier
> > > > > >> if
> > > > > >> > > > > I can.
> > > > > >> > > > >
> > > > > >> > > > > -Alan
> > > > > >> > >
> > > > > >>
> > > > > >>
> > > > > >> office: +1.617.258.5741 32-G528, MIT, Cambridge, MA 02144 USA
> > > > > >> mobile: +1.617.599.3509
> > > > > >>
> > > > > >> (eric@w3.org)
> > > > > >> Feel free to forward this message to any list for any purpose other
> > > than
> > > > > >> email address distribution.
> > > > > >
> > > > > >
> > >
> 
> --
> -eric
> 
> office: +1.617.258.5741 32-G528, MIT, Cambridge, MA 02144 USA
> mobile: +1.617.599.3509
> 
> (eric@w3.org)
> Feel free to forward this message to any list for any purpose other than
> email address distribution.
Received on Wednesday, 20 May 2009 15:54:19 UTC