Re: CONSTRUCTing illegal triples should be optional from David Booth on 2012-07-31 (public-rdf-dawg-comments@w3.org from July 2012)

From: David Booth <david@dbooth.org>
Date: Tue, 31 Jul 2012 15:41:29 -0400
To: Andy Seaborne <andy.seaborne@epimorphics.com>
Cc: public-rdf-dawg-comments <public-rdf-dawg-comments@w3.org>
Message-ID: <1343763689.2725.77359.camel@dbooth-laptop>

On Tue, 2012-07-31 at 18:39 +0100, Andy Seaborne wrote:
> 
> On 31/07/12 18:00, David Booth wrote:
> > Hi Andy,
> >
> > Thanks for your response.  I am not satisfied with this resolution.
> > I see no harm that would be created by the simple wording change that
> > I proposed -- NO implementation would have to change -- and I do see
> > harm in the current wording.
> 
> An implementation of an RDF system capable of receiving results from a
> SPARQL 1.0 engine would have to change to work with the same query in
> the loose SPARQL 1.1 implementation.  It would require at least a new 
> and specialised parser.

No, it wouldn't.  SPARQL 1.0 engines are not currently required to accept
malformed RDF, nor are they required to accept RDF with malformed
xsd:dateTime values.  This would not change.  They could still reject such
malformed input.

> 
> SPARQL 1.1 charter:
> [[
> All queries, that are valid in the January 2008 version of SPARQL,
> should remain valid in the new version and should produce identical
> results, except in the case of errata.
> ]]

That's a good point.  I always viewed an attempt to generate a malformed
triple as a user error.  But if SPARQL 1.0 already required every
implementation to compensate for those "user errors", and if people had
already been writing queries that expected this cleanup, then this would
represent a significant change.  It hadn't occurred to me that people
would already be depending on such "auto cleanup" semantics, but Lee
(off list) mentioned that he does, as one data point.
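
To make the two behaviours concrete, here is a minimal Python sketch. It is
an illustration only, not taken from any SPARQL engine: the term classes,
the construct() function, and its "strict" flag are all hypothetical names.
With strict=True it follows the current spec wording (illegal triples are
dropped); with strict=False it shows one behaviour the proposed MAY wording
would have permitted. Triples with unbound variables are dropped either way.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional, Set, Tuple, Union


# Hypothetical, simplified term model (not a real RDF library).
@dataclass(frozen=True)
class IRI:
    value: str


@dataclass(frozen=True)
class BlankNode:
    label: str


@dataclass(frozen=True)
class Literal:
    lexical: str


@dataclass(frozen=True)
class Var:
    name: str


Term = Union[IRI, BlankNode, Literal]
TemplateTerm = Union[IRI, BlankNode, Literal, Var]
Triple = Tuple[Term, Term, Term]
TemplateTriple = Tuple[TemplateTerm, TemplateTerm, TemplateTerm]


def instantiate(template: TemplateTriple, solution: Dict[str, Term]) -> Optional[Triple]:
    """Substitute variables; return None if any variable is unbound."""
    out: List[Term] = []
    for term in template:
        if isinstance(term, Var):
            bound = solution.get(term.name)
            if bound is None:
                return None
            term = bound
        out.append(term)
    return (out[0], out[1], out[2])


def is_legal_rdf(triple: Triple) -> bool:
    """Subject must be an IRI or blank node; predicate must be an IRI."""
    subject, predicate, _ = triple
    return isinstance(subject, (IRI, BlankNode)) and isinstance(predicate, IRI)


def construct(template: Iterable[TemplateTriple],
              solutions: Iterable[Dict[str, Term]],
              strict: bool = True) -> Set[Triple]:
    """strict=True drops illegal triples (the "auto cleanup" semantics)."""
    template = list(template)
    graph: Set[Triple] = set()
    for solution in solutions:
        for template_triple in template:
            triple = instantiate(template_triple, solution)
            if triple is None:
                continue  # unbound variable: excluded under both wordings
            if strict and not is_legal_rdf(triple):
                continue  # literal as subject/predicate: excluded when strict
            graph.add(triple)
    return graph


P = IRI("http://example.org/p")
template = [(Var("s"), P, Var("o"))]
solutions = [
    {"s": IRI("http://example.org/a"), "o": Literal("x")},  # legal triple
    {"s": Literal("lit"), "o": Literal("y")},               # literal as subject
    {"s": IRI("http://example.org/b")},                     # ?o unbound
]

strict_graph = construct(template, solutions, strict=True)
loose_graph = construct(template, solutions, strict=False)
```

A query author who relies on the strict behaviour gets only the first
triple; under the loose variant the literal-subject triple would also
appear in the output, which is the change in results discussed above.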

> 
> Nothing stops an implementation emitting what it likes - but don't call
> it SPARQL or RDF.

Right, it would not be RDF if it is malformed.

> 
> Elsewhere you argue that the SPARQL specification needs to exactly
> define the behaviour of systems claiming compliance with the standard.
> How do you reconcile these two positions?

I was looking at this as a case in which the user was explicitly asking
for results *outside* of what the spec could standardize.  But as I
explained above, I did not realize that people might already be
depending on such "auto cleanup" semantics, and this definitely changes
the situation.

Thanks Andy and Lee for your explanations.  I think my only remaining
concern about this is that it still tightly couples the RDF spec with
the SPARQL spec.  A major objection to permitting literals as subjects
in RDF is that it would break existing software, and baking this
prohibition into the SPARQL spec makes it harder to ever change that, as
it becomes a self-reinforcing cycle.

Anyway, based on the above new information, I withdraw my objection.  I
am satisfied with this resolution.

Thanks!
David

> 
>  Andy
> 
> > This is similar to many other situations in which the specification
> > should not force an implementation to perform strict validity
> > checking that the user may not want or need, such as ensuring that
> > all dynamically generated URIs are strictly conforming, all dates are
> > valid, etc.  Strict validity checking is definitely a nice optional
> > value-add for implementations that choose to provide it, when users
> > want it.  But the specification (rightly) does not force every
> > implementation to perform strict validity checking, and shouldn't in
> > this case either.
> >
> > The SPARQL spec cannot standardize exactly what should be output if
> > the user CONSTRUCTs a triple having a literal as subject, but it
> > doesn't need to: the result can be implementation defined, just as
> > it is in other situations in which the user does something illegal.
> > This has been a standard approach in programming language design for
> > decades: if the user *chooses* to do something illegal, then the
> > results are implementation defined.
> >
> > David
> >
> > On Tue, 2012-07-31 at 09:06 +0100, Andy Seaborne wrote:
> >> David,
> >>
> >> Thank you for your comment about CONSTRUCT.
> >>
> >> SPARQL is defined to work with RDF and CONSTRUCT defined to create
> >> RDF graphs by receiving an HTTP-carried request and returning the
> >> results using one of the RDF concrete syntaxes.  The SPARQL
> >> specification does not say anything about API use.  It is not in
> >> the charter of the working group to define a new data framework
> >> going beyond RDF.  SPARQL builds on the work of the RDF working
> >> group.
> >>
> >> If an implementation wishes to go beyond the specification in some
> >> way, such as allowing API use to create other forms of triples, it
> >> is at liberty to do so, accepting responsibility for
> >> interoperability issues this raises.  Interoperability is an
> >> important aspect for a web system.
> >>
> >> The working group is not planning to make any changes in this
> >> area.
> >>
> >> I would be grateful if you reply to this message to confirm that
> >> the working group has responded to your comment.
> >>
> >> Yours, on behalf of the SPARQL Working Group,
> >>
> >> Andy
> >>
> >>
> >> On 20/07/12 17:18, David Booth wrote:
> >>> Regarding this: http://www.w3.org/TR/sparql11-query/#construct
> >>> [[ If any such instantiation produces a triple containing an
> >>> unbound variable or an illegal RDF construct, such as a literal
> >>> in subject or predicate position, then that triple is not
> >>> included in the output RDF graph. ]]
> >>>
> >>> This really bothers me, because: (a) it unnecessarily couples
> >>> SPARQL to a controversial decision in the RDF WG that may well
> >>> change in the future, i.e., the prohibition against literals as
> >>> subjects; and (b) it forces a conforming implementation to
> >>> perform checks that its user may not want or need.
> >>>
> >>> If a user chooses to generate invalid RDF then that is his/her
> >>> business. The SPARQL spec should not prohibit it.  If a
> >>> particular implementation offers the feature of performing this
> >>> check, then that is fine.  But it is unnecessarily draconian to
> >>> require all implementations to do it.
> >>>
> >>> I suggest changing the above to: [[ If any such instantiation
> >>> produces a triple containing an unbound variable then that triple
> >>> MUST NOT be included in the output RDF graph. Otherwise, if any
> >>> such instantiation produces a triple containing any illegal RDF
> >>> construct, such as a literal in subject or predicate position,
> >>> then that triple MAY be excluded from the output RDF graph. ]]
> >>>
> >>>
> >>
> >>
> >
> 
> 

-- 
David Booth, Ph.D.
http://dbooth.org/

Opinions expressed herein are those of the author and do not necessarily
reflect those of his employer.
Received on Tuesday, 31 July 2012 19:42:02 UTC