[RESOLVED] Re: SPARQL WG comments on rdf:text

On Wed, May 06, 2009 at 04:35:22PM -0400, Lee Feigenbaum wrote:
> Hello OWL and RIF working groups,
> 
> The SPARQL WG has reviewed the rdf:text Last Call document on our 
> mailing list[1], in a teleconference [2], and today at our face-to-face 
> meeting [3].
> 
> The group resolved to send the following comments. At this time, we do 
> not have proposed spec text to resolve these comments, but would be glad 
> to consult on possibilities.
> 
> The comment is at 
> http://www.w3.org/2009/sparql/wiki/index.php?title=Rdf_text_LC_WG_comment&oldid=758 
> and is reproduced here for your convenience.

These comments have been addressed to the satisfaction of the SPARQL
WG. We are content that the document states that *the* form of a term of
type rdf:PlainLiteral in any RDF graph is as an RDF plain literal per
  http://www.w3.org/TR/rdf-concepts/#dfn-plain-literal
and thus the behavior of these terms within existing RDF applications
such as SPARQL is well-defined.
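
As a rough illustration of that mapping (my own sketch, not text from
either spec): the lexical form of an rdf:PlainLiteral-typed term is a
string, then "@", then a possibly empty language tag, so splitting at
the last "@" recovers the plain literal. The name to_plain_literal and
the tuple representation are assumptions made only for this example.

```python
def to_plain_literal(lexical_form):
    """Split an rdf:PlainLiteral lexical form into (string, language tag).

    The language tag is returned as None when the part after the last
    "@" is empty, i.e. the result is a plain literal without a tag.
    """
    text, sep, lang = lexical_form.rpartition("@")
    if not sep:
        # Every rdf:PlainLiteral lexical form contains at least one "@".
        raise ValueError("lexical form must contain '@'")
    return (text, lang if lang else None)


print(to_plain_literal("Padre de familia@es"))
print(to_plain_literal("Padre de familia@"))
```

Splitting at the last "@" rather than the first matters because the
string part may itself contain "@", while a language tag cannot.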


> ~~~
> Summary
> 
> SPARQL queries act on the graph, not on the serialized form. Thus, we 
> suggest that the editors state the interactions with SPARQL with respect to:
> 
>    1. the restriction to rdf:text not appearing in RDF graphs should be 
> extended such that rdf:text MUST NOT appear in SPARQL XML results. This 
> extends the existing coverage of RDF graph exchange to include SPARQL 
> results from SELECT, in the same way that CONSTRUCT and DESCRIBE queries 
> are already covered.
>    2. the use of "semantic equivalence" shall be clarified and it 
> should be noted that rdf:text is a D-entailment and is accessed by 
> SPARQL via a BGP entailment regime extension.
>    3. that functions STR/DATATYPE/LANG act on the lexical 
> representations and will be affected depending on the way an rdf:text 
> aware entailment regime manifests its results.
> 
> In addition it should be noted that rdf:text relates to the assumption 
> in RDF that a literal has a datatype or a language tag but not both. 
> Existing, deployed code relies on this invariant.
> 
> Overview
> 
> There are some SPARQL-specific issues that arise that are not addressed 
> in the document. The rdf:text document only refers to "graph exchange" 
> when saying that rdf:text must not appear in RDF graph serializations, 
> but that does not apply to SPARQL directly.
> 
> The rdf:text document says nothing about SPARQL operations, and it's 
> not clear to me whether changes to existing SPARQL queries are being 
> assumed. At one time, they were.
> 
> Since SPARQL is defined over simple entailment, NOT datatype entailment, 
> the notion of "semantic equivalence" (mentioned but not defined in the 
> rdf:text document) does not make sense and this spec appears to require 
> changes to SPARQL behaviour. This would be undesirable since it affects:
> 
> 1. SPARQL Query Result XML Format
> 
> 2. Interactions with simple entailment matching of BGPs, and extension 
> of SPARQL via BGPs.
> 
> 3. Effects on DATATYPE, LANG and STR
> 
> Note: In RDF, a literal has either a language tag or a datatype but not 
> both. rdf:text changes this assumption so deployed code or SPARQL 
> implementations that rely on this invariant may break.
> 
> We believe that these concerns can be remedied, if rdf:text talks about 
> D-entailment specifically, instead of "semantic equivalence" (and thus 
> not affecting simple entailment as well) in general.
> 
> SPARQL XML Results Format
> 
> This is not "graph exchange" so the prohibition on the use of rdf:text in a 
> serialization does not apply. It could be applied, but might not help 
> systems that do want to see rdf:text literals, for example, SPARQL/OWL2.
> 
> The problem here, again, is that the semantic implications of rdf:text 
> are not forward-compatible with existing RDF. This concern would be 
> remedied by defining the semantic implications of rdf:text in terms of 
> D-entailment only, as suggested above. In fact, we think that this fix 
> makes the restrictions of the usage of rdf:text in RDF graphs redundant.
> 
> Datatype Property
> 
> What happens if a datatype property is restricted to a rdf:text? What 
> does the RDF serialization look like? Does it include rdf:text?
> 
> BGP matching
> 
> The SPARQL standard defines SPARQL with respect to simple entailment and 
> provides a mechanism for extension to other entailment regimes. See the 
> section "12.6 Extending SPARQL Basic Graph Matching".
> 
> Since SPARQL is defined over simple entailment, NOT datatype entailment, 
> the notion of "semantic equivalence" (mentioned but not defined in the 
> rdf:text document) does not make sense. SPARQL is not acting on the 
> serialization of an RDF graph. It acts on the value space of literals.
> 
> Simple entailment does not cover the RDF-MT entailments xsd1a and xsd1b, 
> which are the rules for plain literals without language tag being the 
> same value as XSD strings. So these are not required of a SPARQL 
> processor using simple entailment.
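
To make the point about simple entailment concrete, here is a minimal
sketch (mine, and only an illustration) of literal matching as plain
term comparison: two literals are the same term only if lexical form,
language tag and datatype all agree. The Literal tuple and the name
same_term are assumptions for this example, not anything defined by
SPARQL or RDF-MT.

```python
from collections import namedtuple

XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"

# Hypothetical in-memory model of an RDF literal term.
Literal = namedtuple("Literal", ["lexical", "lang", "datatype"])


def same_term(a, b):
    """Compare two literals as RDF terms, with no datatype reasoning."""
    # Language tags are compared case-insensitively.
    lang_a = a.lang.lower() if a.lang else None
    lang_b = b.lang.lower() if b.lang else None
    return (a.lexical, lang_a, a.datatype) == (b.lexical, lang_b, b.datatype)


plain = Literal("Padre de familia", None, None)
typed = Literal("Padre de familia", None, XSD_STRING)
print(same_term(plain, typed))
```

Under this comparison the plain literal and the xsd:string literal are
different terms, which is the behaviour the quoted paragraph describes
for a processor that does not apply xsd1a and xsd1b.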
> 
> Additional semantic equivalences implied by rdf:text should only affect 
> D-entailment (where rdf:text is part of the datatype map D following 
> [1]) but not simple entailment. Thus, the document should not talk about 
> "semantic equivalence" in general terms but just in terms of 
> D-entailment. This should fix the main problem raised and would only 
> affect SPARQL engines that follow a (yet to be defined) entailment regime.
> 
> We suggest that it is explicitly noted that access to rdf:text aware 
> entailment regimes by a SPARQL query is via the extension mechanism.
> 
> Effects on DATATYPE, LANG and STR
> 
> Noting that this SPARQL-WG should maintain compatibility with SPARQL as 
> published Jan 2008.
> 
> These functions are accessors to the components of a literal term. 
> Different ways of manifesting a value from BGP matching will lead to 
> different results from these functions.
> 
> For these examples, the serialized form using rdf:text is used, although 
> in an RDF graph it exists as a value and when the graph is serialised 
> rdf:text does not appear. The examples relate to a variable bound to 
> such a value and how the literal accessor functions (DATATYPE, LANG and 
> STR) of SPARQL can be affected.
> 
> rdf:text does define some functions on rdf:text.
> 
> DATATYPE is defined so that the type of a plain literal without language 
> tag is xsd:string. There is no datatype for a literal with a language tag.
> 
> SPARQL has the concept of a "simple literal" for a plain literal without 
> language tag.
> 
> These functions are applied as part of the algebra, not as part of BGP 
> matching - the entailment extension mechanism does not modify these 
> functions. There may be different entailment regimes, maybe on different 
> graphs, in the same query.
> 
> DATATYPE
> 
> DATATYPE of a literal with language tag
> 
> SPARQL/2008:
> 
>  DATATYPE ("Padre de familia"@es) ==> error
> 
> When a literal is bound to a variable and subsequently used in a call to 
> DATATYPE, what return value is expected? Is it true that if instead it 
> is presented as below, a different result is obtained?
> 
>  DATATYPE("Padre de familia@es"^^rdf:text) ==> rdf:text
> 
> Similarly:
> 
> SPARQL/2008 defines:
> 
>  DATATYPE ("Padre de familia") ==> xs:string
> 
> but what is:
> 
>  DATATYPE ("Padre de familia") ==> rdf:text ?? xs:string ??
> 
> because one value space is a subset of the other.
> 
> The reason for rdf:text is the uniform treatment of literals so the 
> query to find all the untyped literals ("untyped" meaning as per the 
> current SPARQL REC - without type - simple literal or literal with 
> language tag) might be changed.
> 
> LANG
> 
> In RDF, a literal has either a language tag or a datatype but not both. So:
> 
> SPARQL/2008:
> 
>  Lang("Padre de familia"@es) ==> "es"
> 
> but
> 
>  Lang("Padre de familia@es"^^rdf:text) ==> ""
> 
> rdf:text:
> 
>  Lang("Padre de familia@es"^^rdf:text) ==> ??
> 
> cf. rtfn:lang-from-text("Padre de familia@es"^^rdf:text) ==> "es"
> 
> STR
> 
> rdf:text is a datatype with lexical space including the language tag
> 
> SPARQL/2008 defines:
> 
>  STR("Padre de familia@es"^^rdf:text) ==> "Padre de familia@es"
>  STR("Padre de familia"@es) ==> "Padre de familia"
> 
> rdf:text:
> 
>  STR("Padre de familia@es"^^rdf:text) ==> "Padre de familia" ??
> 
> because STR returns the lexical form.
> 
> The lexical space of literals with language tags is changed by rdf:text.
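
The "SPARQL/2008" lines in the DATATYPE, LANG and STR examples above
can be reproduced with a small sketch. This is my own illustration of
the accessors as the January 2008 Recommendation defines them, not
proposed spec text. The Literal tuple, the function names and the
"rdf:text" placeholder string are assumptions for the example; the
message does not give the datatype IRI, so a placeholder is used.

```python
from collections import namedtuple

XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"
RDF_TEXT = "rdf:text"  # placeholder, not the real datatype IRI

# Hypothetical in-memory model of an RDF literal term.
Literal = namedtuple("Literal", ["lexical", "lang", "datatype"])


def sparql_str(lit):
    """STR returns the lexical form unchanged."""
    return lit.lexical


def sparql_lang(lit):
    """LANG returns the language tag, or "" when there is none."""
    return lit.lang or ""


def sparql_datatype(lit):
    """DATATYPE of a typed literal, or xsd:string for a simple literal."""
    if lit.datatype is not None:
        return lit.datatype
    if lit.lang:
        # A language-tagged literal has no datatype: this is an error.
        raise TypeError("DATATYPE is an error for a language-tagged literal")
    return XSD_STRING


tagged = Literal("Padre de familia", "es", None)
as_text = Literal("Padre de familia@es", None, RDF_TEXT)
print(sparql_str(tagged), "|", sparql_str(as_text))
print(repr(sparql_lang(tagged)), repr(sparql_lang(as_text)))
```

The two terms denote the same value under an rdf:text-aware reading,
yet every accessor gives a different answer for them, which is why it
matters which form appears in the graph.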
> 
> FILTERs
> 
> SPARQL FILTERs evaluate to an effective boolean value (defined in XQuery 
> "2.4.3 Effective Boolean Value" and referenced by SPARQL "11.2.2 
> Effective Boolean Value (EBV)").
> 
> The EBV of a string is false if the string is of length zero else true.
> 
> Do any rdf:text literals have an EBV of false?
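
For reference, a partial sketch of the string case of the EBV rule, as
I read SPARQL section 11.2.2. It deliberately covers only plain
literals and xsd:string; the boolean and numeric cases are omitted, and
every other datatype is treated as a type error. The Literal tuple, the
name ebv and the "rdf:text" placeholder string are assumptions for this
example.

```python
from collections import namedtuple

XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"

# Hypothetical in-memory model of an RDF literal term.
Literal = namedtuple("Literal", ["lexical", "lang", "datatype"])


def ebv(lit):
    """Effective boolean value for plain and xsd:string literals only."""
    if lit.datatype is None or lit.datatype == XSD_STRING:
        # False for a zero-length string, true otherwise.
        return len(lit.lexical) > 0
    # Boolean and numeric datatypes are left out of this sketch.
    raise TypeError("EBV not covered for datatype %r" % (lit.datatype,))


print(ebv(Literal("", "es", None)))
print(ebv(Literal("Padre de familia", "es", None)))
```

In this sketch an empty plain literal with a language tag is false,
while the same value written as "@es"^^rdf:text is never of length zero
and falls into the error branch, so the answer to the question above
again depends on which form is presented.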
> 
> 
> Intra-spec Compatibility
> 
> IRIs vs. URIs
> 
> "This specification uses Uniform Resource Identifiers (URIs) for naming 
> datatypes and their components" indicates that language tags in RDF are 
> URIs, whereas SPARQL Query interprets them as IRIs. Using URIs would 
> imply that
> 
> <X> <p> 
> <http://xn--9oqp94l.example/?user=%D8%A3%D9%83%D8%B1%D9%85&channel=R%26D> .
> 
> would be matched by the SPARQL graph pattern
> 
> <X> <p> <http://伝言.example/?user=أكرم&channel=R&D> .
> 
> References
> 
> 1. http://www.w3.org/TR/rdf-mt/#dtype_interp
> 
> 2. http://www.w3.org/TR/rdf-sparql-query/#sparqlBGPExtend
> 
> 3. http://lists.w3.org/Archives/Public/public-rdf-text/2008OctDec/0036.html
> ~~~
> 
> 
> 
> Lee
> on behalf of the SPARQL WG
> 
> [1] http://lists.w3.org/Archives/Public/public-rdf-dawg/2009AprJun/0107.html
> [2] http://www.w3.org/2009/sparql/meeting/2009-04-28#rdf__3a_text
> [3] raw IRC log: http://www.w3.org/2009/05/06-sparql-irc

-- 
-eric

office: +1.617.258.5741 32-G528, MIT, Cambridge, MA 02144 USA
mobile: +1.617.599.3509

(eric@w3.org)
Feel free to forward this message to any list for any purpose other than
email address distribution.

Received on Wednesday, 3 June 2009 12:22:46 UTC