- From: Eric Prud'hommeaux <eric@w3.org>
- Date: Wed, 4 May 2011 16:01:26 -0400
- To: Lee Feigenbaum <lee@thefigtrees.net>
- Cc: Alex Hall <alexhall@revelytix.com>, Pat Hayes <phayes@ihmc.us>, Antoine Zimmermann <antoine.zimmermann@insa-lyon.fr>, public-rdf-wg <public-rdf-wg@w3.org>
* Lee Feigenbaum <lee@thefigtrees.net> [2011-05-04 14:43-0400]
> On 5/4/2011 2:29 PM, Eric Prud'hommeaux wrote:
> >* Alex Hall<alexhall@revelytix.com> [2011-05-04 14:08-0400]
> >>On Wed, May 4, 2011 at 1:36 PM, Lee Feigenbaum<lee@thefigtrees.net> wrote:
> >>
> >>>On 5/4/2011 1:17 PM, Pat Hayes wrote:
> >>>
> >>>>On May 4, 2011, at 9:08 AM, Lee Feigenbaum wrote:
> >>>>
> >>>>>I'd like to understand if the proposed resolution of this issue is
> >>>>>("merely") a recommendation, or is a change to RDF syntactic equality. In
> >>>>>particular, will we be changing
> >>>>>http://www.w3.org/TR/rdf-concepts/#section-Literal-Equality such that
> >>>>>"foo" and "foo"^^xsd:string are equal literals?
> >>>>>
> >>>>>Looking at this through SPARQL's eyes (as I am wont to do), one of the
> >>>>>goals of this change is so that I can write:
> >>>>>
> >>>>>SELECT ... { ?s :p "foo" }
> >>>>>
> >>>>>and have that match whether the data that was loaded into the store was
> >>>>>"foo" or "foo"^^xsd:string.
> >>>>>
> >>>>>Recommending that stores canonicalize to "foo" would be one way to
> >>>>>accomplish this, but only for new data. (And even then, is only a
> >>>>>recommendation.) If we changed (or made a SHOULD-style change) literal
> >>>>>equality, then the above query would match against :s :p "foo"^^xsd:string
> >>>>>as well as :s :p "foo", which -- for me -- is the goal of this issue.
> >>>>>
> >>>>
> >>>>Well, have SPARQL decide that the appropriate entailment is
> >>>>{xsd:string}-entailment (that is, D-entailment where D={xsd:string}), and
> >>>>that fixes the necessary matching. Seems to me that this is not RDF
> >>>>business, in fact. RDF already provides the machinery for doing this, all
> >>>>SPARQL has to do is use the existing RDF specs appropriately.
> >>>>
> >>>
> >>>Then maybe I don't understand the original motivation behind ISSUE-12 in
> >>>this working group at all.
> >>>
> >>>*shrug*
> >>>
> >>
> >>From what I can tell based on looking at the charter, the original
> >>motivation was exactly what you stated: to make querying for string data
> >>simpler in SPARQL.
> >>
> >>Unfortunately, the only ways I can see of making that work transparently in
> >>SPARQL are:
> >>1. Follow Pat's suggestion and define SPARQL BGP matching in terms of
> >>{xsd:string}-entailment.
> >>2. Modify the abstract syntax specified in RDF Concepts so that there's only
> >>one way of expressing string data in an RDF literal, which seems to be what
> >>you're asking for.
> >
> >3. Add a little text saying that plain literals are preferred to
> >literals of type xsd:string.
> >
> >The RDB2RDF WG faced this in defining the Direct Mapping of relational
> >databases to RDF. The ISO SQL committee provides a mapping of SQL
> >types to XSD types, and naturally SQL's string types (STRING, CHAR(n),
> >VARCHAR(n)) map to xsd:string. Because we didn't want to needlessly
> >encumber users with a typed literal when a plain literal would do, we
> >overrode the mapping for strings (ints, etc. still map per ISO). A
> >little guidance text could encourage others to do the same and
> >unification will get that much easier.
>
> This isn't a new suggestion; this is apparently what this WG is
> already doing. It's also what I (and Alex) are saying seems like not
> very effective. And what I'm saying is potentially not worth the
> time.

I'm not sure how you're measuring effectiveness, but if gentle
steering is insufficient, you apparently want something mandatory.
I read all entailments as optional so I guess you want to say
something like:

[[
A literal in an RDF graph contains one or two named components.

All literals have a lexical form being a Unicode [UNICODE] string,
which SHOULD be in Normal Form C [NFC].

Plain literals have a lexical form and optionally a language tag as
defined by [RFC-3066], normalized to lowercase.

Typed literals have a lexical form and a datatype URI being an RDF
URI reference.

+ Note: There are no typed literals with the datatype
+ <http://www.w3.org/2001/XMLSchema#string>; any strings should be
+ represented as plain literals.
]] -- http://www.w3.org/TR/2004/REC-rdf-concepts-20040210/#dfn-literal

I think that if we allow both forms to co-exist but be considered
equivalent, implementations are going to have a hard time not
surprising users by swapping representations.

> Lee
>
> >
> >>I'm not fundamentally opposed to either of those approaches, but they both
> >>would require significant changes to deployed code. Given a choice, I would
> >>go with the second one because I don't think the problem is confined to
> >>SPARQL. I personally think that making a breaking change to the abstract
> >>syntax would be worthwhile in this case because string data is so pervasive,
> >>but I wouldn't be surprised if there's backlash from the community over
> >>that.
> >>
> >>The proposed resolution for ISSUE-12 appears to me to be avoiding making any
> >>breaking changes by recommending that data producers prefer one
> >>syntactic form over another. I share your skepticism over how well that
> >>will work in the long run.
> >>
> >>-Alex
> >>
> >>>Lee
> >>>
> >>>>Pat
> >>>>
> >>>>>(SPARQL defines matching based on subgraphs, which in turn is based on
> >>>>>RDF graph equivalence.)
> >>>>>
> >>>>>I'm not an expert on the RDF standards documents, admittedly, so I might
> >>>>>be missing something.
> >>>>>
> >>>>>thanks,
> >>>>>Lee
> >>>>>
> >>>>>On 5/4/2011 6:04 AM, Antoine Zimmermann wrote:
> >>>>>
> >>>>>>Hi,
> >>>>>>
> >>>>>>With respect to ISSUE-12, I propose that we reformulate the resolution
> >>>>>>as follows:
> >>>>>>
> >>>>>>"PROPOSED: Recommend that data publishers use plain literals instead of
> >>>>>>xs:string typed literals and tell systems to silently convert xs:string
> >>>>>>literals to plain literals without language tag."
> >>>>>>
> >>>>>>In the text of the spec, we may want to add some more details, saying:
> >>>>>>
> >>>>>>"In XSD-interpretations, any xs:string-typed literal "aaa"^^xs:string is
> >>>>>>interpreted as the character string "aaa", that is, it is the same as
> >>>>>>the plain literal "aaa". Thus, to ensure a canonical form of character
> >>>>>>strings and better interoperability, we recommend that data publishers
> >>>>>>always use plain literals instead of xs:string typed literals and tell
> >>>>>>systems to silently convert xs:string literals to plain literals without
> >>>>>>language tag whenever they occur in an RDF graph."
> >>>>>>
> >>>>>>Regards,
> >>>>>>
> >>>>>
> >>>>------------------------------------------------------------
> >>>>IHMC (850)434 8903 or (650)494 3973
> >>>>40 South Alcaniz St. (850)202 4416 office
> >>>>Pensacola (850)202 4440 fax
> >>>>FL 32502 (850)291 0667 mobile
> >>>>phayesAT-SIGNihmc.us http://www.ihmc.us/users/phayes
> >>>>
> >>>
> >

-- 
-ericP
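
For illustration only, here is a minimal Python sketch of the "silently convert" behaviour discussed in this thread: xsd:string typed literals are rewritten to plain literals both when data is loaded and when a query pattern is built, so that a pattern like { ?s :p "foo" } matches either spelling. The Literal class, canonicalize, load and subjects_with are hypothetical names invented for this sketch; they are not taken from any RDF or SPARQL implementation, and the store is just an in-memory set of tuples.

```python
from dataclasses import dataclass
from typing import Optional

XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"


@dataclass(frozen=True)
class Literal:
    """Hypothetical literal model: lexical form plus optional datatype or language tag."""
    lexical: str
    datatype: Optional[str] = None
    lang: Optional[str] = None


def canonicalize(lit: Literal) -> Literal:
    """Rewrite "x"^^xsd:string to the plain literal "x"; leave other literals alone."""
    if lit.datatype == XSD_STRING:
        return Literal(lit.lexical)
    return lit


def load(triples):
    """Build a toy store (a set of triples), canonicalizing literal objects on the way in."""
    return {
        (s, p, canonicalize(o) if isinstance(o, Literal) else o)
        for s, p, o in triples
    }


def subjects_with(store, predicate, obj):
    """Toy stand-in for matching { ?s predicate obj } by plain equality."""
    wanted = canonicalize(obj)
    return sorted(s for s, p, o in store if p == predicate and o == wanted)


store = load([
    (":s1", ":p", Literal("foo")),
    (":s2", ":p", Literal("foo", datatype=XSD_STRING)),
    (":s3", ":p", Literal("foo", lang="en")),
])

# Both the plain and the xsd:string spelling match; the language-tagged one does not.
print(subjects_with(store, ":p", Literal("foo")))  # [':s1', ':s2']
```

Note that this only helps for data that passes through load; triples already stored in typed form would need a separate migration, which is the "only for new data" limitation raised in the quoted discussion.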
Received on Wednesday, 4 May 2011 20:01:56 UTC