- From: Lee Feigenbaum <lee@thefigtrees.net>
- Date: Fri, 13 May 2011 10:48:12 -0400
- To: Alex Hall <alexhall@revelytix.com>
- CC: Pat Hayes <phayes@ihmc.us>, Richard Cyganiak <richard@cyganiak.de>, RDF Working Group WG <public-rdf-wg@w3.org>
On 5/13/2011 10:33 AM, Alex Hall wrote:
> On Thu, May 12, 2011 at 9:40 PM, Pat Hayes <phayes@ihmc.us> wrote:
>
>     On May 12, 2011, at 12:06 PM, Richard Cyganiak wrote:
>
>     > On 12 May 2011, at 16:52, Pat Hayes wrote:
>     >> I agree with all of this (though I think we could maybe be
>     >> harsher on xsd:string) but suggest we should additionally
>     >> explicitly endorse the idea that plain literals are understood
>     >> as typed with the datatype rdf:PlainLiteral, so that all RDF
>     >> literals are considered to have a type. And that this should be
>     >> stated explicitly in Concepts and Semantics, and built into the
>     >> RDF entailment regime (along with rdf:XMLLiteral).
>     >
>     > Can you explain the mechanism that you have in mind when you say
>     > "plain literals are understood as typed with the datatype
>     > rdf:PlainLiteral"?
>     >
>     > "foo"@en is a plain literal.
>     >
>     > What datatype does it have? None, or rdf:PlainLiteral?
>
>     rdf:PlainLiteral. The idea behind rdf:PlainLiteral, as I understand
>     it, is that *all* RDF literals have a datatype, even plain ones.
>     Otherwise, there really is no point to having it around.
>
>     > What is its lexical form? "foo" or "foo@en"?
>
>     "foo@en" is (unfortunately) the only possible answer. The awkward
>     case, which you didn't ask about, is that the lexical form of the
>     plain literal "foo" is "foo@". The final '@' signals the lack of a
>     language tag (or, if we prefer, the empty language tag.)
>
>     Put this all another way, the RDF plain literal surface forms "foo"
>     and "foo"@en are treated as sugared syntax for the real underlying
>     forms "foo@"^^rdf:PlainLiteral and "foo@en"^^rdf:PlainLiteral. The
>     semantics treats the former as though they were written like the
>     latter, with the datatype mapping "sss@" --> "sss" and
>     "sss@ttt" --> <"sss", 'ttt'>.
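The datatype mapping Pat describes above can be sketched in a few lines of Python. This is only an illustration, not text from any spec: the function names plain_literal_value and to_plain_literal_lexical are hypothetical, and splitting at the last '@' is an assumption that relies on language tags never containing '@'.

```python
from typing import Optional, Tuple, Union

# A value is either a bare string or a (string, language tag) pair.
PlainValue = Union[str, Tuple[str, str]]


def plain_literal_value(lexical_form: str) -> PlainValue:
    """Map an rdf:PlainLiteral lexical form to its value.

    "sss@"    --> "sss"
    "sss@ttt" --> ("sss", "ttt")
    """
    # Assumption: split at the LAST '@', so that strings which
    # themselves contain '@' keep their content intact.
    text, sep, lang = lexical_form.rpartition("@")
    if not sep:
        raise ValueError("lexical form must contain an '@'")
    return text if lang == "" else (text, lang)


def to_plain_literal_lexical(text: str, lang: Optional[str] = None) -> str:
    """Desugar a plain literal surface form into the '@'-suffixed form."""
    return f"{text}@{lang or ''}"
```

For example, to_plain_literal_lexical("foo") produces the awkward trailing-'@' form "foo@" that the replies below object to.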
>
> It's for this reason that I'd prefer to keep rdf:PlainLiteral out of
> the core RDF specs and reserve it for exchanging language-tagged
> literals with systems that don't support that notion. Having to deal
> with the extraneous '@' for literals without language tags seems like
> needless complexity for what should be a simple string manipulation.
>
> If we're going to say that everything has a datatype, I'd prefer to see
> "foo" get normalized to "foo"^^xsd:string. But my reasons there are
> more aesthetic; it just seems wrong to single out that one particular
> primitive datatype and say that it should not be used.
>
> FWIW, my preferred approach would be to:
> 1. Say that every literal has *either* a datatype *or* a language tag.
> 2. Say that the datatype of the surface form "foo" is xsd:string.

I also prefer this approach. I don't really understand the preference
for normalizing to a plain literal with no datatype or language tag. I
know Andy talked about users wanting similarity between language-tagged
literals and simple string literals, but I don't really even know what
wanting that similarity means.

Also, note that (as has been mentioned already), the SPARQL
datatype(...) function already specifically says datatype("foo") is
xsd:string.

> I also recognize that I seem to be in the minority on this one. As long
> as the surface forms "foo" and "foo"^^xsd:string get normalized to the
> same thing (or systems have permission to do such normalization) then
> I'm happy.

Yes, I can live with this outcome as well.

Lee

> -Alex
>
>     >> I would suggest one more extension, an additional datatype
>     >> rdf:PlainLiteralString, which is also built into basic RDF. This
>     >> is similar to PlainLiteral but ignores the language tag, so it
>     >> treats "foo"+EN as equal to "foo". This would help the users
>     >> that Andy mentioned who want to ignore language tags in queries.
>     >> We can build this into the basic RDF entailment regime along
>     >> with PlainLiteral.
>     >
>     > I don't think that this helps the users that Andy mentioned.
>
>     Andy seems to agree. Well in that case, forget the idea.
>
>     Pat
>
>     > The problem is that "foo" != "foo"@en in SPARQL, and this
>     > confuses people who have not wrapped their head around the idea
>     > that strings in SPARQL can have this extra bit called a language
>     > tag attached. Introducing a new string data type doesn't change
>     > anything about this situation.
>     >
>     > Best,
>     > Richard
>     >
>     >> These two datatypes are unique in that they apply to plain
>     >> literal syntax, which is a good 'theoretical' reason to include
>     >> them in the RDF layer of the specs in any case.
>     >>
>     >> Pat
>     >>
>     >> On May 11, 2011, at 4:23 PM, Richard Cyganiak wrote:
>     >>
>     >>> I took an action today to draft text for RDF Concepts that
>     >>> resolves ISSUE-12. I put it on the wiki here:
>     >>> http://www.w3.org/2011/rdf-wg/wiki/StringLiterals/EntailmentProposal
>     >>> A plain text copy is attached below.
>     >>>
>     >>> Best,
>     >>> Richard
>     >>>
>     >>>
>     >>> SHORT SUMMARY
>     >>>
>     >>> 1. RDF Concepts puts more emphasis on the distinction between
>     >>>    (syntactic) “literal equality” and (semantic, important for
>     >>>    applications) “value equality”
>     >>> 2. RDF Concepts explicitly points out the specific string value
>     >>>    equalities that already arise from RDF Semantics
>     >>> 3. RDF Concepts declares one of the string literal forms as
>     >>>    canonical
>     >>> 4. Implementations MAY canonicalize, but don't have to
>     >>> 5. The canonical form is plain literals.
>     >>>
>     >>>
>     >>> WHY?
>     >>>
>     >>> 1. No changes to the abstract syntax required
>     >>> 2. No changes to any concrete syntax or parser required
>     >>> 3. No changes to any implementations of any of the existing
>     >>>    entailment regimes required
>     >>> 4. Those who are ok with canonicalization can do that, and
>     >>>    don't need to deal with entailment
>     >>> 5. Those who don't want to canonicalize, have the option of
>     >>>    supporting only string value equality at query time, without
>     >>>    RDFS- and D-Entailment
>     >>> 6. “MAY canonicalize” softly discourages the use of xsd:string
>     >>>    typed literals, without abolishing them outright or
>     >>>    declaring them archaic
>     >>> 7. Standardizing on xsd:string was never an option because of
>     >>>    language tags
>     >>> 8. Standardizing on rdf:PlainLiteral was never an option
>     >>>    because it MUST NOT be used in serializations that support
>     >>>    plain literals
>     >>>
>     >>>
>     >>> CHANGES TO 6.5.2 The Value Corresponding to a Typed Literal
>     >>> http://www.w3.org/TR/rdf-concepts/#section-Literal-Value
>     >>>
>     >>> §1 Rename it to “6.5.1 The Value Corresponding to a Literal”
>     >>> and move it ahead of 6.5.1
>     >>>
>     >>> §2 Add to the beginning:
>     >>> “The value of a plain literal without language tag is the same
>     >>> Unicode string as its lexical form.
>     >>>
>     >>> The value of a plain literal with language tag is a pair
>     >>> consisting of 1. the same Unicode string as its lexical form,
>     >>> and 2. its language tag.
>     >>>
>     >>> For typed literals, …” (continue with rest of section as is)
>     >>>
>     >>> §3 Remove the Note at the end of the section
>     >>>
>     >>>
>     >>> CHANGES TO 6.5.1 Literal Equality
>     >>> http://www.w3.org/TR/rdf-concepts/#section-Literal-Equality
>     >>>
>     >>> §4 Rename section to “6.5.2 Literal Equality and Canonical
>     >>> Forms”
>     >>>
>     >>> §5 Add to the beginning:
>     >>> “Equality of literals can be evaluated based on their syntax,
>     >>> or based on their value.”
>     >>>
>     >>> §6 Change “Two literals are equal …” to: “Two literals are
>     >>> syntactically equal …” in the current first paragraph.
>     >>>
>     >>> §7 Add to the end:
>     >>> “In application contexts, comparing the values of literals (see
>     >>> section 6.5.1) is usually more helpful than comparing their
>     >>> syntactic forms. Literals with different lexical forms and with
>     >>> different datatypes can have the same value. In particular:
>     >>>
>     >>> - A plain literal with lexical form aaa and no language tag has
>     >>>   the same value as a typed literal with lexical form aaa and
>     >>>   datatype IRI xsd:string
>     >>> - A plain literal with lexical form aaa and no language tag has
>     >>>   the same value as a typed literal with lexical form aaa@ and
>     >>>   datatype IRI rdf:PlainLiteral
>     >>> - A plain literal with lexical form aaa and language tag xx has
>     >>>   the same value as a typed literal with lexical form aaa@xx
>     >>>   and datatype IRI rdf:PlainLiteral”
>     >>>
>     >>> §8 “Some literals are canonical forms. Implementations MAY
>     >>> replace any literal with a canonical form if both are
>     >>> syntactically different, but have the same value. All plain
>     >>> literals, with or without language tag, are canonical forms.”
>     >>>
>     >>>
>     >>> CHANGES TO 6.3 Graph Equivalence
>     >>> http://www.w3.org/TR/rdf-concepts/#section-graph-equality
>     >>>
>     >>> §9 Append this leftover sentence, which was removed from 6.5.1:
>     >>> “Note: For comparing RDF Graphs, semantic notions of entailment
>     >>> (see [RDF-SEMANTICS]) are usually more helpful than the
>     >>> syntactic equivalence defined here.”
>     >>>
>     >>>
>     >>> EXTENDING THIS TO NUMERIC LITERALS???
>     >>>
>     >>> (While we're at it, we might also cover equalities between the
>     >>> built-in numeric XSD types, and between different lexical forms
>     >>> of the same built-in XSD datatype.)
>     >>>
>     >>
>     >> ------------------------------------------------------------
>     >> IHMC                    (850)434 8903 or (650)494 3973
>     >> 40 South Alcaniz St.    (850)202 4416   office
>     >> Pensacola               (850)202 4440   fax
>     >> FL 32502                (850)291 0667   mobile
>     >> phayesAT-SIGNihmc.us    http://www.ihmc.us/users/phayes
>
>     ------------------------------------------------------------
>     IHMC                    (850)434 8903 or (650)494 3973
>     40 South Alcaniz St.    (850)202 4416   office
>     Pensacola               (850)202 4440   fax
>     FL 32502                (850)291 0667   mobile
>     phayesAT-SIGNihmc.us    http://www.ihmc.us/users/phayes
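To make the distinction in Richard's quoted proposal concrete (syntactic equality versus value equality in §5 to §7, and optional canonicalization to plain literals in §8), here is a rough Python sketch. It is a simplification under stated assumptions, not wording from any spec: the Literal class and the value and canonicalize functions are hypothetical names, only the two string-like datatypes are handled, and language tags are compared exactly rather than case-insensitively.

```python
from dataclasses import dataclass
from typing import Optional, Tuple, Union

XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"
RDF_PLAIN_LITERAL = "http://www.w3.org/1999/02/22-rdf-syntax-ns#PlainLiteral"


@dataclass(frozen=True)
class Literal:
    """Hypothetical literal model; dataclass '==' plays the role of
    syntactic equality (same lexical form, datatype and language tag)."""
    lexical: str
    datatype: Optional[str] = None  # None means "plain literal"
    lang: Optional[str] = None


def value(lit: Literal) -> Union[str, Tuple[str, str]]:
    """Return the value: a string, or a (string, language tag) pair."""
    if lit.datatype is None:
        return lit.lexical if lit.lang is None else (lit.lexical, lit.lang)
    if lit.datatype == XSD_STRING:
        return lit.lexical
    if lit.datatype == RDF_PLAIN_LITERAL:
        # Assumption: the last '@' separates the string from the tag.
        text, sep, lang = lit.lexical.rpartition("@")
        if not sep:
            raise ValueError("rdf:PlainLiteral lexical form needs an '@'")
        return text if lang == "" else (text, lang)
    raise NotImplementedError("only string-like datatypes are sketched")


def canonicalize(lit: Literal) -> Literal:
    """Replace a typed string literal with the plain literal of equal value."""
    if lit.datatype is None:
        return lit  # plain literals are already canonical
    v = value(lit)
    if isinstance(v, tuple):
        return Literal(v[0], lang=v[1])
    return Literal(v)
```

Under this sketch, Literal("aaa") and Literal("aaa", datatype=XSD_STRING) are syntactically different but have the same value, and canonicalize maps the second onto the first. Alex's and Lee's preferred alternative would flip the direction and normalize toward xsd:string instead.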
Received on Friday, 13 May 2011 14:48:35 UTC