- From: Alex Hall <alexhall@revelytix.com>
- Date: Fri, 13 May 2011 10:33:41 -0400
- To: Pat Hayes <phayes@ihmc.us>
- Cc: Richard Cyganiak <richard@cyganiak.de>, RDF Working Group WG <public-rdf-wg@w3.org>
- Message-ID: <BANLkTimOPpdZ2gs0vjwq5csPH=mskyCzAg@mail.gmail.com>
On Thu, May 12, 2011 at 9:40 PM, Pat Hayes <phayes@ihmc.us> wrote:

>
> On May 12, 2011, at 12:06 PM, Richard Cyganiak wrote:
>
> > On 12 May 2011, at 16:52, Pat Hayes wrote:
> >> I agree with all of this (though I think we could maybe be harsher on xsd:string) but suggest we should additionally explicitly endorse the idea that plain literals are understood as typed with the datatype rdf:PlainLiteral, so that all RDF literals are considered to have a type. And that this should be stated explicitly in Concepts and Semantics, and built into the RDF entailment regime (along with rdf:XMLLiteral).
> >
> > Can you explain the mechanism that you have in mind when you say "plain literals are understood as typed with the datatype rdf:PlainLiteral"?
> >
> > "foo"@en is a plain literal.
> >
> > What datatype does it have? None, or rdf:PlainLiteral?
>
> rdf:PlainLiteral. The idea behind rdf:PlainLiteral, as I understand it, is that *all* RDF literals have a datatype, even plain ones. Otherwise, there really is no point to having it around.
>
> >
> > What is its lexical form? "foo" or "foo@en"?
>
> "foo@en"
>
> is (unfortunately) the only possible answer. The awkward case, which you didn't ask, is that the lexical form of the plain literal "foo" is "foo@". The final '@' signals the lack of a language tag (or, if we prefer, the empty language tag.)
>
> Put this all another way, the RDF plain literal surface forms "foo" and "foo"@en are treated as sugared syntax for the real underlying forms "foo@"^^rdf:PlainLiteral and "foo@en"^^rdf:PlainLiteral. The semantics treats the former as though they were written like the latter, with the datatype mapping "sss@" --> "sss" and "sss@ttt" --> <"sss", 'ttt'>.
>

It's for this reason that I'd prefer to keep rdf:PlainLiteral out of the core RDF specs and reserve it for exchanging language-tagged literals with systems that don't support that notion. Having to deal with the extraneous '@' for literals without language tags seems like needless complexity for what should be a simple string manipulation. If we're going to say that everything has a datatype, I'd prefer to see "foo" get normalized to "foo"^^xsd:string. But my reasons there are more aesthetic; it just seems wrong to single out that one particular primitive datatype and say that it should not be used.

FWIW, my preferred approach would be to:
1. Say that every literal has *either* a datatype *or* a language tag.
2. Say that the datatype of the surface form "foo" is xsd:string.

I also recognize that I seem to be in the minority on this one. As long as the surface forms "foo" and "foo"^^xsd:string get normalized to the same thing (or systems have permission to do such normalization) then I'm happy.

-Alex

>
> >> I would suggest one more extension, an additional datatype rdf:PlainLiteralString, which is also built into basic RDF. This is similar to PlainLiteral but ignores the language tag, so it treats "foo"+EN as equal to "foo". This would help the users that Andy mentioned who want to ignore language tags in queries. We can build this into the basic RDF entailment regime along with PlainLiteral.
> >
> > I don't think that this helps the users that Andy mentioned.
>
> Andy seems to agree. Well in that case, forget the idea.
>
> Pat
>
> > The problem is that "foo" != "foo"@en in SPARQL, and this confuses people who have not wrapped their head around the idea that strings in SPARQL can have this extra bit called a language tag attached. Introducing a new string data type doesn't change anything about this situation.
> >
> > Best,
> > Richard
> >
> >>
> >> These two datatypes are unique in that they apply to plain literal syntax, which is a good 'theoretical' reason to include them in the RDF layer of the specs in any case.
> >>
> >> Pat
> >>
> >> On May 11, 2011, at 4:23 PM, Richard Cyganiak wrote:
> >>
> >>> I took an action today to draft text for RDF Concepts that resolves ISSUE-12. I put it on the wiki here:
> >>> http://www.w3.org/2011/rdf-wg/wiki/StringLiterals/EntailmentProposal
> >>> A plain text copy is attached below.
> >>>
> >>> Best,
> >>> Richard
> >>>
> >>>
> >>> SHORT SUMMARY
> >>>
> >>> 1. RDF Concepts puts more emphasis on the distinction between (syntactic) “literal equality” and (semantic, important for applications) “value equality”
> >>> 2. RDF Concepts explicitly points out the specific string value equalities that already arise from RDF Semantics
> >>> 3. RDF Concepts declares one of the string literal forms as canonical
> >>> 4. Implementations MAY canonicalize, but don't have to
> >>> 5. The canonical form is plain literals.
> >>>
> >>>
> >>> WHY?
> >>>
> >>> 1. No changes to the abstract syntax required
> >>> 2. No changes to any concrete syntax or parser required
> >>> 3. No changes to any implementations of any of the existing entailment regimes required
> >>> 4. Those who are ok with canonicalization can do that, and don't need to deal with entailment
> >>> 5. Those who don't want to canonicalize, have the option of supporting only string value equality at query time, without RDFS- and D-Entailment
> >>> 6. “MAY canonicalize” softly discourages the use of xsd:string typed literals, without abolishing them outright or declaring them archaic
> >>> 7. Standardizing on xsd:string was never an option because of language tags
> >>> 8. Standardizing on rdf:PlainLiteral was never an option because it MUST NOT be used in serializations that support plain literals
> >>>
> >>>
> >>> CHANGES TO 6.5.2 The Value Corresponding to a Typed Literal
> >>> http://www.w3.org/TR/rdf-concepts/#section-Literal-Value
> >>>
> >>> §1 Rename it to “6.5.1 The Value Corresponding to a Literal” and move it ahead of 6.5.1
> >>>
> >>> §2 Add to the beginning:
> >>> “The value of a plain literal without language tag is the same Unicode string as its lexical form.
> >>>
> >>> The value of a plain literal with language tag is a pair consisting of 1. the same Unicode string as its lexical form, and 2. its language tag.
> >>>
> >>> For typed literals, …” (continue with rest of section as is)
> >>>
> >>> §3 Remove the Note at the end of the section
> >>>
> >>>
> >>> CHANGES TO 6.5.1 Literal Equality
> >>> http://www.w3.org/TR/rdf-concepts/#section-Literal-Equality
> >>>
> >>> §4 Rename section to “6.5.2 Literal Equality and Canonical Forms”
> >>>
> >>> §5 Add to the beginning:
> >>> “Equality of literals can be evaluated based on their syntax, or based on their value.”
> >>>
> >>> §6 Change “Two literals are equal …” to: “Two literals are syntactically equal …” in the current first paragraph.
> >>>
> >>> §7 Add to the end:
> >>> “In application contexts, comparing the values of literals (see section 6.5.1) is usually more helpful than comparing their syntactic forms. Literals with different lexical forms and with different datatypes can have the same value. In particular:
> >>>
> >>> - A plain literal with lexical form aaa and no language tag has the same value as a typed literal with lexical form aaa and datatype IRI xsd:string
> >>> - A plain literal with lexical form aaa and no language tag has the same value as a typed literal with lexical form aaa@ and datatype IRI rdf:PlainLiteral
> >>> - A plain literal with lexical form aaa and language tag xx has the same value as a typed literal with lexical form aaa@xx and datatype IRI rdf:PlainLiteral”
> >>>
> >>> §8 “Some literals are canonical forms. Implementations MAY replace any literal with a canonical form if both are syntactically different, but have the same value. All plain literals, with or without language tag, are canonical forms.”
> >>>
> >>>
> >>> CHANGES TO 6.3 Graph Equivalence
> >>> http://www.w3.org/TR/rdf-concepts/#section-graph-equality
> >>>
> >>> §9 Append this leftover sentence, which was removed from 6.5.1:
> >>> “Note: For comparing RDF Graphs, semantic notions of entailment (see [RDF-SEMANTICS]) are usually more helpful than the syntactic equivalence defined here.”
> >>>
> >>>
> >>> EXTENDING THIS TO NUMERIC LITERALS???
> >>>
> >>> (While we're at it, we might also cover equalities between the built-in numeric XSD types, and between different lexical forms of the same built-in XSD datatype.)
> >>>
> >>
> >> ------------------------------------------------------------
> >> IHMC                                     (850)434 8903 or (650)494 3973
> >> 40 South Alcaniz St.           (850)202 4416   office
> >> Pensacola                            (850)202 4440   fax
> >> FL 32502                              (850)291 0667   mobile
> >> phayesAT-SIGNihmc.us       http://www.ihmc.us/users/phayes
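
The datatype mapping Pat describes ("sss@" --> "sss", "sss@ttt" --> <"sss", 'ttt'>) and the value equalities listed in §7 of Richard's draft can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names Literal, value_of and canonicalize are hypothetical and do not come from any spec or library, only the three literal forms discussed in this thread are handled, and language tags are compared as written (no case normalization).

```python
from typing import NamedTuple, Optional, Tuple, Union

XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"
RDF_PLAIN_LITERAL = "http://www.w3.org/1999/02/22-rdf-syntax-ns#PlainLiteral"


class Literal(NamedTuple):
    """Hypothetical literal record: lexical form plus optional datatype IRI or language tag."""
    lexical: str
    datatype: Optional[str] = None
    lang: Optional[str] = None


# A value is either a plain string or a (string, language tag) pair.
Value = Union[str, Tuple[str, str]]


def value_of(lit: Literal) -> Value:
    """Return the value of a plain, xsd:string, or rdf:PlainLiteral literal."""
    if lit.datatype is None:
        # Plain literal: the string itself, paired with the tag if there is one.
        return (lit.lexical, lit.lang) if lit.lang else lit.lexical
    if lit.datatype == XSD_STRING:
        return lit.lexical
    if lit.datatype == RDF_PLAIN_LITERAL:
        # Split at the last '@'; an empty tag means "no language tag".
        text, sep, tag = lit.lexical.rpartition("@")
        if not sep:
            raise ValueError("rdf:PlainLiteral lexical form must contain '@'")
        return (text, tag) if tag else text
    raise ValueError("datatype not covered by this sketch")


def canonicalize(lit: Literal) -> Literal:
    """Rewrite a literal as the plain literal with the same value (the MAY in §8)."""
    value = value_of(lit)
    if isinstance(value, tuple):
        return Literal(value[0], lang=value[1])
    return Literal(value)
```

With this sketch, Literal("foo"), Literal("foo", XSD_STRING) and Literal("foo@", RDF_PLAIN_LITERAL) all have the value "foo" and canonicalize to the same plain literal, while Literal("foo@en", RDF_PLAIN_LITERAL) canonicalizes to the plain literal "foo"@en. Alex's alternative (normalizing "foo" to "foo"^^xsd:string) would only change which form canonicalize returns, not the value comparison.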
Received on Friday, 13 May 2011 14:34:09 UTC