- From: Lee Feigenbaum <lee@thefigtrees.net>
- Date: Fri, 13 May 2011 10:48:12 -0400
- To: Alex Hall <alexhall@revelytix.com>
- CC: Pat Hayes <phayes@ihmc.us>, Richard Cyganiak <richard@cyganiak.de>, RDF Working Group WG <public-rdf-wg@w3.org>
On 5/13/2011 10:33 AM, Alex Hall wrote:
> On Thu, May 12, 2011 at 9:40 PM, Pat Hayes <phayes@ihmc.us> wrote:
>
>
> On May 12, 2011, at 12:06 PM, Richard Cyganiak wrote:
>
> > On 12 May 2011, at 16:52, Pat Hayes wrote:
> >> I agree with all of this (though I think we could maybe be
> harsher on xsd:string) but suggest we should additionally explicitly
> endorse the idea that plain literals are understood as typed with
> the datatype rdf:PlainLiteral, so that all RDF literals are
> considered to have a type. And that this should be stated explicitly
> in Concepts and Semantics, and built into the RDF entailment regime
> (along with rdf:XMLLiteral).
> >
> > Can you explain the mechanism that you have in mind when you say
> "plain literals are understood as typed with the datatype
> rdf:PlainLiteral"?
> >
> > "foo"@en is a plain literal.
> >
> > What datatype does it have? None, or rdf:PlainLiteral?
>
> rdf:PlainLiteral. The idea behind rdf:PlainLiteral, as I understand
> it, is that *all* RDF literals have a datatype, even plain ones.
> Otherwise, there really is no point to having it around.
>
> >
> > What is its lexical form? "foo" or "foo@en"?
>
> "foo@en"
>
> is (unfortunately) the only possible answer. The awkward case, which
> you didn't ask, is that the lexical form of the plain literal "foo" is
> "foo@". The final '@' signals the lack of a language tag (or, if we
> prefer, the empty language tag.)
>
> Put all this another way, the RDF plain literal surface forms "foo"
> and "foo"@en are treated as sugared syntax for the real underlying
> forms "foo@"^^rdf:PlainLiteral and "foo@en"^^rdf:PlainLiteral. The
> semantics treats the former as though they were written like the
> latter, with the datatype mapping "sss@" --> "sss" and "sss@ttt" -->
> <"sss", 'ttt'>.
>
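A minimal sketch of the rdf:PlainLiteral mapping described above
("sss@" --> "sss" and "sss@ttt" --> <"sss", 'ttt'>). The function names
are hypothetical, and splitting at the last '@' is an assumption made
here so that a string part containing '@' still parses; nothing in this
thread specifies that detail.

```python
from typing import Tuple, Union

# A value is either a bare string or a (string, language tag) pair.
PlainValue = Union[str, Tuple[str, str]]


def plain_literal_value(lexical_form: str) -> PlainValue:
    """Map an rdf:PlainLiteral lexical form to its value (illustrative only)."""
    # Assumption: the language tag is whatever follows the LAST '@'.
    text, sep, tag = lexical_form.rpartition("@")
    if not sep:
        raise ValueError("lexical form must contain '@'")
    # An empty tag means "no language tag": the value is just the string.
    return text if tag == "" else (text, tag)


def to_plain_literal_lexical(text: str, lang: str = "") -> str:
    """Build the lexical form for a surface literal such as "foo" or "foo"@en."""
    return f"{text}@{lang}"
```

With this sketch, the surface form "foo" round-trips through "foo@" and
"foo"@en through "foo@en", which shows the extra '@' that untagged
literals pick up.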
>
> It's for this reason that I'd prefer to keep rdf:PlainLiteral out of the
> core RDF specs and reserve it for exchanging language-tagged literals
> with systems that don't support that notion. Having to deal with the
> extraneous '@' for literals without language tags seems like needless
> complexity for what should be a simple string manipulation.
>
> If we're going to say that everything has a datatype, I'd prefer to see
> "foo" get normalized to "foo"^^xsd:string. But my reasons there are
> more aesthetic; it just seems wrong to single out that one particular
> primitive datatype and say that it should not be used.
>
> FWIW, my preferred approach would be to:
> 1. Say that every literal has *either* a datatype *or* a language tag.
> 2. Say that the datatype of the surface form "foo" is xsd:string.
I also prefer this approach. I don't really understand the preference
for normalizing to a plain literal with no datatype or language tag. I
know Andy talked about users wanting similarity between language tagged
literals and simple string literals, but I don't really even know what
wanting that similarity means.
Also note that, as has already been mentioned, the SPARQL
datatype(...) function already specifies that datatype("foo") is
xsd:string.
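A rough sketch of the approach Alex describes (every literal has either
a datatype or a language tag, and the surface form "foo" gets
xsd:string). The Literal class and normalize function are hypothetical
names used only for illustration, not anything from an existing RDF
library.

```python
from dataclasses import dataclass
from typing import Optional

XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"


@dataclass(frozen=True)
class Literal:
    lexical: str
    datatype: Optional[str] = None  # datatype IRI, if any
    lang: Optional[str] = None      # language tag, if any


def normalize(lit: Literal) -> Literal:
    """Give the bare surface form "foo" the datatype xsd:string."""
    if lit.datatype is None and lit.lang is None:
        return Literal(lit.lexical, datatype=XSD_STRING)
    # Language-tagged and already-typed literals are left unchanged.
    return lit
```

Under this sketch, "foo" and "foo"^^xsd:string normalize to the same
literal, while "foo"@en stays distinct from both.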
> I also recognize that I seem to be in the minority on this one. As long
> as the surface forms "foo" and "foo"^^xsd:string get normalized to the
> same thing (or systems have permission to do such normalization) then
> I'm happy.
Yes, I can live with this outcome as well.
Lee
> -Alex
>
>
> >> I would suggest one more extension, an additional datatype
> rdf:PlainLiteralString, which is also built into basic RDF. This is
> similar to PlainLiteral but ignores the language tag, so it treats
> "foo"+EN as equal to "foo". This would help the users that Andy
> mentioned who want to ignore language tags in queries. We can build
> this into the basic RDF entailment regime along with PlainLiteral.
> >
> > I don't think that this helps the users that Andy mentioned.
>
> Andy seems to agree. Well in that case, forget the idea.
>
> Pat
>
>
> > The problem is that "foo" != "foo"@en in SPARQL, and this
> confuses people who have not wrapped their head around the idea that
> strings in SPARQL can have this extra bit called a language tag
> attached. Introducing a new string data type doesn't change anything
> about this situation.
> >
> > Best,
> > Richard
> >
> >
> >
> >>
> >> These two datatypes are unique in that they apply to plain
> literal syntax, which is a good 'theoretical' reason to include them
> in the RDF layer of the specs in any case.
> >>
> >> Pat
> >>
> >> On May 11, 2011, at 4:23 PM, Richard Cyganiak wrote:
> >>
> >>> I took an action today to draft text for RDF Concepts that
> resolves ISSUE-12. I put it on the wiki here:
> >>> http://www.w3.org/2011/rdf-wg/wiki/StringLiterals/EntailmentProposal
> >>> A plain text copy is attached below.
> >>>
> >>> Best,
> >>> Richard
> >>>
> >>>
> >>>
> >>> SHORT SUMMARY
> >>>
> >>> 1. RDF Concepts puts more emphasis on the distinction between
> (syntactic) “literal equality” and (semantic, important for
> applications) “value equality”
> >>> 2. RDF Concepts explicitly points out the specific string value
> equalities that already arise from RDF Semantics
> >>> 3. RDF Concepts declares one of the string literal forms as
> canonical
> >>> 4. Implementations MAY canonicalize, but don't have to
> >>> 5. The canonical form is plain literals.
> >>>
> >>>
> >>> WHY?
> >>>
> >>> 1. No changes to the abstract syntax required
> >>> 2. No changes to any concrete syntax or parser required
> >>> 3. No changes to any implementations of any of the existing
> entailment regimes required
> >>> 4. Those who are ok with canonicalization can do that, and
> don't need to deal with entailment
> >>> 5. Those who don't want to canonicalize, have the option of
> supporting only string value equality at query time, without RDFS-
> and D-Entailment
> >>> 6. “MAY canonicalize” softly discourages the use of xsd:string
> typed literals, without abolishing them outright or declaring them
> archaic
> >>> 7. Standardizing on xsd:string was never an option because of
> language tags
> >>> 8. Standardizing on rdf:PlainLiteral was never an option
> because it MUST NOT be used in serializations that support plain
> literals
> >>>
> >>>
> >>> CHANGES TO 6.5.2 The Value Corresponding to a Typed Literal
> >>> http://www.w3.org/TR/rdf-concepts/#section-Literal-Value
> >>>
> >>>
> >>> §1 Rename it to “6.5.1 The Value Corresponding to a Literal”
> and move it ahead of 6.5.1
> >>>
> >>> §2 Add to the beginning:
> >>> “The value of a plain literal without language tag is the same
> Unicode string as its lexical form.
> >>>
> >>> The value of a plain literal with language tag is a pair
> consisting of 1. the same Unicode string as its lexical form, and 2.
> its language tag.
> >>>
> >>> For typed literals, …” (continue with rest of section as is)
> >>>
> >>> §3 Remove the Note at the end of the section
> >>>
> >>>
> >>> CHANGES TO 6.5.1 Literal Equality
> >>> http://www.w3.org/TR/rdf-concepts/#section-Literal-Equality
> >>>
> >>>
> >>> §4 Rename section to “6.5.2 Literal Equality and Canonical Forms”
> >>>
> >>> §5 Add to the beginning:
> >>> “Equality of literals can be evaluated based on their syntax,
> or based on their value.”
> >>>
> >>> §6 Change “Two literals are equal …” to: “Two literals are
> syntactically equal …” in the current first paragraph.
> >>>
> >>> §7 Add to the end:
> >>> “In application contexts, comparing the values of literals (see
> section 6.5.1) is usually more helpful than comparing their
> syntactic forms. Literals with different lexical forms and with
> different datatypes can have the same value. In particular:
> >>>
> >>> - A plain literal with lexical form aaa and no language tag has
> the same value as a typed literal with lexical form aaa and datatype
> IRI xsd:string
> >>> - A plain literal with lexical form aaa and no language tag has
> the same value as a typed literal with lexical form aaa@ and
> datatype IRI rdf:PlainLiteral
> >>> - A plain literal with lexical form aaa and language tag xx has
> the same value as a typed literal with lexical form aaa@xx and
> datatype IRI rdf:PlainLiteral”
> >>>
> >>> §8 “Some literals are canonical forms. Implementations MAY
> replace any literal with a canonical form if both are syntactically
> different, but have the same value. All plain literals, with or
> without language tag, are canonical forms.”
> >>>
> >>>
> >>> CHANGES TO 6.3 Graph Equivalence
> >>> http://www.w3.org/TR/rdf-concepts/#section-graph-equality
> >>>
> >>>
> >>> §9 Append this leftover sentence, which was removed from 6.5.1:
> >>> “Note: For comparing RDF Graphs, semantic notions of entailment
> (see [RDF-SEMANTICS]) are usually more helpful than the syntactic
> equivalence defined here.”
> >>>
> >>>
> >>> EXTENDING THIS TO NUMERIC LITERALS???
> >>>
> >>> (While we're at it, we might also cover equalities between the
> built-in numeric XSD types, and between different lexical forms of
> the same built-in XSD datatype.)
> >>>
> >>
> >> ------------------------------------------------------------
> >> IHMC (850)434 8903 or (650)494 3973
> >> 40 South Alcaniz St. (850)202 4416 office
> >> Pensacola (850)202 4440 fax
> >> FL 32502 (850)291 0667 mobile
> >> phayesAT-SIGNihmc.us http://www.ihmc.us/users/phayes
Received on Friday, 13 May 2011 14:48:35 UTC