- From: Ivan Herman <ivan@w3.org>
- Date: Tue, 6 Sep 2011 16:00:06 +0200
- To: Richard Cyganiak <richard@cyganiak.de>
- Cc: Andy Seaborne <andy.seaborne@epimorphics.com>, public-rdf-wg@w3.org
I had an action last week to produce a WBS form clearly spelling out the various options, so that the group could vote on them. Seeing this discussion, I wonder whether we *do* have the options clear for such a vote. If so, I would appreciate it if one of you gave them to me...

Ivan

On Sep 6, 2011, at 15:37 , Richard Cyganiak wrote:

> Andy,
>
> On 5 Sep 2011, at 22:51, Andy Seaborne wrote:
>> On 19/08/11 14:28, Richard Cyganiak wrote:
>>> On 19 Aug 2011, at 00:11, Pat Hayes wrote:
>>>> Option 2. All literals have a type. rdf:LangString is a special
>>>> datatype whose L2V mapping takes a pair of strings as input and
>>>> returns a language-tagged pair as output. This mapping is the
>>>> identity mapping on pairs <string, tag>, just as xsd:String is the
>>>> identity mapping on single strings. DATATYPE("foo"@en) returns
>>>> rdf:LangString, following the normal rules for datatyping.
>>>
>>> There's also 2b:
>>>
>>> All literals have a type. rdf:LangString is a special type, where the
>>> lexical form is <string,langtag> rather than just a string, and it
>>> doesn't have an L2V mapping. The value of an rdf:LangString literal
>>> is the same as the lexical form. DATATYPE("foo"@en) returns
>>> rdf:LangString, following the normal rules.
>>>
>>> (The advantage of 2b versus 2 is that the L2V mechanism can remain
>>> unchanged. It can remain defined as functions from string to value,
>>> rather than functions from anything to value as required by 2. In 2,
>>> the L2V of rdf:LangString is just the trivial identity mapping
>>> anyways, and resorting to the L2V mapping device just to explain a
>>> no-op mapping is overkill.)
>>>
>>> (2b also makes it easy to re-write the rdf:PlainLiteral spec into a
>>> spec titled “An L2V mapping for rdf:LangString” that just defines an
>>> L2V mapping that takes "foo@en" to <"foo","en">, while keeping the
>>> current restrictions on use of such lexical forms. So I'd hope it
>>> would be an easier sell to the OWL/RIF WGs.)
>>
>> Slight problem:
>>
>> STR(?x) returns the lexical form of a literal. The language string is the conventional extension to SPARQL in current deployments.
>>
>> If the lexical form is <string,langtag>, then that would be returned. There is also the question of whether you can write
>>
>> ???^^rdf:LangString
>>
>> c.f. rdf:PlainLiteral.
>
> Later in the thread I came around to see that it's better to define it differently: "foo"@en has a lexical form "foo" and a language tag "en". This is how the terminology was used in RDF 2004 and there isn't really any reason to change it.
>
>> A solution is to just say in the syntaxes '''the value of "foo"@en is <foo, en>'''
>>
>> This leaves L2V alone (it's not used) and answers what happens if you write ???^^rdf:LangString -- it's an ill-defined literal.
>
> Yes, this is basically what I'm advocating now. rdf:langString would still *have* an L2V, but it wouldn't be *used* to define its value, just like you say above. The L2V is the empty mapping and the lexical space is empty and the value space is <lex,lang> pairs. Since the lexical space is empty, "anything"^^rdf:langString is going to be ill-typed.
>
> This “vestigial” datatype definition for rdf:langString is just to meet the formal definition of datatypes in RDF. If we don't do this, then all the machinery around datatypes-as-classes in RDF Semantics breaks (or so I'm told).
>
>> It's also possible to define STR() specifically for language tagged literals to mean the string part.
>
> If you say, “STR() returns the lexical form of a literal” then it should be fine.
>
> Summary of proposal:
>
> rdf:langString typed literals are completely normal typed literals, except:
> 1. they have a non-empty language tag besides the lexical form
> 2. their lexical space is empty
> 3. their value is not L2V(datatypeIRI)(lexicalForm) but instead a pair <lexicalForm, languageTag>
>
> Best,
> Richard
>
>
>> that still leaves open the question of writing ^^rdf:LangString.
>>
>> Andy
>>
>>
>>>
>>>> option 2: + simplifies literal syntax + removes SPARQL errors +
>>>> theoretically clean -- requires change to the datatyping model
>>>
>>> option 2b: + simplifies literal syntax + removes SPARQL errors + no
>>> changes to datatyping model -- introduces one exceptional datatype
>>> that works differently from all others
>>>
>>>> If we say that the L2V mapping takes as input all the syntactic
>>>> 'components' of a literal, rather than forcing these to be all
>>>> inside one string, then we allow such things as literals with
>>>> latitude and longitude denoting positions, complex numbers with
>>>> real and imaginary parts, etc., without forcing people to invent
>>>> coding tricks (like the trailing '^' in rdf:PlainLiteral) to
>>>> artificially map these into a single string. This might be a
>>>> genuinely useful extension, in other words.
>>>
>>> Being able to express lat/long pairs and complex numbers in the
>>> abstract syntax isn't really useful if you have no way of writing
>>> them down in a concrete syntax. So you either still need to squish
>>> them into a single string, or extend your RDF syntax of choice with
>>> additional syntactic sugar for expressing that kind of literal.
>>>
>>>> We can also quietly deprecate rdf:PlainLiteral along with 8-track
>>>> tape players.
>>>
>>> A major motivation for rdf:PlainLiteral is the desire to
>>> stick <string,langtag> pairs into a single string, so I'm afraid it
>>> won't be quite as easy.
>>>
>>> Best, Richard
>>
>
>
----
Ivan Herman, W3C Semantic Web Activity Lead
Home: http://www.w3.org/People/Ivan/
mobile: +31-641044153
PGP Key: http://www.ivan-herman.net/pgpkey.html
FOAF: http://www.ivan-herman.net/foaf.rdf
Received on Tuesday, 6 September 2011 14:00:26 UTC
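
For readers following the thread, the proposal summarized in the quoted message (rdf:langString literals carry a language tag, have an empty lexical space, and take the pair <lexicalForm, languageTag> as their value) can be sketched roughly as below. This is only an illustration under stated assumptions: the Python names Literal, L2V, value, STR, DATATYPE and IllTypedLiteral are hypothetical, the datatypes are written as prefixed-name strings, and nothing here is taken from an RDF library or from spec text.

```python
from dataclasses import dataclass
from typing import Optional

RDF_LANG_STRING = "rdf:langString"  # prefixed name used as a plain key (assumption)


class IllTypedLiteral(ValueError):
    """Raised when a lexical form is outside the datatype's lexical space."""


def _empty_l2v(lexical_form: str):
    # The L2V of rdf:langString is the empty mapping: no string is accepted.
    raise IllTypedLiteral(f'"{lexical_form}"^^{RDF_LANG_STRING} is ill-typed')


# Hypothetical registry of lexical-to-value mappings (string -> value).
L2V = {
    "xsd:string": lambda s: s,
    "xsd:integer": int,
    RDF_LANG_STRING: _empty_l2v,
}


@dataclass(frozen=True)
class Literal:
    lexical_form: str
    datatype: str
    language: Optional[str] = None  # non-empty only for language-tagged literals


def value(lit: Literal):
    # Exception 3: the value of a language-tagged literal is the pair
    # <lexicalForm, languageTag>; the L2V mapping is not consulted.
    if lit.datatype == RDF_LANG_STRING and lit.language:
        return (lit.lexical_form, lit.language)
    # All other literals go through the ordinary L2V mechanism.
    return L2V[lit.datatype](lit.lexical_form)


def STR(lit: Literal) -> str:
    # "STR() returns the lexical form of a literal"
    return lit.lexical_form


def DATATYPE(lit: Literal) -> str:
    return lit.datatype
```

Under these assumptions, Literal("foo", "rdf:langString", "en") stands for "foo"@en: STR gives the string part, DATATYPE gives rdf:langString, and value gives the pair, while a literal written with the rdf:langString datatype and no language tag falls through to the empty L2V and is reported as ill-typed.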