- From: Richard Cyganiak <richard@cyganiak.de>
- Date: Tue, 6 Sep 2011 14:37:38 +0100
- To: Andy Seaborne <andy.seaborne@epimorphics.com>
- Cc: public-rdf-wg@w3.org
Andy,

On 5 Sep 2011, at 22:51, Andy Seaborne wrote:
> On 19/08/11 14:28, Richard Cyganiak wrote:
>> On 19 Aug 2011, at 00:11, Pat Hayes wrote:
>>> Option 2. All literals have a type. rdf:LangString is a special
>>> datatype whose L2V mapping takes a pair of strings as input and
>>> returns a language-tagged pair as output. This mapping is the
>>> identity mapping on pairs <string, tag>, just as xsd:String is the
>>> identity mapping on single strings. DATATYPE("foo"@en) returns
>>> rdf:LangString, following the normal rules for datatyping.
>>
>> There's also 2b:
>>
>> All literals have a type. rdf:LangString is a special type, where the
>> lexical form is <string,langtag> rather than just a string, and it
>> doesn't have an L2V mapping. The value of an rdf:LangString literal
>> is the same as the lexical form. DATATYPE("foo"@en) returns
>> rdf:LangString, following the normal rules.
>>
>> (The advantage of 2b versus 2 is that the L2V mechanism can remain
>> unchanged. It can remain defined as functions from string to value,
>> rather than functions from anything to value as required by 2. In 2,
>> the L2V of rdf:LangString is just the trivial identity mapping
>> anyways, and resorting to the L2V mapping device just to explain a
>> no-op mapping is overkill.)
>>
>> (2b also makes it easy to re-write the rdf:PlainLiteral spec into a
>> spec titled “An L2V mapping for rdf:LangString” that just defines an
>> L2V mapping that takes "foo@en" to <"foo","en">, while keeping the
>> current restrictions on use of such lexical forms. So I'd hope it
>> would be an easier sell to the OWL/RIF WGs.)
>
> Slight problem:
>
> STR(?x) returns the lexical form of a literal. The language string is the conventional extension to SPARQL in current deployments.
>
> If the lexical form is <string,langtag>, then that would be returned. There is also whether you can write
>
> ???^^rdf:LangString
>
> c.f. rdf:PlainLiteral.

Later in the thread I came around to see that it's better to define it differently: "foo"@en has a lexical form "foo" and a language tag "en". This is how the terminology was used in RDF 2004 and there isn't really any reason to change it.

> A solution is to just say in the syntaxes '''the value of "foo"@en is <foo, en>'''
>
> This leaves L2V alone (it's not used) and answers what happens if you write ???^^rdf:LangString -- it's an ill-defined literal.

Yes, this is basically what I'm advocating now. rdf:langString would still *have* an L2V, but it wouldn't be *used* to define its value, just like you say above. The L2V is the empty mapping and the lexical space is empty and the value space is <lex,lang> pairs. Since the lexical space is empty, "anything"^^rdf:langString is going to be ill-typed.

This “vestigial” datatype definition for rdf:langString is just to meet the formal definition of datatypes in RDF. If we don't do this, then all the machinery around datatypes-as-classes in RDF Semantics breaks (or so I'm told).

> It's also possible to define STR() specifically for language tagged literals to mean the string part.

If you say, “STR() returns the lexical form of a literal” then it should be fine.

Summary of proposal:

rdf:langString typed literals are completely normal typed literals, except:

1. they have a non-empty language tag besides the lexical form
2. their lexical space is empty
3. their value is not L2V(datatypeIRI)(lexicalForm) but instead a pair <lexicalForm, languageTag>

Best,
Richard

> that still leaves open about writing ^^rdf:LangString.
>
> Andy
>
>
>>
>>> option 2: + simplifies literal syntax + removes SPARQL errors +
>>> theoretically clean -- requires change to the datatyping model
>>
>> option 2b: + simplifies literal syntax + removes SPARQL errors + no
>> changes to datatyping model -- introduces one exceptional datatype
>> that works differently from all others
>>
>>> If we say that the L2V mapping takes as input all the syntactic
>>> 'components' of a literal, rather than forcing these to be all
>>> inside one string, then we allow such things as literals with
>>> latitude and longitude denoting positions, complex numbers with
>>> real and imaginary parts, etc., without forcing people to invent
>>> coding tricks (like the trailing '^' in rdf:PlainLiteral) to
>>> artificially map these into a single string. This might be a
>>> genuinely useful extension, in other words.
>>
>> Being able to express lat/long pairs and complex numbers in the
>> abstract syntax isn't really useful if you have no way of writing them down
>> in a concrete syntax. So you either still need to squish them into a
>> single string, or extend your RDF syntax of choice with additional
>> syntactic sugar for expressing that kind of literal.
>>
>>> We can also quietly deprecate rdf:PlainLiteral along with 8-track
>>> tape players.
>>
>> A major motivation for rdf:PlainLiteral is the desire to
>> stick <string,langtag> pairs into a single string, so I'm afraid it
>> won't be quite as easy.
>>
>> Best, Richard
>
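
The three-point summary of the proposal above can be sketched in code. The following is only an illustration under stated assumptions, not part of any specification: the names `Literal`, `value`, `sparql_str`, `sparql_datatype`, `empty_l2v` and `IllTyped` are hypothetical, the L2V registry is a toy with a single ordinary datatype, and the IRI for rdf:langString assumes the spelling used in the summary, placed in the RDF namespace.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional

# Assumed IRIs; the rdf:langString spelling follows the summary in this message.
RDF_LANGSTRING = "http://www.w3.org/1999/02/22-rdf-syntax-ns#langString"
XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"


class IllTyped(ValueError):
    """Raised when a lexical form is outside the datatype's lexical space."""


def empty_l2v(lexical_form: str) -> object:
    # The "vestigial" L2V: the lexical space is empty, so nothing maps.
    raise IllTyped(f"{lexical_form!r} is not in the (empty) lexical space")


# Hypothetical registry of L2V mappings, all plain string -> value functions.
L2V: Dict[str, Callable[[str], object]] = {
    XSD_INTEGER: int,
    RDF_LANGSTRING: empty_l2v,
}


@dataclass(frozen=True)
class Literal:
    lexical_form: str
    datatype: str
    lang: Optional[str] = None  # non-empty only for language-tagged literals


def value(lit: Literal) -> object:
    # Point 3: a language-tagged literal's value is the pair
    # <lexicalForm, languageTag>; its L2V is not consulted.
    if lit.datatype == RDF_LANGSTRING and lit.lang:
        return (lit.lexical_form, lit.lang)
    # Everything else goes through the unchanged L2V mechanism, so
    # "anything"^^rdf:langString (no tag) is ill-typed via empty_l2v.
    return L2V[lit.datatype](lit.lexical_form)


def sparql_str(lit: Literal) -> str:
    # "STR() returns the lexical form of a literal"
    return lit.lexical_form


def sparql_datatype(lit: Literal) -> str:
    return lit.datatype
```

With these definitions, `Literal("foo", RDF_LANGSTRING, "en")` models "foo"@en: `sparql_str` gives the string part, `sparql_datatype` gives the rdf:langString IRI, and `value` gives the pair, while a literal written with the datatype but no language tag raises `IllTyped`.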
Received on Tuesday, 6 September 2011 13:38:10 UTC