RE: lang from Markus Lanthaler on 2013-12-12 (public-rdf-wg@w3.org from December 2013)

From: Markus Lanthaler <markus.lanthaler@gmx.net>
Date: Thu, 12 Dec 2013 17:29:05 +0100
To: <public-rdf-wg@w3.org>
Message-ID: <004101cef757$47a145d0$d6e3d170$@lanthaler@gmx.net>

On Thursday, December 12, 2013 2:19 PM, Sandro Hawke wrote:
> On 12/12/2013 07:09 AM, Andy Seaborne wrote:
> > On 12/12/13 11:16, Markus Lanthaler wrote:
> >
> >> This is completely off-topic and I'm asking it just out of
> >> curiosity: What would have broken if we had decided to
> >> define a datatype for each language?
> >> So instead of rdf:langString we would have had something like
> >> rdf:lang-xxx
> >> similar to the container membership properties rdf:_xx:
> >>
> >>    <> rdfs:comment "An explanation in English"^^rdf:lang-en
> >>
> >
> > We ended up here, at least in part, because XML has language tags,
> > and they are not datatypes.
> >
> > They are case-insensitive matches.
> >
> > "An explanation in English"^^rdf:lang-en
> > owl:sameAs
> > "An explanation in English"^^rdf:lang-EN

Right, that could be dealt with by conformance clauses in the spec and
syntactic shortcuts in concrete serialization formats, so that parsers
convert language tags to those IRIs (after lower-casing the tags).
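
A minimal sketch of that normalization step, in Python. The rdf:lang-
IRIs are hypothetical (they are the idea floated in this thread, not
something RDF defines), and the function name is made up for
illustration:

```python
# Hypothetical: RDF defines no per-language datatypes; "lang-" is only
# the naming scheme discussed in this thread.
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def lang_datatype_iri(tag: str) -> str:
    """Map a language tag to the hypothetical rdf:lang-<tag> IRI.

    Lower-casing the tag first means "EN" and "en" yield the same IRI,
    so no owl:sameAs between differently cased datatypes is needed.
    """
    return RDF_NS + "lang-" + tag.lower()
```

With this, a parser reading "..."@EN and one reading "..."@en would
both produce the same datatype IRI.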


> > and
> > RFC4647 "Matching of Language Tags"
> >
> > Language tags can be compared in ways that classes can't
> >
> >  "en-us" lang-matches "en"
> >  "en-us" lang-matches "en-*"
> >
> > so inheritance-ishness in datatypes is needed, but it's not like XSD
> > derived types on the same value space.

Good point. I'm wondering if this would have any practical consequences
though.
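
For reference, a rough Python sketch of the two RFC 4647 filtering
schemes Andy mentions, as I understand them. The function names are
mine. Basic filtering covers ranges like "en", extended filtering
also covers ranges with wildcards such as "en-*":

```python
def basic_filter(lang_range: str, tag: str) -> bool:
    """RFC 4647 basic filtering: equal, or a prefix followed by '-'."""
    r, t = lang_range.lower(), tag.lower()
    return r == "*" or t == r or t.startswith(r + "-")


def extended_filter(lang_range: str, tag: str) -> bool:
    """RFC 4647 extended filtering, compared subtag by subtag."""
    r = lang_range.lower().split("-")
    t = tag.lower().split("-")
    # The first subtags must match, unless the range starts with "*".
    if r[0] != "*" and r[0] != t[0]:
        return False
    ri, ti = 1, 1
    while ri < len(r):
        if r[ri] == "*":
            ri += 1              # wildcard: consumes nothing in the tag
        elif ti >= len(t):
            return False         # range has subtags left, tag does not
        elif r[ri] == t[ti]:
            ri += 1
            ti += 1
        elif len(t[ti]) == 1:
            return False         # never skip over a singleton subtag
        else:
            ti += 1              # skip a non-matching subtag in the tag
    return True
```

Neither function is a subclass-style relation between datatypes. The
match is computed from the tag strings themselves.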


> > It seems to me that you end up with some additional machinery at
> > which point the nice model of datatypes is a bit lost.
> >
> > With rdf:langString, the language tag is the "additional machinery" 
> > in a way that does not leak out to other datatypes.

Hmm... wouldn't that "additional machinery" be limited to datatype IRIs
starting with "rdf:lang-"?
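
Roughly what I have in mind, as a Python sketch. Again, the "lang-"
prefix is hypothetical and the helper name is made up. Anything that
does not start with the prefix is treated as an ordinary datatype:

```python
from typing import Optional

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
# Hypothetical prefix for per-language datatypes; not defined by RDF.
LANG_PREFIX = RDF_NS + "lang-"


def language_of(datatype_iri: str) -> Optional[str]:
    """Return the lower-cased language tag encoded in a hypothetical
    rdf:lang-<tag> IRI, or None for every other datatype IRI."""
    if datatype_iri.startswith(LANG_PREFIX):
        tag = datatype_iri[len(LANG_PREFIX):]
        # An empty tag is not a language datatype.
        return tag.lower() or None
    return None
```

Language matching would then only ever be applied when language_of()
returns a tag.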


> All true.
> 
> We do not appear to have kept good records on this issue.  I can't
> figure out which issue number it was (sort of 12...?) and the only
> relevant resolution is on a detail:
> 
> 2013/05/22-rdf-wg RESOLVED:  The value space of rdf:langString has the
> language tag in lower case; in the lexical form, the language tag MAY
> be converted to lower case (as RDF 1.0 says, but not everyone does).
> 
> (Personally I've never liked the current design.   My preference was to
> use predicates ( _:someAbstractMessage)---expressedInEnglish--->"...."),

Right, or a variation using two predicates on the blank node (e.g. rdf:value
& rdf:language) which would make it possible to express datatypes the same
way (rdf:value and rdf:type, or a similar property).


> or datatypes.     But we settled this a long time ago, even if I can't
> find the resolution.)

I haven't even looked at the records. It's way too late anyway. I was just
curious whether anything would break (mainly semantics-wise).


--
Markus Lanthaler
@markuslanthaler

Received on Thursday, 12 December 2013 16:29:41 UTC