- From: Pierre-Antoine Champin <pierre-antoine.champin@liris.cnrs.fr>
- Date: Wed, 18 May 2011 11:07:10 +0200
- To: Steve Harris <steve.harris@garlik.com>
- CC: RDF Working Group WG <public-rdf-wg@w3.org>
On 05/18/2011 10:37 AM, Steve Harris wrote:
> On 2011-05-17, at 21:01, Pierre-Antoine Champin wrote:
>
>> sorry, some second thoughts
>>
>> On 05/17/2011 09:03 PM, Pierre-Antoine Champin wrote:
>>> On 05/17/2011 11:06 AM, Steve Harris wrote:
>> <snip/>
>>>> So, I'm guessing as a formulation that rdflang:en would be a subtype
>>>> of xsd:string,
>>
>> as far as I understand, currently "chat"^^xsd:string ≠ "chat"@en and
>> "chat" ≠ "chat"@en, and more generally no xsd:string or simple literal
>> is equal to a plain literal with language tag. So the respective
>> datatypes should have disjoint value spaces, hence no subtype relation.
>>
>>>> and rdflang:en-GB would be a subtype of rdflang:en, and
>>>> so on?
>>
>> I'm not even sure "en-GB" is a valid language tag, reading [1]:
>
> It's a region subtag, see http://www.w3.org/International/articles/language-tags/
> So, yes it is a valid language tag.

I meant "valid language tag *in RDF*" of course. But I guess the URL you
refer to can apply to RDF as well (as language tags in RDF are obviously
inherited from xml:lang).

>> Note: When using the language tag, care must be taken not to confuse
>> language with locale. The language tag relates only to human language
>> text. Presentational issues should be addressed in end-user
>> applications.
>>
>> [1]
>> http://www.w3.org/TR/2004/REC-rdf-concepts-20040210/#section-Graph-Literal
>>
>> but if it is, literals with @en-GB are disjoint from literals with @en
>> and so the respective datatypes should be disjoint as well.
>
> It's not quite that simple. @en matches @en-GB, but they're not equal c.f.
> http://www.w3.org/International/articles/language-tags/#matching
> and http://www.w3.org/TR/rdf-sparql-query/#func-langMatches

Do they *match* in the sense of the model theory? In other words, does

  :a :b "chat"@en-GB .

entail

  :a :b "chat"@en .

in any entailment regime defined by the RDF semantics?

I don't think so, which does not mean that it is not an interesting
thing to consider, although it looks like a tricky can of worms...

In any case, I don't think that this entailment would mean that
rdflang:en would be a supertype of rdflang:en-GB, as their value spaces
would still be disjoint, in my view.

  pa

>
>>>> A few practical considerations:
>>>>
>>>> 1) ISO language codes are not case sensitive, IRIs are. "foo"@fr =
>>>> "foo"@FR, "foo"^^rdflang:fr != "foo"^^rdflang:FR. We'd need to define a
>>>> canonical case for the datatype form.
>>>
>>> I hadn't thought of that either, but yes, canonical case sounds like the
>>> right thing to do.
>>
>> and according to [1] again, the language tag is normalized to lowercase
>> in the abstract syntax.
>
> OK, that's easy.
>
>> <snip />
>>>> 4) Is the value space all UTF-8 strings? If not, is it a type error
>>>> to write "מחשב"^^rdflang:en?
>>>
>>> well, currently I guess any UTF-8 string is valid. So yes, the value
>>> space of all those datatypes would be all UTF-8 strings, if only
>>> for the sake of BC (and because I sure don't want to walk down that path...)
>>
>> sorry, I was reading "lexical space".
>>
>> The value space would be isomorphic to the set of UTF-8 strings, but
>> different for each "language datatype". Defining it as the set of pairs
>> <text, language-tag> as in RDF Semantics seems like a good option.
>
> Sounds reasonable.
>
> - Steve
>
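
PS: to make the distinction between *equal* and *matching* concrete, here is
a minimal sketch in Python. It assumes the value of a language-tagged literal
is a <text, language-tag> pair with the tag normalized to lowercase, and that
"matching" means prefix-based basic filtering in the style of SPARQL's
langMatches. The names LangLiteral and lang_matches are made up for
illustration; they come from no spec or library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LangLiteral:
    """Hypothetical model: a literal value is a <text, language-tag> pair."""
    text: str
    lang: str

    def __post_init__(self):
        # Normalize the tag to lowercase, so "foo"@fr and "foo"@FR coincide.
        object.__setattr__(self, "lang", self.lang.lower())


def lang_matches(tag: str, lang_range: str) -> bool:
    """Prefix-based basic filtering (assumed to mirror SPARQL langMatches)."""
    tag, lang_range = tag.lower(), lang_range.lower()
    if lang_range == "*":
        # The wildcard matches any non-empty tag.
        return tag != ""
    return tag == lang_range or tag.startswith(lang_range + "-")


en_gb = LangLiteral("chat", "en-GB")
en = LangLiteral("chat", "en")

same_value = en_gb == en                  # pair equality: the tags differ
range_match = lang_matches(en_gb.lang, "en")  # "en-gb" falls in range "en"
```

Under these assumptions the two literals are distinct values even though the
tag of the first matches the range "en", which is why I would keep the value
spaces disjoint and treat matching as a separate question.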
Received on Wednesday, 18 May 2011 09:07:35 UTC