- From: Steve Harris <steve.harris@garlik.com>
- Date: Wed, 18 May 2011 09:37:47 +0100
- To: Pierre-Antoine Champin <pierre-antoine@champin.net>
- Cc: RDF Working Group WG <public-rdf-wg@w3.org>
On 2011-05-17, at 21:01, Pierre-Antoine Champin wrote:

> sorry, some second thoughts
>
> On 05/17/2011 09:03 PM, Pierre-Antoine Champin wrote:
>> On 05/17/2011 11:06 AM, Steve Harris wrote:
> <snip/>
>>> So, I'm guessing as a formulation that rdflang:en would be a subtype
>>> of xsd:string,
>
> as far as I understand, currently "chat"^^xsd:string ≠ "chat"@en and
> "chat" ≠ "chat"@en, and more generally no xsd:string or simple literal
> is equal to a plain literal with language tag. So the respective
> datatypes should have disjoint value spaces, hence no subtype relation.
>
>>> and rdflang:en-GB would be a subtype of rdflang:en, and
>>> so on?
>
> I'm not even sure "en-GB" is a valid language tag, reading [1]:

It's a region subtag, see http://www.w3.org/International/articles/language-tags/

So, yes it is a valid language tag.

> Note: When using the language tag, care must be taken not to confuse
> language with locale. The language tag relates only to human language
> text. Presentational issues should be addressed in end-user
> applications.
>
> [1]
> http://www.w3.org/TR/2004/REC-rdf-concepts-20040210/#section-Graph-Literal
>
> but if it is, literals with @en-GB" are disjoint from literals with @en
> and so the respective datatypes should be disjoint as well.

It's not quite that simple. @en matches @en-GB, but they're not equal c.f. http://www.w3.org/International/articles/language-tags/#matching and http://www.w3.org/TR/rdf-sparql-query/#func-langMatches

>>> A few practical considerations:
>>>
>>> 1) ISO language codes are not case sensitive, IRIs are. "foo"@fr =
>>> "foo"@FR, "foo"^^rdflang:fr != "foo"^^rdflang:FR. We'd need to define a
>>> canonical case for the datatype form.
>>
>> I hadn't thought of that either, but yes, canonical case sounds like the
>> right thing to do.
>
> and according to [1] again, the language tag is normalized to lowercase
> in the abstract syntax.

OK, that's easy.

> <snip />
>>> 4) Is the value space all UTF-8 strings? If not, is it a type error
>>> to write "מחשב"^^rdflang:en?
>>
>> well, currently I guess any UTF-8 string is valid. So yes, the value
>> space would of all those datatypes would be all UTF-8 strings, if only
>> for the sake of BC (and because I sure don't want to walk down that path...)
>
> sorry, I was reading "lexical space".
>
> The value space would be isomorphic to the set of UTF-8 strings, but
> different for each "language datatype". Defining it as the set of pair
> <text, language-tag> as in RDF Semantics seems like a good option.

Sounds reasonable.

- Steve

-- 
Steve Harris, CTO, Garlik Limited
1-3 Halford Road, Richmond, TW10 6AW, UK
+44 20 8439 8203  http://www.garlik.com/
Registered in England and Wales 535 7233 VAT # 849 0517 11
Registered office: Thames House, Portsmouth Road, Esher, Surrey, KT10 9AD
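
To make the matching-versus-equality point and the canonical-case point in the message concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the function names (canonical_tag, lang_matches, literal_value) are hypothetical and come from no spec or library, the matching rule is a simplified reading of the basic filtering that SPARQL's langMatches refers to, and the pair representation follows the <text, language-tag> suggestion quoted in the message.

```python
from typing import Tuple


def canonical_tag(tag: str) -> str:
    """Lowercase a language tag (hypothetical helper).

    Assumption: lowercase is the canonical case, as the 2004 RDF
    abstract syntax normalizes language tags to lowercase.
    """
    return tag.lower()


def lang_matches(tag: str, lang_range: str) -> bool:
    """Simplified basic filtering of a tag against a language range.

    A range matches a tag if they are equal ignoring case, or if the
    range is a prefix of the tag followed by "-". The range "*" matches
    any non-empty tag.
    """
    tag = canonical_tag(tag)
    lang_range = canonical_tag(lang_range)
    if lang_range == "*":
        return tag != ""
    return tag == lang_range or tag.startswith(lang_range + "-")


def literal_value(text: str, tag: str) -> Tuple[str, str]:
    """Model a language-tagged literal's value as a (text, tag) pair."""
    return (text, canonical_tag(tag))


if __name__ == "__main__":
    # Matching is directional: the range "en" covers the tag "en-GB".
    print(lang_matches("en-GB", "en"))
    print(lang_matches("en", "en-GB"))
    # Equality of values is stricter than matching.
    print(literal_value("chat", "en") == literal_value("chat", "en-GB"))
    # Case differences in the tag disappear after canonicalization.
    print(literal_value("foo", "fr") == literal_value("foo", "FR"))
```

Under these assumptions the tag en-GB matches the range en while the two literal values stay unequal, and "foo"@fr and "foo"@FR map to the same pair. A datatype-IRI form such as rdflang:fr would need the same canonicalization applied before IRIs are compared.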
Received on Wednesday, 18 May 2011 08:38:16 UTC