- From: Richard Cyganiak <richard@cyganiak.de>
- Date: Wed, 11 Jan 2017 19:00:47 +0000
- To: Stian Soiland-Reyes <soiland-reyes@manchester.ac.uk>
- Cc: public-rdf-comments@w3.org
Hi Stian,

An answer cannot be determined with 100% certainty from the text.

What is clear:

- "Hello"@en and "Hello"@EN have the same value
- One MAY normalise "Hello"@EN to "Hello"@en
- In RDF 2004, "Hello"@en and "Hello"@EN were clearly equal

RDF 2004 forced the language tag to be lower-cased in the abstract syntax. Implementations of RDF 2004 often did not do that, but retained the case when storing or transforming RDF, while still treating @en and @EN as equal.

My recollection is that we wanted to change the language of the spec to make this behaviour legal. Unfortunately it seems the language came out less clear than it should be. I do not think that there was any intention to make @en and @EN not equal.

Best,
Richard

> On 11 Jan 2017, at 17:47, Stian Soiland-Reyes <soiland-reyes@manchester.ac.uk> wrote:
>
> This is a comment for RDF 1.1 Concepts
> http://www.w3.org/TR/2014/REC-rdf11-concepts-20140225/
>
>> From https://www.w3.org/TR/2014/REC-rdf11-concepts-20140225/#section-Graph-Literal
>
>> A literal is a language-tagged string if the third element is present.
>> Lexical representations of language tags may be converted to lower case. The
>> value space of language tags is always in lower case.
>
> Followed by:
>
>> Literal term equality: Two literals are term-equal (the same RDF literal)
>> if and only if the two lexical forms, the two datatype IRIs, and the
>> two language tags (if any) compare equal, character by character. Thus,
>> two literals can have the same value without being the same RDF term.
>> For example:
>>
>> "1"^^xs:integer
>> "01"^^xs:integer
>>
>> denote the same value, but are not the same literal RDF terms and are not
>> term-equal because their lexical form differs.
>
>
> Could you help me clarify how language tags should be compared for determining
> literal term equality?
> This came up in the Commons RDF discussion in
> https://issues.apache.org/jira/browse/COMMONSRDF-51
>
>
> There are two interpretations as far as I can see:
>
> a) (Unicode) Character by character
> "Hello"@en-us != "Hello"@EN-US != "Hello"@en-US
>
> b) (Lower case Unicode) Character by character
> "Hello"@en == "Hello"@EN == "Hello"@en-US
>
>
> The general interpretation seems to be that because the lexical representations
> MAY be converted to lower case, plus the value space is lower case, language
> tags should be compared in lower case as in b).
>
> However the text does say literally "character by character" as in a)
>
> So I would suggest - if you agree on b) - an amendment like:
>
> Literal term equality: Two literals are term-equal (the same RDF literal)
> if and only if the two lexical forms, the two datatype IRIs, and the
> two language tags (if any) compare equal, character by character
> (but language tags must be compared in lower case).
> Thus, two literals ...
>
>
>
> --
> Stian Soiland-Reyes
> http://orcid.org/0000-0001-9842-9718
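
The two readings of "character by character" discussed in this thread can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the Literal class and the functions term_equal_strict and term_equal_lowercase are hypothetical names, not part of Commons RDF or of any RDF library, and language tags are assumed to be well-formed ASCII tags so that str.lower() is a safe case fold.

```python
from dataclasses import dataclass
from typing import Optional

# Datatype IRIs used by RDF 1.1 for plain and language-tagged strings.
XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"
RDF_LANG_STRING = "http://www.w3.org/1999/02/22-rdf-syntax-ns#langString"


@dataclass(frozen=True)
class Literal:
    """Hypothetical stand-in for an RDF literal (not a real library class)."""
    lexical_form: str
    datatype: str = XSD_STRING
    language: Optional[str] = None  # kept exactly as written, case included


def term_equal_strict(a: Literal, b: Literal) -> bool:
    """Reading (a): the language tag is compared exactly as written."""
    return (
        a.lexical_form == b.lexical_form
        and a.datatype == b.datatype
        and a.language == b.language
    )


def _lower_tag(tag: Optional[str]) -> Optional[str]:
    # Assumes well-formed (ASCII) language tags, so lower() is sufficient.
    return None if tag is None else tag.lower()


def term_equal_lowercase(a: Literal, b: Literal) -> bool:
    """Reading (b): the language tag is lower-cased before comparison.

    Lexical form and datatype IRI are still compared character by character.
    """
    return (
        a.lexical_form == b.lexical_form
        and a.datatype == b.datatype
        and _lower_tag(a.language) == _lower_tag(b.language)
    )


if __name__ == "__main__":
    hello_en = Literal("Hello", RDF_LANG_STRING, "en")
    hello_EN = Literal("Hello", RDF_LANG_STRING, "EN")
    print(term_equal_strict(hello_en, hello_EN))     # reading (a)
    print(term_equal_lowercase(hello_en, hello_EN))  # reading (b)
```

Under reading (b) the original case of the tag is still stored on the literal, which matches the behaviour described above of implementations that retain the case while treating @en and @EN as equal.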
Received on Wednesday, 11 January 2017 19:01:18 UTC