Re: sh:uniqueLang with upper-/lowercase language from Håvard Ottestad on 2020-09-09 (public-shacl@w3.org from September 2020)

From: Håvard Ottestad <hmottestad@gmail.com>
Date: Wed, 9 Sep 2020 23:26:09 +0200
To: Public Shacl W3C <public-shacl@w3.org>
Message-Id: <FB093054-4A27-46F5-BF9A-4354D59C4BE5@gmail.com>

Hi,

A follow-up on this: the SHACL spec defines “language tags” as following BCP 47 (section 1.1 of the SHACL spec), and BCP 47 says that case carries no meaning in a language tag.

https://tools.ietf.org/html/bcp47#section-2.1.1
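
To make the consequence for sh:uniqueLang concrete, here is a minimal Python sketch of a case-insensitive duplicate check. It is only an illustration, not how any particular SHACL engine implements it: the function name duplicate_languages and the (lexical form, language tag) pair representation are made up for this example, and it only folds case. It does not attempt the fuller canonicalisation rules from the RFC.

```python
from collections import Counter


def duplicate_languages(literals):
    """Return the lowercased language tags that occur more than once.

    `literals` is an iterable of (lexical_form, language_tag) pairs.
    Pairs whose tag is None or empty (plain literals) are ignored.
    Hypothetical helper for illustration only.
    """
    # BCP 47 tags are ASCII, so str.lower() is enough for case folding.
    counts = Counter(tag.lower() for _, tag in literals if tag)
    return sorted(tag for tag, n in counts.items() if n > 1)


labels = [("label1", "en-gb"), ("label2", "en-GB"), ("etikett", "nb")]
print(duplicate_languages(labels))  # ['en-gb']
```

Under this reading, “label1”@en-gb and “label2”@en-GB count as the same language, so a shape with sh:uniqueLang true would report a violation for them.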

Cheers,
Håvard

> On 9 Sep 2020, at 16:04, Vladimir Alexiev <vladimir.alexiev@ontotext.com> wrote:
> 
> I think it's clear that sh:uniqueLang (just like SPARQL langMatches) should treat lang tags case-insensitively.
> Canonicalization is nice to have for better readability (and GraphDB now stores tags canonicalized) but case-insensitivity is crucial.
> 
> As for literal canonicalization, unfortunately repositories don't do that (which is also the reason = and sameTerm can return different results),
> so I guess sh:equals and sh:in also don't do it.
> 
> On Wed, Sep 9, 2020 at 4:50 PM Håvard Ottestad <hmottestad@gmail.com> wrote:
> Hi,
> 
> I’m wondering how sh:uniqueLang is supposed to work with something like “label1”@en-gb and “label2”@en-GB. 
> 
> The SHACL specification doesn’t specify how two language tags should be compared to decide if they are the same. RFC 4646 defines some rules for canonicalisation, which go much further than just upper- and lowercase.
> 
> https://tools.ietf.org/html/rfc4646#section-4.4
> 
> What are people's thoughts on this? 
> 
> I guess this would also affect something like sh:equals (e.g. “01”^^xsd:int vs. “1”^^xsd:int).
> 
> Cheers,
> Håvard

Received on Wednesday, 9 September 2020 21:26:25 UTC