Re: sh:uniqueLang with upper-/lowercase language from Vladimir Alexiev on 2020-09-09 (public-shacl@w3.org from September 2020)

From: Vladimir Alexiev <vladimir.alexiev@ontotext.com>
Date: Wed, 9 Sep 2020 17:04:10 +0300
To: Håvard Ottestad <hmottestad@gmail.com>
Cc: Public Shacl W3C <public-shacl@w3.org>
Message-ID: <CAMv+wg6fsH5RASnyhg5r18gHCFb5x9s6cF6Yy8ddYmSTu4bhFw@mail.gmail.com>

I think it's clear that sh:uniqueLang (just like SPARQL langMatches) should
treat lang tags case-insensitively.
Canonicalization is nice to have for better readability (and GraphDB now
stores tags canonicalized) but case-insensitivity is crucial.
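
To make the case-insensitive comparison concrete, here is a minimal Python
sketch. It is not taken from any SHACL engine: the function name
duplicate_langs and the (lexical form, language tag) pair representation are
assumptions for illustration only. It folds case and nothing more, so it does
not attempt the fuller RFC 4646 canonicalization.

```python
from collections import Counter

def duplicate_langs(literals):
    """Return the language tags used more than once, compared case-insensitively.

    `literals` is an iterable of (lexical_form, language_tag_or_None) pairs.
    This is a hypothetical representation, not the API of a real library.
    """
    # Language tags are ASCII, so lower() is enough for case folding.
    counts = Counter(lang.lower() for _, lang in literals if lang)
    return sorted(tag for tag, n in counts.items() if n > 1)

labels = [("label1", "en-gb"), ("label2", "en-GB"), ("etikett", "nb")]
print(duplicate_langs(labels))  # ['en-gb']
```

With a case-sensitive comparison the two English labels would count as
different languages and the duplicate would go unnoticed.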

As for literal canonicalization, unfortunately repositories don't do that
(which is also the reason = and sameTerm can return different results),
so I guess sh:equals and sh:in don't do it either.
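
The difference between term identity and value equality can be sketched in
plain Python as follows. The Literal class and the same_term and value_equal
helpers are hypothetical stand-ins for the SPARQL sameTerm and = operators,
and only xsd:int is handled; this says nothing definitive about what a given
repository or SHACL engine actually does.

```python
from typing import NamedTuple

XSD_INT = "http://www.w3.org/2001/XMLSchema#int"

class Literal(NamedTuple):
    # Hypothetical minimal literal: lexical form plus datatype IRI.
    lexical: str
    datatype: str

def same_term(a, b):
    # Term identity: lexical form and datatype must match exactly.
    return a == b

def value_equal(a, b):
    # Value equality: compare the parsed values when both are xsd:int.
    if a.datatype == b.datatype == XSD_INT:
        return int(a.lexical) == int(b.lexical)
    return same_term(a, b)

a = Literal("01", XSD_INT)
b = Literal("1", XSD_INT)
print(same_term(a, b))    # False
print(value_equal(a, b))  # True
```

A store that canonicalized "01"^^xsd:int to "1"^^xsd:int on load would make
the two checks agree for this pair.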

On Wed, Sep 9, 2020 at 4:50 PM Håvard Ottestad <hmottestad@gmail.com> wrote:

> Hi,
>
> I’m wondering how sh:uniqueLang is supposed to work with something like
> “label1”@en-gb and “label2”@en-GB.
>
> The SHACL specification doesn’t specify how two language tags should be
> compared to decide if they are the same. RFC 4646 defines some rules for
> canonicalisation, which goes much further than just upper- and lowercase.
>
> https://tools.ietf.org/html/rfc4646#section-4.4
>
> What are people's thoughts on this?
>
> I guess this would also affect something like sh:equals (e.g. “01”^^xsd:int
> vs. “1”^^xsd:int).
>
> Cheers,
> Håvard
>

Received on Wednesday, 9 September 2020 14:05:07 UTC