Re: Clarification about language tag

From: Simon Spero <sesuncedu@gmail.com>
Date: Tue, 20 Dec 2016 17:36:20 -0500
Message-ID: <CADE8KM5SQVrx+djZJ1EJaGZ8024R0iK+utnL6uSmQXDstsYk0w@mail.gmail.com>
To: Mario Valle <mvalle@cscs.ch>
Cc: "semantic-web@w3.org Web" <semantic-web@w3.org>
Interestingly, the WordNet "RDF" file [1] is syntactically valid under RDF
1.0 but syntactically invalid under RDF 1.1 (at least, I believe it is
syntactically invalid, as opposed to being inconsistent).

OWL 2 is defined in terms of PlainLiterals, and permits, but does not
require, implementations to reject well-formed but invalid language tags.

Current versions of the OWLAPI are forgiving; the rdf4j RIO based parsers
can be set to signal an error on invalid tags, but this behavior is
disabled by default.
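
On the original question of whether "Pancake"@eng can be written: here is a
minimal sketch, in Python, of the two checks involved. It assumes the Turtle
1.1 LANGTAG production (minus the leading '@') as the syntactic test. The
function names and the small three-letter to two-letter table are
hypothetical and only illustrative; a real validity check would consult the
IANA Language Subtag Registry rather than a hand-written map.

```python
import re

# Turtle 1.1 LANGTAG production without the leading '@':
#   [a-zA-Z]+ ('-' [a-zA-Z0-9]+)*
TURTLE_LANGTAG = re.compile(r"[a-zA-Z]+(-[a-zA-Z0-9]+)*")

# Illustrative subset only (assumption): ISO 639-2 codes that have a
# shorter ISO 639-1 equivalent. Not a substitute for the IANA registry.
ISO639_3_TO_2 = {"eng": "en", "fra": "fr", "deu": "de", "ita": "it", "spa": "es"}


def matches_turtle_langtag(tag: str) -> bool:
    """Return True if the tag matches the Turtle LANGTAG grammar."""
    return TURTLE_LANGTAG.fullmatch(tag) is not None


def shorten_primary_subtag(tag: str) -> str:
    """Replace a known three-letter primary subtag with its two-letter form.

    Unknown primary subtags and any trailing subtags are left unchanged.
    """
    primary, sep, rest = tag.partition("-")
    short = ISO639_3_TO_2.get(primary.lower(), primary)
    return short + sep + rest
```

Under these assumptions "eng" passes the grammar check, so a Turtle parser
should accept "Pancake"@eng syntactically. Because BCP47 asks for the
shortest ISO 639 code, a converter could still run shorten_primary_subtag
over each tag and emit "en" instead.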

[1] http://wordnet-rdf.princeton.edu/wn31.nt.gz

On Tue, Dec 20, 2016 at 11:16 AM, Mario Valle <mvalle@cscs.ch> wrote:

> Pardon me for the trivial question.
>
> In Turtle syntax the @lang tag syntax refers to BCP47 that states:
>
> language      = 2*3ALPHA            ; shortest ISO 639 code
>
> That is, the language code (I ignore all the variants here) should be 2 or
> 3 characters. Indeed ISO 639
> (http://www.loc.gov/standards/iso639-2/php/code_list.php) lists both 2 and
> 3 chars codes (e.g., English: 'en' and 'eng').
>
> But in all Turtle examples I have found the language code has 2 chars. Is
> it a requirement or is simply a tradition? This means, could I write
> "Pancake"@eng?
>
> The question arises because WordNet contains 3 chars codes, so to
> transform into triples, should/shouldn't I convert it to 2 characters?
>
> Thanks for your patience
>
>                                 mario
>
> --
> Ing. Mario Valle
> Swiss National Supercomputing Centre (CSCS)
> v. Trevano 131, 6900 Lugano, Switzerland
> Tel: +41 (91) 610.82.60
>
>
Received on Tuesday, 20 December 2016 22:36:53 UTC