Re: BPMLOD and string metadata

> On Feb 2, 2023, at 2:36 AM, felix@sasakiatcf.com wrote:
> 
> Dear Christian and all,
> 
> I agree that currently there is a disconnect between the stakeholders. One technical step to take would be to provide BCP 47 identifiers as URIs, ideally even as RDF based URIs, so that others can attach to the URIs the missing metadata and re-use them in other contexts.
> 
> I tried to argue for that in the i18n WG, but we have not made progress so far, mainly because responsibilities are unclear: who should host such URIs, the IETF, the W3C, or the Unicode Consortium? Or should we just write a description of how to construct the URIs? Maybe this thread helps to revive the discussion.

There’s an open issue [1] on planned updates to RDF Concepts from the RDF-star working group. This considers a couple of ways to handle text direction in RDF including the Compound Literal [2] and i18n namespace [3] experimental features from JSON-LD 1.1, which were constrained by compatibility with RDF 1.1. RDF 1.2 is focused on making annotations on RDF statements, and there’s a proposal that could leverage this, in addition to better formalizing the other mechanisms. I don’t expect the RDF-star group to have too much bandwidth to focus on this now, but we’ll need to do something for RDF Concepts and related recommendations (about 21 in all).
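
As a rough illustration of the two JSON-LD 1.1 mechanisms mentioned above, here is a minimal Python sketch of how a string with language and base direction might be represented under each. The helper names (i18n_datatype, compound_literal) are hypothetical and not part of any library; the dict returned for the compound literal is only a stand-in for a blank node, and the sketch leaves out the language-tag validation a real processor would do.

```python
from typing import Dict, Optional

I18N_NS = "https://www.w3.org/ns/i18n#"
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def i18n_datatype(language: Optional[str], direction: str) -> str:
    """Build an i18n-namespace datatype IRI (hypothetical helper).

    The language tag (lower-cased, or empty if absent) and the base
    direction are joined with an underscore and appended to the namespace.
    """
    if direction not in ("ltr", "rtl"):
        raise ValueError("direction must be 'ltr' or 'rtl'")
    lang = (language or "").lower()
    return f"{I18N_NS}{lang}_{direction}"


def compound_literal(value: str, language: Optional[str],
                     direction: str) -> Dict[str, str]:
    """Describe a compound literal as a property map (hypothetical helper).

    In RDF this would be a blank node carrying rdf:value, rdf:direction
    and, when a language is known, rdf:language.
    """
    if direction not in ("ltr", "rtl"):
        raise ValueError("direction must be 'ltr' or 'rtl'")
    node = {RDF_NS + "value": value, RDF_NS + "direction": direction}
    if language:
        node[RDF_NS + "language"] = language
    return node


if __name__ == "__main__":
    print(i18n_datatype("ar-EG", "rtl"))
    print(compound_literal("HTML و CSS", "ar", "rtl"))
```

The first form keeps the string a single literal but overloads the datatype IRI; the second keeps language and direction as separate properties at the cost of an extra node per string.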

Gregg

[1] https://github.com/w3c/rdf-concepts/issues/9
[2] https://www.w3.org/TR/json-ld11/#the-rdf-compoundliteral-class-and-the-rdf-language-and-rdf-direction-properties
[3] https://www.w3.org/TR/json-ld11/#the-i18n-namespace


> Best,
> 
> Felix
> 
> On 2023-02-02 10:58, Christian Chiarcos wrote:
>> Dear Richard, dear all,
>> just skimming through your documents, I was wondering what the
>> recommended [3] metadata approach looks like in practice. Would the
>> general recommendation be to use language indexing [4], then? I see
>> some issues with that because the same concept can have multiple
>> lexicalizations in the same language (say, "Severe acute respiratory
>> syndrome coronavirus 2"@en alongside "SARS‑CoV‑2"@en, "Wuhan
>> Corona virus"@en, etc.), but the use of a dict here implies you get
>> at most one string per language.
>> Also, are there any constraints or recommendations about the metadata
>> vocabulary (apologies if I overlooked them)? From the linguistic side,
>> BCP47 has been criticized a lot because people would like to add more
>> metadata than ISO 639 or BCP47 support (Gillis-Webber & Tittel 2019,
>> 2020). BCP47 covers ISO 639-1 and ISO 639-2 only, but not ISO 639-3
>> (which is needed for "smaller" languages); ISO 639-3 is insufficient
>> by itself (so that people introduce alternative classifications, e.g.,
>> Nordhoff et al. 2011); and most people seem to actually prefer to
>> identify languages by URIs in order to provide explicit metadata (De
>> Melo 2015, Nordhoff et al. 2011).
>> So far, it seems this discussion in the LLOD community is largely
>> detached from the discussion in the W3C Internationalization Working
>> Group, but these things should definitely be connected to get the
>> perspectives of spec developers, providers and consumers of
>> linguistic/language data covered. Thank you for taking the initiative!
>> Best,
>> Christian
>> Refs:
>> Gillis-Webber, F., & Tittel, S. (2019). The shortcomings of language
>> tags for linked data when modeling lesser-known languages. In _2nd
>> Conference on Language, Data and Knowledge (LDK 2019)_. Schloss
>> Dagstuhl-Leibniz-Zentrum fuer Informatik.
>> Gillis-Webber, F., & Tittel, S. (2020, May). A framework for shared
>> agreement of language tags beyond ISO 639. In _Proceedings of the
>> Twelfth Language Resources and Evaluation Conference_ (pp. 3333-3339).
>> De Melo, G. (2015). Lexvo.org: Language-related information for the
>> linguistic linked data cloud. _Semantic Web_, _6_(4), 393-400.
>> Nordhoff, S., & Hammarström, H. (2011). Glottolog/Langdoc: Defining
>> dialects, languages, and language families as collections of
>> resources. In _First International Workshop on Linked Science 2011-In
>> conjunction with the International Semantic Web Conference (ISWC
>> 2011)_.
>> On Thu, 2 Feb 2023 at 09:57, Jorge Gracia del Río
>> <jogracia@unizar.es> wrote:
>>> Dear Richard,
>>> Thanks for this update! We will certainly take a closer look at the
>>> report.
>>> Best,
>>> Jorge
>>> On Wed, 1 Feb 2023 at 18:14, r12a (<ishida@w3.org>) wrote:
>>>> dear BPMLOD folks,
>>>> Best wishes for your relaunch!
>>>> Since the last round of work on BPMLOD the W3C
>>>> Internationalization Working Group has spent a lot of time talking
>>>> with spec developers about how to attach metadata to strings to
>>>> indicate the language and the directionality of the string.  For
>>>> example, JSON LD adopted some new approaches to allow the
>>>> management of this information.[1]  I wonder whether this is
>>>> something that would be of interest to the BPMLOD group.
>>>> We produced a document called Strings on the Web: Language and
>>>> Direction Metadata (https://w3c.github.io/string-meta/ [1]) which
>>>> gives an overview of our current thinking.
>>>> best regards,
>>>> Richard
>>>> [1] https://www.w3.org/TR/json-ld11/#string-internationalization
>>>> [2]
>> Links:
>> ------
>> [1] https://w3c.github.io/string-meta/
>> [2] https://www.w3.org/TR/json-ld11/#string-internationalization
>> [3] https://w3c.github.io/string-meta/#language-metadata
>> [4] https://w3c.github.io/string-meta/#localization-considerations
> 

Received on Tuesday, 7 February 2023 00:18:51 UTC