
Artificial Bureaucracy - Language Codes

From: Gannon Dick <gannon_dick@yahoo.com>
Date: Mon, 27 Feb 2012 15:02:47 -0800 (PST)
Message-ID: <1330383767.34800.YahooMailNeo@web112609.mail.gq1.yahoo.com>
To: "eGov IG (Public)" <public-egov-ig@w3.org>
Cc: Carsten Keßler <carsten.kessler@uni-muenster.de>, Phil Archer <phila@w3.org>, Stijn Goedertier <stijn.goedertier@pwc.be>, Chad Hendrix <hendrix@un.org>

"Artificial Bureaucracy" is like Artificial Intelligence (AI) for Civil Servants. Codes and standards are a very important tool for a bureaucrat - a language only they speak. The codes used in the standards function as encryption to keep out "the enemy", both foreign and domestic, and BTW, that includes citizens. Since every e-Gov uses only a small subset of the Country (ISO 3166), Language (ISO 639) and Currency (ISO 4217) codes - one size does fit all - it is practical to make up a Repository Profile from a single DCMI subject list, without assuming that everybody speaks English just because that is the default language of the national website. But I'm getting a bit ahead ... The intent of "Artificial Bureaucracy" is to do away with the Codes in favor of Names (from an IT perspective, Name Tokens which are themselves language neutral).

The ISO 639 Language Codes present a special concern. Two different sets of Name Tokens are needed, and to mix them up is to invite false inferences about the metadata:
1. Naming the Website Display Page Language or the language of the text in a data set; and
2. Specifying a Property of a Person or Organization - a population or person speaks, reads, writes, etc. a particular language.

Neither the Core Vocabularies Working Group [1] nor the Humanitarian eXchange Language (HXL) [2] addresses this, AFAICT. This means to me that LOD'ers need it spelled out. The Specification makes the distinction, but not right out loud. The three letter codes have an extra {bibliographic|terminology} attribute. The two letter codes have no such attribute. So, for eGov work, the two letter codes refer to a display and the three letter bibliographic codes refer to a Person, with the three letter 'terminology' codes acting as alternates to the two letter codes.
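
To spell the convention out, here is a rough Python sketch of it. The class, table and function names (LanguageCodes, LANGUAGES, classify) are made up for illustration and come from no vocabulary or specification. The table is a tiny hand-picked subset of ISO 639 entries whose bibliographic and terminology codes differ. Where the two three letter codes coincide (e.g. "eng"), the role cannot be read off the code alone, so the sketch leaves those out.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class LanguageCodes:
    """One language with its ISO 639 codes (hypothetical helper type)."""
    name: str            # illustrative Name Token, not an official label
    alpha2: str          # ISO 639-1 two letter code
    bibliographic: str   # ISO 639-2/B three letter code
    terminology: str     # ISO 639-2/T three letter code


# Small illustrative subset; only entries where B and T codes differ.
LANGUAGES = [
    LanguageCodes("German", "de", "ger", "deu"),
    LanguageCodes("French", "fr", "fre", "fra"),
    LanguageCodes("Dutch", "nl", "dut", "nld"),
    LanguageCodes("Greek", "el", "gre", "ell"),
]


def classify(code: str) -> Tuple[str, str]:
    """Return (role, name) for a code under the convention in this message.

    Two letter and terminology codes -> "display" (language of a page or data set).
    Bibliographic codes              -> "person"  (language a person or population uses).
    """
    code = code.strip().lower()
    for lang in LANGUAGES:
        if code in (lang.alpha2, lang.terminology):
            return ("display", lang.name)
        if code == lang.bibliographic:
            return ("person", lang.name)
    raise KeyError(f"code not in the illustrative table: {code!r}")


if __name__ == "__main__":
    for sample in ("de", "deu", "ger"):
        print(sample, classify(sample))
```

With a lookup like this, "de" and "deu" both resolve to the display role for German, while "ger" resolves to the person role, so the two sets of Name Tokens stay separate even though they name the same language.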

It's fine to be friendly (and in the meantime promote tourism) on websites, but there are other circumstances, humanitarian causes for example, where more accuracy is necessary. Any thoughts (while I go home and eat dinner, and leave details until tomorrow)?


[1] http://joinup.ec.europa.eu/asset/core_business/document/core-vocabularies-working-group-members
[2] http://carsten.io/hxl/ns-2012-02-22/index.html
Received on Monday, 27 February 2012 23:03:16 UTC
