Making data collection less rude

The granularity of language encoding can be used to make data collection less rude. Depending on what data is being collected, and how it is being collected, a system of language codes can easily grow beyond what is useful. The primary function of Open Government websites is to attract data tourists.

For example, some US Government agency (I forget the exact name) involved with geo-location also uses the ISO 639 three-character codes (347 of them). This is public information. Open Government websites "write" in the ISO 639 two-character codes (151 of them). A sample of 150+ Government websites showed only about 68 of these two-letter codes actually in use. The list of codes can be reduced to the ones in use with an SQL table; there is very little need to attempt a SPARQL solution.
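
As a rough illustration of the SQL-table approach, here is a minimal sketch in Python using the standard-library sqlite3 module. The table names (language, observed), the column names, and the handful of sample rows are all assumptions made up for this example; they are not taken from the table at the URL below, and the "observed" codes stand in for whatever a real survey of websites would find.

```python
import sqlite3

# Hypothetical sample rows: (two-letter code, three-letter code, name).
# A real table would hold the full ISO 639 list.
LANGUAGES = [
    ("en", "eng", "English"),
    ("es", "spa", "Spanish"),
    ("fr", "fra", "French"),
    ("la", "lat", "Latin"),
    ("eo", "epo", "Esperanto"),
]

# Hypothetical survey result: codes seen on the sampled websites,
# with repeats, as they might come out of a crawl.
SEEN_ON_SITES = ["en", "es", "fr", "en", "es"]


def build_reduced_list(languages, seen):
    """Return (code, name) pairs for the codes that are actually in use."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE language ("
        "alpha2 TEXT PRIMARY KEY, alpha3 TEXT NOT NULL, name TEXT NOT NULL)"
    )
    conn.execute("CREATE TABLE observed (alpha2 TEXT NOT NULL)")
    conn.executemany("INSERT INTO language VALUES (?, ?, ?)", languages)
    conn.executemany("INSERT INTO observed VALUES (?)", [(c,) for c in seen])
    # Keep only the languages whose code appears at least once in the survey.
    rows = conn.execute(
        "SELECT l.alpha2, l.name FROM language AS l "
        "WHERE l.alpha2 IN (SELECT alpha2 FROM observed) "
        "ORDER BY l.name"
    ).fetchall()
    conn.close()
    return rows


reduced = build_reduced_list(LANGUAGES, SEEN_ON_SITES)
print(reduced)
```

The reduced list is what a form would offer the visitor, so nobody has to scroll through codes that no site in the sample actually uses.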

http://www.rustprivacy.org/faca/languages.php

Drop-down lists (for example) and SQL, XML, and CSV versions of the table are available for download.

--Gannon 

Received on Tuesday, 29 October 2013 22:37:18 UTC