G2G, B2C

This document was mentioned to me on the LOD Conference Call That Never Was on Friday [1].  In reference to "2. Definitions" and "Seamless Integration of Data", one of the challenges for Government Data Sets is that multi-tier location hierarchies can leave the user swamped in irreconcilable tree fragments.

A SKOS approach, turning the search keys into URLs, works well.  This "Big Picture" is all Public Domain but, AFAIK, the UN, ISO, WIPO, EU, W3C etc. have not gotten 'round to constructing the Triple Store.  The US CIA does a very nice job with this in the World Factbook [2].  Obviously they are not in the web standards business, but they have a strong motivation for accuracy.  DBpedia encodes all countries and subdivisions at the same level, which makes it unsuitable for code translation as it stands.

This interpretation of the DBpedia nodes does allow a search, with an automatic code translation [3]: 

Top Level Domains are encoded like this:
skos:TopConcept            // Top Level Domain (TLD)
dbp:Government             // GovernmentOf [Entity]
dcam:memberOf              // Top Level Code/Type URL
dct:title(en)              // title
dct:alternative(en)        // ISO Official Name (English)
dct:alternative(fr)        // ISO Official Name (French)
dbp:Seat_of_Government     // Capital
geo:lat                    // Latitude (Capital)
geo:lon                    // Longitude (Capital)
dbp:UTC                    // Time Zone (Capital)
dct:created                // date (entered)
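
To make the listing concrete, here is a minimal sketch of one Top Level Domain record held as a plain Python dict and flattened into triples.  The example.org URL, the entry date, and the use of rdf:type to carry skos:TopConcept are my assumptions for illustration; the values are not taken from the service in [3].

```python
# Illustrative record only; the URL and the date are hypothetical.
top_level_domain = {
    "rdf:type": "skos:TopConcept",                   # assumption: class carried via rdf:type
    "dbp:Government": "Government of France",
    "dcam:memberOf": "http://example.org/tld/FR",    # hypothetical code/type URL
    "dct:title": {"en": "France"},
    "dct:alternative": {
        "en": "French Republic",
        "fr": "République française",
    },
    "dbp:Seat_of_Government": "Paris",
    "geo:lat": 48.8566,                              # capital, approximate
    "geo:lon": 2.3522,
    "dbp:UTC": "+01:00",
    "dct:created": "2010-09-26",                     # hypothetical entry date
}


def to_triples(subject, record):
    """Flatten one record into (subject, predicate, object, language) tuples."""
    triples = []
    for predicate, value in record.items():
        if isinstance(value, dict):
            # Language-tagged values: one triple per language.
            for lang, text in value.items():
                triples.append((subject, predicate, text, lang))
        else:
            triples.append((subject, predicate, value, None))
    return triples


triples = to_triples("http://example.org/tld/FR", top_level_domain)
```

A real Triple Store would of course use proper RDF tooling; the dict is only meant to show which field goes where.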

Subdivisions are encoded exactly the same way ... except:

1. Subdivisions are skos:Concept
2. Below the TLD level the top level Government Entity controls "Official Names" including title.  ISO Official names may or may not exist.
3. The "Capital" can be any major city, or any central point that has a name.
4. The dcam:memberOf should be a CURIE. This allows entities to be demoted from skos:TopConcept to skos:Concept although they may have been assigned ISO 3166-1 codes.  Jersey and Guernsey will accept the demotion with grace, one hopes.
5. Commercial Entities are required to be subdivisions of a TLD.  Unlike the ISO 3166-1 system, there is an explicit extra-territorial (High Seas) code.  Organizations and Communities have a choice, they can be bound to a Government or be bound to the High Seas.  Commercial Entities also have this "choice", but it is doubtful that an explicit acknowledgement of pirate status would be too good for business.
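
The demotion in point 4 can be sketched roughly as follows.  The "tld:" prefix, the choice of parent, the example.org URL and the loose CURIE pattern are all assumptions made for illustration; the pattern is a simplification, not the full W3C CURIE syntax.

```python
import re

# Loose CURIE check (prefix:reference). Rejects full URLs such as "http://...".
CURIE_PATTERN = re.compile(r"^[A-Za-z_][A-Za-z0-9_.-]*:(?!//)\S+$")


def is_curie(value):
    """Return True if value looks like prefix:reference rather than a URL."""
    return bool(CURIE_PATTERN.match(value))


def demote(record, parent_curie):
    """Return a copy of a top-level record re-cast as a subdivision."""
    if not is_curie(parent_curie):
        raise ValueError(f"dcam:memberOf must be a CURIE, got {parent_curie!r}")
    demoted = dict(record)                    # shallow copy; original is untouched
    demoted["rdf:type"] = "skos:Concept"
    demoted["dcam:memberOf"] = parent_curie
    return demoted


jersey = {
    "rdf:type": "skos:TopConcept",
    "dct:title": {"en": "Jersey"},
    "dcam:memberOf": "http://example.org/tld/JE",   # hypothetical URL
}

# The "tld:GB" parent is chosen only for illustration.
jersey_as_subdivision = demote(jersey, "tld:GB")
```

The ISO 3166-1 code itself is left alone; only the concept level and the dcam:memberOf pointer change.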

Set up like this, a join (search) on dcam:memberOf gives a list of related Top Concepts.  A standard method of locating Data Sets would have a number of benefits, but the two big ones are:

1.  Existing coding systems can continue.  The "retrofit" of Civil Subdivisions is no more trouble than using the place names to begin with.
2.  The Privacy Issue.  The use of codes has a "hidden" privacy benefit.  When citizens need Government services, certainly there is a business need for an exact location or residence address.  However, when Government needs Citizens' "services", e.g. a vote, a citizen need only be located by voting precinct - a much bigger area.  The point is that consumers as a class can tolerate a very high uncertainty in the location of the provider class.  The reverse is not true - the provider class requires low uncertainty.  Public Sector Information (PSI) changes sign, so to speak, so that a "provider" of statistics (Government) has the motivation of a "consumer" of raw data.  The Private Sector equivalent is the publishing of financial data, and this is done with the high location uncertainty characteristic of a consumer role.  When reporting statistics, regional granularity may be uncomfortably broad, but nonetheless it exists.
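
Going back to the dcam:memberOf join mentioned before the two benefits, here is a rough sketch of how it might behave, and of how walking the same links upward trades an exact code for a coarser one.  All codes below (including the precinct) are hypothetical, and leaving dcam:memberOf empty for Top Concepts is a simplification on my part.

```python
# Hypothetical records: (code, concept level, dcam:memberOf).
records = [
    ("tld:US", "skos:TopConcept", None),
    ("tld:FR", "skos:TopConcept", None),
    ("us:CA", "skos:Concept", "tld:US"),
    ("us:CA-precinct-12", "skos:Concept", "us:CA"),
    ("fr:75", "skos:Concept", "tld:FR"),
]

parent_of = {code: parent for code, _level, parent in records}
level_of = {code: level for code, level, _parent in records}


def members_of(parent):
    """The join: every code whose dcam:memberOf points at `parent`."""
    return sorted(code for code, p in parent_of.items() if p == parent)


def top_concept(code):
    """Follow dcam:memberOf upward until a skos:TopConcept is reached."""
    seen = set()
    while level_of[code] != "skos:TopConcept":
        if code in seen or parent_of[code] is None:
            raise ValueError(f"no Top Concept reachable from {code!r}")
        seen.add(code)
        code = parent_of[code]
    return code


california_members = members_of("us:CA")
precinct_top = top_concept("us:CA-precinct-12")
```

Stopping the walk one or two levels up is how a report can be published at precinct or region level without the address ever entering the data set.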

--Gannon

[1] http://www.w3.org/TR/2009/NOTE-egov-improving-20090512/
[2] https://www.cia.gov/library/publications/the-world-factbook/
[3] http://www.rustprivacy.org/sun/spookville/

Received on Sunday, 26 September 2010 18:30:17 UTC