Is Privacy Dead? A helpful hint.

To answer my own question: No, but in the fashion of Governments everywhere, it is buried deeply in reports [1,2].  Statistical Reports have a spatial and temporal coverage independent of population.  Population of what?  Well, whatever: people (of course), fish, bananas, rocks, etc.  Whatever the group, the population number in scientific notation is always the same:
max(id) x 10^(-log10(max(id))) [.= 1 for every integer >= 1]
Sure, it's a trick; Chemists use it all the time when calculating pH (acid-base balance, "concentration" = 1/(atomic population)).  Government Reports use the same trick, and the tree framework that supports the population bins - in the US: 
//Country/State/County/Population - can be pre-calculated (see XML below).
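
A minimal sketch of the identity above, in Python. The function name `normalized_population` is made up for illustration; the only thing taken from the text is the expression max(id) x 10^(-log10(max(id))).

```python
import math

def normalized_population(max_id: int) -> float:
    """Evaluate max_id * 10**(-log10(max_id)) for an integer max_id >= 1."""
    if max_id < 1:
        raise ValueError("max_id must be an integer >= 1")
    return max_id * 10 ** (-math.log10(max_id))

# Very different bin sizes all collapse to (floating-point) 1.
for n in (1, 367, 1000, 222139):
    print(n, round(normalized_population(n), 9))
```

Whatever the largest id in a bin is, the result is 1 to within floating-point rounding, which is why the value can be written into the tree ahead of time.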

Scaling this up to Planet size requires some other assumptions[3]; however, an immediate useful result is that numbering subdivisions with digits (0-9) is more trouble than it's worth.  Any three-letter (A-Z) code is just as good and promotes accuracy.  In the example below, the civil entity "Parker County, Texas" is considerably easier to remember as "PAR" than as the Census numbering "367".  It does take a bit of work to weed out the near-homonyms, e.g. "Parmer County, Texas" (369), but it is a one-time chore.
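
The weeding-out chore can be sketched in Python as follows. This assumes the code is simply the first three letters of the name, which is a guess (the rule that produced "PAR" is not spelled out here), and the helper names `naive_code` and `find_collisions` are hypothetical. The sketch only flags the clashes; choosing the replacement codes is still a human decision.

```python
from collections import defaultdict

def naive_code(name: str) -> str:
    """First three A-Z letters of the name, upper-cased (an assumed rule)."""
    letters = [c for c in name.upper() if "A" <= c <= "Z"]
    return "".join(letters[:3])

def find_collisions(names):
    """Group names by naive code; return only codes claimed by 2+ names."""
    groups = defaultdict(list)
    for name in names:
        groups[naive_code(name)].append(name)
    return {code: members for code, members in groups.items() if len(members) > 1}

counties = ["Parker", "Parmer", "Travis"]
print(find_collisions(counties))  # {'PAR': ['Parker', 'Parmer']}
```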

I am doing this for (at least) four reasons:
- To give clerks a break from people who only speak hexadecimal.
- To generalize Trade reporting.  This applies to both Trade in goods and Financial results (reported in currency).
- To demonstrate how Governments protect personal privacy, even if they don't want to, or don't mean to.
- To establish a firm scientific rationale for ignoring the present Election Year foolishness[4].

--Gannon

(moral of the story: citizen anonymity is independent of location)
<dct:coverage xmlns:dct="http://purl.org/dc/terms/"
              xmlns:dcam="http://purl.org/dc/dcam/"
              xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
              dct:issued="2010-10-06T12:00:00-05:00"
              dct:temporal="P3M"
              maxid="1000"
              population="eval(max(id) x 10^(-log10(max(id))))">
<admin0 rdfs:label="United States (Country)" dcam:memberOf="0US" dct:alternative="222139P" >
  <admin1 rdfs:label="Texas (State)" dcam:memberOf="0TX" dct:alternative="170524P">
    <admin2 rdfs:label="Parker (County)" dcam:memberOf="PAR" dct:alternative="085327P">
      <citizen rdfs:label="Minnie from PAR" dct:identifier="1"  />
      <citizen rdfs:label="Max from PAR" dct:identifier="1000" />
    </admin2>
  </admin1>
</admin0>
</dct:coverage>
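
For anyone who wants to poke at the tree, here is a rough Python sketch that walks a trimmed copy of the XML above and reports, per county bin, the largest citizen identifier and the normalized population. The function name `bin_summary` and the trimmed `SAMPLE` string are made up for illustration; only the element names, attribute names and namespace URIs come from the example.

```python
import math
import xml.etree.ElementTree as ET

DCT = "http://purl.org/dc/terms/"
DCAM = "http://purl.org/dc/dcam/"

# Trimmed copy of the example (labels and dates dropped for brevity).
SAMPLE = """<dct:coverage xmlns:dct="http://purl.org/dc/terms/"
              xmlns:dcam="http://purl.org/dc/dcam/"
              maxid="1000">
  <admin0 dcam:memberOf="0US">
    <admin1 dcam:memberOf="0TX">
      <admin2 dcam:memberOf="PAR">
        <citizen dct:identifier="1" />
        <citizen dct:identifier="1000" />
      </admin2>
    </admin1>
  </admin0>
</dct:coverage>"""

def bin_summary(xml_text):
    """Map each admin2 code to (max id, max id * 10**(-log10(max id)))."""
    root = ET.fromstring(xml_text)
    summary = {}
    for county in root.iter("admin2"):
        ids = [int(c.get(f"{{{DCT}}}identifier")) for c in county.iter("citizen")]
        if not ids:
            continue  # an empty bin has no max(id) to normalize
        max_id = max(ids)
        code = county.get(f"{{{DCAM}}}memberOf")
        summary[code] = (max_id, max_id * 10 ** (-math.log10(max_id)))
    return summary

print(bin_summary(SAMPLE))
```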

[1] http://www.fcsm.gov/working-papers/spwp22.html
[2] http://factfinder.census.gov/jsp/saff/SAFFInfo.jsp?_pageId=su5_confidentiality
[3] http://www.rustprivacy.org/sun/spookville/dct_coverage.xml
[4] My muse is the Kingston Trio, specifically "MTA".  It is on YouTube.  Readers are encouraged to listen to the lyrics: Issuing a statistical report is like Charlie's Wife (A Government) throwing Lunch through the first open train window she sees.  Linked Data Consumers are Charlie.


      

Received on Wednesday, 6 October 2010 19:11:26 UTC