RE: Public Data Catalog Priorities and Demand

From: Joe Carmel <joe.carmel@comcast.net>
Date: Fri, 18 Dec 2009 10:10:48 -0500
To: "'Antti Poikola'" <antti.poikola@gmail.com>, "'Jonathan Gray'" <jonathan.gray@okfn.org>
Cc: "'Steven Clift'" <clift@e-democracy.org>, <public-egov-ig@w3.org>, <sunlightlabs@groups.google.com>, "'Acar, Suzanne'" <Suzanne.Acar@ic.fbi.gov>
Message-ID: <002f01ca7ff4$47e1ab40$d7a501c0$@carmel@comcast.net>
I totally agree with you, Antti.  I think data.gov and other government
websites should be looking to use a standards-based data cataloging format
(e.g., extending Atom XML or OPDS) that allows entries to link to data
files or other catalogs.  Much as sites already do with sitemaps and HTML,
governments would publish a file at the root of their websites that
provides a catalog of the data files on their site.  By enabling the
catalog format to point to other catalogs, a root catalog could point to
sub-department-level catalogs, allowing data catalog management
responsibilities to be distributed within an organization.
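
As a rough sketch of the root-catalog idea above (not an existing
standard): the entry fields, the "data"/"catalog" labels, and the
example.org URLs are all assumptions made up for illustration, and the
catalogs are held in memory rather than fetched from a site.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    title: str
    href: str
    kind: str  # hypothetical label: "data" for a file, "catalog" for a sub-catalog


# Hypothetical catalogs keyed by URL, standing in for files published on a site.
CATALOGS = {
    "https://www.example.org/catalog.xml": [
        Entry("Budget 2009", "https://www.example.org/data/budget.csv", "data"),
        Entry("Transport department",
              "https://www.example.org/transport/catalog.xml", "catalog"),
    ],
    "https://www.example.org/transport/catalog.xml": [
        Entry("Bus routes", "https://www.example.org/transport/routes.csv", "data"),
    ],
}


def collect_data_files(catalog_url, catalogs, seen=None):
    """Walk a catalog and its sub-catalogs, returning every data file URL."""
    seen = set() if seen is None else seen
    if catalog_url in seen:  # guard against catalogs that point at each other
        return []
    seen.add(catalog_url)
    files = []
    for entry in catalogs.get(catalog_url, []):
        if entry.kind == "catalog":
            files.extend(collect_data_files(entry.href, catalogs, seen))
        else:
            files.append(entry.href)
    return files
```

Starting from the root catalog, one call reaches the department-level
files too, even though each department maintains its own catalog.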

At present, governments use HTML in a variety of ways for data cataloging.
This looser approach has made it difficult to get one's arms around all of
the data being published at a given site (e.g.,
http://www.atlantis-press.com/php/download_paper.php?id=1763).  IMO, if a
standard data catalog format were used, it would presumably be XML-based,
which would enable individual catalogs to "look" different from one site
to another (using CSS or XSL), while the underlying data structures would
be the same, allowing for machine readability.
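
To illustrate the machine-readability point, here is a minimal sketch
that reads an Atom-style catalog with Python's standard library.  The
sample feed is invented, and treating a link of type
"application/atom+xml" as a pointer to another catalog is an assumption
for this example, not part of any agreed catalog format.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"  # Atom namespace (RFC 4287)

# Invented sample catalog; titles and URLs are placeholders.
SAMPLE = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example agency data catalog</title>
  <entry>
    <title>Budget 2009</title>
    <link rel="enclosure" type="text/csv"
          href="https://www.example.org/data/budget.csv"/>
  </entry>
  <entry>
    <title>Transport department</title>
    <link rel="related" type="application/atom+xml"
          href="https://www.example.org/transport/catalog.xml"/>
  </entry>
</feed>
"""


def parse_catalog(xml_text):
    """Return (title, href, kind) tuples for each entry in the feed."""
    root = ET.fromstring(xml_text)
    entries = []
    for entry in root.findall(f"{ATOM}entry"):
        title = entry.findtext(f"{ATOM}title", default="")
        link = entry.find(f"{ATOM}link")
        if link is None:
            continue  # skip entries that point nowhere
        # Assumption: an Atom-typed link means "another catalog".
        is_catalog = link.get("type") == "application/atom+xml"
        entries.append((title, link.get("href"), "catalog" if is_catalog else "data"))
    return entries
```

The same parser works no matter what CSS or XSL a site layers on top,
since styling does not change the underlying elements.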

By providing access to remote data storage, the Internet has been used to
publish data and documents.  Standard file names (index.htm, main.htm) are
used as HTML entry points for websites.  The default HTML file then uses
hypertext links to provide access to subsequent files.  In the same way
that HTML provides links to any file, I believe that standardized catalog
files pointing to sub-catalogs and data files could enable a more
searchable and usable web of data.  
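
A small sketch of the "standard entry point" idea, assuming a
hypothetical well-known file name (catalog.xml is a placeholder, not an
agreed convention): given any page on a site, a client could derive
where the root catalog should live.

```python
from urllib.parse import urljoin

# Hypothetical well-known file name, playing the role index.htm plays for HTML.
CATALOG_FILENAME = "catalog.xml"


def root_catalog_url(page_url):
    """Return the URL of the catalog file at the root of the page's site."""
    return urljoin(page_url, "/" + CATALOG_FILENAME)
```

For example, root_catalog_url("https://www.example.org/agency/reports/index.htm")
gives "https://www.example.org/catalog.xml".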

Joe

-----Original Message-----
From: public-egov-ig-request@w3.org [mailto:public-egov-ig-request@w3.org]
On Behalf Of Antti Poikola
Sent: Friday, December 18, 2009 1:10 AM
To: Jonathan Gray
Cc: Steven Clift; public-egov-ig@w3.org; sunlightlabs@groups.google.com
Subject: Re: Public Data Catalog Priorities and Demand

Hi,

Please Jonathan, Steven and others, let us know if you find some 
visualization, categorization or prioritization that would clarify the 
"swamp" of public sector information sources.

I'm looking for two things:

1. An easy way to get the BIG PICTURE of what kind of public sector 
information most probably exists (even if it is not open yet)
in a typical country or city.

2. Some priorities from the information re-users' point of view

So far I have found only listings and catalogues that can be re-ordered 
according to some topics (for example CKAN and data.gov), but these do 
not really help to give the big picture. With this kind of catalogue 
it is easy to find a specific data source if you know what you are 
looking for, but if you just want to see what is out there and build an 
overview, the catalogues are not so helpful.

Best regards

-Antti "Jogi" Poikola


Jonathan Gray wrote:
> Just to let you know, we're currently working on this with CKAN.net.
> Also very interested in thinking about how we can track how different
> datasets are reused.
>
> Jonathan
>
> On Mon, Nov 23, 2009 at 4:20 PM, Steven Clift <clift@e-democracy.org> wrote:
>   
>> Has anyone explored what government data is in highest "demand" on the
>> emerging public data reuse sites? How does interest from different
>> re-user audiences vary (e.g., business, media, open gov advocates,
>> independent coders, etc.)?
>>
>> Also, has anyone started a comparison chart of what different
>> governments are providing? It would be interesting to quickly see what
>> different national or local governments are providing now and over
>> time. This gets to "what's important" to release for easy reuse
>> versus what is the easiest or least politically sensitive.
>>
>> Steven Clift
>> E-Democracy.org
>>
>> --
>> Steven Clift - http://stevenclift.com
>>  Executive Director - http://E-Democracy.Org
>>  Follow me - http://twitter.com/democracy
Received on Friday, 18 December 2009 15:11:13 GMT
