
Re: Ed and Outreach Opportunity

From: Chris Beer <chris-beer@grapevine.net.au>
Date: Thu, 28 Jan 2010 23:05:51 +1100
Message-ID: <4B617D9F.10304@grapevine.net.au>
To: David Pullinger <David.Pullinger@coi.gsi.gov.uk>
CC: 'Kevin Novak' <KevinNovak@aia.org>, Joe Carmel <joe.carmel@comcast.net>, 'Brian Gryth' <briangryth@gmail.com>, 'eGovIG IG' <public-egov-ig@w3.org>
Hey Brian, everyone

Wouldn't be right if I didn't pop the TF4 hat on and respond into the 
conversation ;) I already sent Brian an email offering to assist, but 
since we're doing this in list... :)

Personally and professionally, I have issues with "data", if not any 
government information, being published in PDF format, as well as with 
how PDF files are used in general, not only by Gov, but by the Private 
sector as well.

IMO there are only three reasons to use PDF (and only if you have to): 
a) as an archive snapshot of a document; b) for document control - that 
is, when you don't want a document to be altered by users, such as in 
the case of a manifestation or publication of a piece of legislation, 
tenders etc - hence why you can embed digital signatures, lock them from 
editing, etc; and c) with accessible Smart Forms, which are actually 
such a cool idea and so very useful as an assistive technology, for both 
the user and the owner. That said, these all still have issues around 
being in PDF.

The general usage, however, seems to be for anything and everything that 
can be published. Want a printable version? Download the PDF file.

Rather than focus on the pitfalls of using PDFs in the .gov.* space 
(which I'm more than happy to discuss with anyone - especially David in 
light of his comments ;) ), I'll focus on the topic at hand. I've had a 
look at the Fiscal Note Brian provided as well as the proposed Act and 
I'm a little stunned by the leap of logic in this sense.

A careful reading of the Bill reveals that throughout, information is 
required to be "posted on-line, in a downloadable format". Now if I was 
a clever Web Manager in charge of implementing my local school's 
requirements under this bill, I could quickly and easily meet these 
requirements through a CMS enabled website/database - the act of viewing 
a webpage is, by definition, downloading information. Not only that, but 
I could point at my model and highlight the fact that:

a) The data supports RDF(a), XML, StratML etc in a far more useful and 
usable format than a PDF version

b) I can send my schemas to other schools, or even the Department (who 
might want to create a centralised model) to enable consistency of data 
formatting, not just a pretty view of the data

c) I can deliver my data in a range of open standard formats, such as 
binary, CSV, HTML, XML, etc using very basic, free, vendor 
independent and accessible technologies

d) I can export a customisable view of this data on demand as a PDF file 
/if/ needed... (think the export as PDF function of Google Analytics 
dashboard reports.) But I can also export it in a variety of other 
proprietary formats on demand.

e) I can very easily track the usage and access of this data by the 
public through web analytics. If I track it well enough, and aggressively 
enough, I can start to analyse which parts of the data are the most 
useful (for instance I might well find that visits from .edu domains 
(ie: teachers) show a marked interest in salary schedule comparisons) 
and I can tailor the solution from a push Web 1.0 model to an 
information-on-demand Web 2.0 model.

f) I can allow others, including other arms of Local, State and Federal 
Governments, through APIs and mashups, to mix my data with other data 
to provide interesting information - like financial data mapped against 
student result averages.
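
To make points c) and d) a bit more concrete, here is a minimal sketch in Python (standard library only) of serving one set of records as CSV, XML and HTML from a single source. The records, field names, element names and function names are all made up for illustration - they are not taken from the Bill, the Fiscal Note or any real district system - so treat it as a sketch under those assumptions rather than a recommended implementation.

```python
import csv
import io
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical sample records; a real site would pull these from its CMS/database.
FIELDS = ["fund", "category", "amount"]
RECORDS = [
    {"fund": "General", "category": "Salaries", "amount": "1250000.00"},
    {"fund": "General", "category": "Supplies & Materials", "amount": "84000.50"},
]


def to_csv(records):
    """Serialise the records as CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()


def to_xml(records):
    """Serialise the records as a simple XML document (element names are assumptions)."""
    root = ET.Element("budget")
    for rec in records:
        item = ET.SubElement(root, "item")
        for field in FIELDS:
            ET.SubElement(item, field).text = rec[field]
    return ET.tostring(root, encoding="unicode")


def to_html(records):
    """Serialise the records as an HTML table, escaping cell contents."""
    head = "".join(f"<th>{escape(f)}</th>" for f in FIELDS)
    rows = "".join(
        "<tr>" + "".join(f"<td>{escape(rec[f])}</td>" for f in FIELDS) + "</tr>"
        for rec in records
    )
    return f"<table><tr>{head}</tr>{rows}</table>"


if __name__ == "__main__":
    print(to_csv(RECORDS))
    print(to_xml(RECORDS))
    print(to_html(RECORDS))
```

The point of the sketch is only that the data lives in one place and each format is a cheap view of it; a PDF export, if anyone insists on one, would just be one more view.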

A couple of other things to consider, with the financial and workload 
aspects in mind: technically (and correct me if I am wrong) each 
and every PDF release of this data would be classed as a government 
publication and would require not only ISBNs etc, but entry into 
the Library of Congress, or State equivalent, catalogues as well. A 
single website, being considered an Integrated Resource, technically 
would require only a single catalogue entry...

The Fiscal Note also reads "It is assumed that financial documents can 
be electronically converted into a portable document format (PDF) or 
image file (tiff, gif, jpg), and posted online at minimal cost, and that 
software to convert documents and software to modify websites is readily 
available at the district level."

Now that's an interesting assumption - and it is just that - an 
assumption, especially considering that publishing the information as 
HTML etc is effectively free.

These are only some initial thoughts, but you get the idea. Happy to 

David - would love to discuss your thoughts around the standards and 
governance on PDF, but it'd probably be off topic in this thread. Drop 
me a line and expand on things :)



David Pullinger wrote:
> Both,
> As well as separate data files, it is perfectly possible to embed 
> RDF(a) into PDF files, as other markup, and so provide access to 
> Linked Data thereby...
> We're considering whether or not to issue standards in this area so 
> that those who insist on releasing information in PDF files 
> nevertheless don't put a block on Linked Data.
> David
> David Pullinger
> david.pullinger@coi.gsi.gov.uk <mailto:david.pullinger@coi.gsi.gov.uk>
> Head of Digital Policy
> Central Office of Information
> Hercules House
> 7 Hercules Road
> London SE1 7DU
> 020 7261 8513
> 07788 872321
> Twitter #digigov and blogs:  www.coi.gov.uk/blogs/digigov 
> <http://www.coi.gov.uk/blogs/digigov>
> >>> "Joe Carmel" <joe.carmel@comcast.net> 26/01/2010 18:56 >>>
> Brian,
> One option to consider might be XForms (and XSLTForms in particular).  
> Although I’m not familiar with the school district financial data, it 
> seems like publishing an XForm on a central website and mandating that 
> school districts fill it out would be easy to create, maintain, and 
> implement.  The output files could then be posted centrally and/or 
> locally.    
> I’m working with Owen Ambur and several others on something like this 
> for StratML.  Check out 
> http://www.xmldatasets.net/XF2/stratmlxform3.xml.   It’s still being 
> developed but it might serve as an example.  The idea is to provide a 
> way to create, import, update, display, and finally catalog StratML 
> files across the web.   
> Joe
> *From:* public-egov-ig-request@w3.org 
> [mailto:public-egov-ig-request@w3.org] *On Behalf Of *Novak, Kevin
> *Sent:* Tuesday, January 26, 2010 12:13 PM
> *To:* Brian Gryth; eGovIG IG
> *Subject:* RE: Ed and Outreach Opportunity
> Brian,
> I am here to help you.
> I can provide input and opinion on the piece you are developing. I 
> concur with your assessment of PDF. Other options in addition must be 
> considered.
> Kevin
> *From:* public-egov-ig-request@w3.org 
> [mailto:public-egov-ig-request@w3.org] *On Behalf Of *Brian Gryth
> *Sent:* Tuesday, January 26, 2010 12:08 PM
> *To:* eGovIG IG
> *Subject:* Ed and Outreach Opportunity
> Good day all,
> Members of the Colorado General Assembly introduced legislation 
> recently that would mandate school districts to publish certain 
> financial data in a downloadable format.  The bill is HB10-1036 and 
> is available at http://legislink.org/us-co?HB10-1036.  This is a good 
> thing on the surface.  What concerns me is the fiscal impact statement 
> associated with the legislation.  The concerning part of the fiscal 
> impact statement focuses on the information being released in PDF 
> or in an image format (e.g. JPEG, TIFF, GIF), but does not talk about 
> other formats.  The fiscal note is available at http://bit.ly/80RBiu.  
> As has been discussed by this group and in other places, PDF only 
> publication is not the best method of publishing government data.
> Therefore, I saw this as a perfect opportunity for some education and 
> outreach.  I am planning on putting some summarized information 
> together that will discuss data publication methods to send to the 
> bill sponsors and other members of the Colorado legislature.  I also 
> plan on speaking at the Senate hearing for the bill as a concerned 
> citizen.
> I would appreciate the assistance of anyone wishing to help me out.  
> Please feel free to e-mail me and I will share a Google Doc I will be 
> using to draft the materials.
> Thanks
> Brian
Received on Thursday, 28 January 2010 12:06:15 UTC
