RE: Ed and Outreadch Opportunity

From: Owen Ambur <Owen.Ambur@verizon.net>
Date: Fri, 29 Jan 2010 15:50:20 -0500
To: "'eGovIG IG'" <public-egov-ig@w3.org>
Message-id: <003401caa124$ac0df190$0429d4b0$@Ambur@verizon.net>
Dave makes an important point that is often "glossed" over in the heat of
debate, which is that *style* is commonly valued more highly than
*substance*.  However, that is not to suggest "style" (presentation) does
not matter at all.  

 

Indeed, in terms of records, presentation is *critical*.  Data without
context is meaningless.  And it is impossible to know what someone intended
when they did what they did if we cannot see *exactly* what they saw when
they did it.  That's why the U.S. Courts use PDF/A and have led its
establishment as an international standard.  That's also how the techie
weenies have led us badly astray -- by pursuing the "separation of content
from presentation" as the holy grail.  Separating content from its
presentation is one way, if not the primary way, of creating records lacking
integrity and, thus, *enabling* waste, fraud, and abuse.

 

However, an even more relevant point is that this debate represents a
*false choice* -- between highly stylized presentation versus
machine-readable, readily referenceable and reusable content.

 

Having converted more than 500 strategic plans from PDF to StratML format, I
can tell you it is a real pain in the tail, as well as highly unnecessary,
since StratML documents (XML files) can automatically be converted to PDF
but the reverse is not true.

 

See, for example, the PDF renditions in XML Simplicity's StratML Editorial
Portal at http://xml.gov/stratml/index.htm#XMLSimplicity or, more
specifically, http://xml-simplicity.com/pls/apex/f?p=137:1:3400835562013304


 

See also Adobe's PDF Fillable form for StratML documents, at
http://xml.gov/stratml/index.htm#Forms or, more specifically,
http://xml.gov/stratml/forms/StrategicPlan.pdf  It creates StratML (XML)
files, which can be automatically transformed into PDF as desired (as well
as HTML, RTF, and other formats, as demonstrated in XML Simplicity's StratML
portal). 
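To illustrate the one-way nature of that conversion, here is a minimal sketch of rendering a plan held as XML into HTML. The element names (StrategicPlan, Name, Goal, Objective) are a simplified, hypothetical stand-in and should not be read as the actual StratML schema; the same pattern would apply to any presentation format generated from the XML source.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical, simplified plan markup for illustration only;
# this is NOT the real StratML schema.
PLAN_XML = """<StrategicPlan>
  <Name>Example Agency Plan</Name>
  <Goal>
    <Name>Open Data</Name>
    <Objective>Publish budgets in machine-readable form</Objective>
    <Objective>Offer PDF renditions on demand</Objective>
  </Goal>
</StrategicPlan>"""


def plan_to_html(xml_text: str) -> str:
    """Render the structured plan as a simple HTML fragment."""
    root = ET.fromstring(xml_text)
    parts = [f"<h1>{escape(root.findtext('Name', default=''))}</h1>"]
    for goal in root.findall("Goal"):
        parts.append(f"<h2>{escape(goal.findtext('Name', default=''))}</h2>")
        parts.append("<ul>")
        for objective in goal.findall("Objective"):
            parts.append(f"  <li>{escape(objective.text or '')}</li>")
        parts.append("</ul>")
    return "\n".join(parts)


html_out = plan_to_html(PLAN_XML)
print(html_out)
```

Going the other way, from a rendered page back to goals and objectives, would mean guessing the structure from layout, which is why the conversion is easy in one direction only.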

 

However, I am not among those who will bash Adobe for developing PDF.
Indeed, I laud them for addressing a real (business) need as well as
recognizing the need to establish it as an open, international standard.  

 

By contrast, I chide the W3C for failing to consider PureEdge's proposal to
establish eXtensible Forms Description Language (XFDL) as a W3C
Recommendation.  Shame on the W3C!   And that applies doubly to anyone who
criticizes Adobe and PDF for addressing a need the W3C chose to ignore.

 

The bottom line, as Dave suggests, is that this is not a debate that needs
to occur.  The correct answer is clearly not either/or but, rather, *both*
substance *and* style.  However, as Stephen Covey has noted, first things
should be addressed first, and let me assure you that I am going to be one
ticked off taxpayer if .gov agencies continue to insist upon flaunting style
over substance in publishing their strategic and performance plans
(including their open gov plans).

 

Owen

 

From: public-egov-ig-request@w3.org [mailto:public-egov-ig-request@w3.org]
On Behalf Of Dave McAllister
Sent: Friday, January 29, 2010 9:45 AM
To: David Pullinger; chris-beer@grapevine.net.au
Cc: Kevin' 'Novak; Joe Carmel; 'Brian Gryth'; 'eGovIG IG'
Subject: Re: Ed and Outreadch Opportunity

 

Well, I admit having the usual gut-check that "here we go again".  PDF (an
ISO standard for document presentation, ISO 32000-1, with -2 development
underway) does offer the ability to include the raw data and information. It
isn't often used, since the goal of "presentation" of final format is often
considered more important than access to raw data. 

To me, it's a matter of education rather than format.  

Davemc


-- 
Dave McAllister
Director, Open Source and Standards 
Core Services, Experience and Technology Organization
Adobe Systems, Inc.
W: +1 408-536-3881
GC: +1 650-523-4942 

Linux Journal Editorial Adviser, http://www.linuxjournal.com


On 1/29/10 6:27 AM, "David Pullinger" <David.Pullinger@coi.gsi.gov.uk>
wrote:

Chris,
 
Let me assure you that I'm not in favour of PDF for data or communication,
the critical words were ...'those who insist on..'   Let me draw a
comparison.  The government is not in favour of people taking drugs.  But we
provide information to help those who do.  Our friends at Adobe should not
draw the analogy too far as I just mean that sometimes we engage in harm
reduction - in this case to get at good re-usable data.  
 
David
 
 
 
 
 
David Pullinger
david.pullinger@coi.gsi.gov.uk
Head of Digital Policy
Central Office of Information
Hercules House
7 Hercules Road
London SE1 7DU
020 7261 8513
07788 872321

Twitter #digigov and blogs:  http://www.coi.gov.uk/blogs/digigov



>>> Chris Beer <chris-beer@grapevine.net.au> 28/01/2010 12:05 >>>
Hey Brian, everyone

Wouldn't be right if I didn't pop the TF4 hat on and respond into the
conversation ;) I already sent Brian an email offering to assist, but since
we're doing this in list... :)

Personally and professionally,  I have issues with "data", if not any
government information, being published in PDF formats as well as how PDF
files are used in general, not only by Gov, but by the Private sector as
well.

IMO the only three reasons (and only if you had to) to use PDF are a) as an
archive snapshot of a document; b) for document control, that is, when
you don't want a document to be altered by users, such as in the case of a
manifestation or publication of a piece of legislation, tenders etc. (hence
why you can embed digital signatures, lock them from editing, etc.); and
c) with accessible Smart Forms, which are actually just such a cool idea and
so very useful as an assistive technology, for both the user and the
owner. That said, these all still have issues around being in PDF.

The general usage, however, seems to be for anything and everything that can
be published. Want a printable version? Download the PDF file.

Rather than focus on the pitfalls of using PDFs in the .gov.* space (which
I'm more than happy to discuss with anyone - especially David in light of
his comments ;) ), I'll focus on the topic at hand. I've had a look at the
Fiscal Note Brian provided as well as the proposed Act and I'm a little
stunned by the leap of logic in this sense.

A careful reading of the Bill reveals that throughout, information is
required to be "posted on-line, in a downloadable format". Now if I was a
clever Web Manager in charge of implementing my local school's requirements
under this bill, I could quickly and easily meet these requirements through
a CMS enabled website/database - the act of viewing a webpage is, by
definition, downloading information. Not only that, but I could point at my
model and highlight the fact that:

a) The data supports RDF(a), XML, StratML etc in a far more useful and
usable format than a PDF version

b) I can send my schemas to other schools, or even the Department (who might
want to create a centralised model) to enable consistency of data
formatting, not just a pretty view of the data

c) I can deliver my data in a range of open standard formats, such as
binary, CSV, HTML, XML, etc. using very basic, free, vendor independent and
accessible technologies

d) I can export a customisable view of this data on demand as a PDF file if
needed... (think the export as PDF function of Google Analytics dashboard
reports.) But I can also export it in a variety of other proprietary formats
on demand.

e) I can very easily track the usage and access of this data by the public
through web analytics. If I track it well enough, and aggressively enough, I
can start to analyse which parts of the data are the most useful (for
instance I might well find that visits from .edu domains (ie: teachers) show
a marked interest in salary schedule comparisons) and I can tailor the
solution from a push Web 1.0 model to an information on demand Web 2.0
model.

f) I can allow others, including other arms of Local, State and Federal
Governments, through APIs and mashups, to mix my data with other data to
provide interesting information - like financial data mapped against student
result averages.
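
As a rough illustration of point c), the sketch below serialises one set of records into CSV, XML, and JSON using only the Python standard library. The field names and figures are made up for the example, and a real site would presumably pull the rows from its CMS or database rather than a hard-coded list.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

# Made-up sample rows for illustration; a real site would read
# these from its own database.
ROWS = [
    {"district": "Example District A", "category": "Salaries", "amount": "1200000"},
    {"district": "Example District A", "category": "Facilities", "amount": "300000"},
]


def to_csv(rows):
    """Return the rows as CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rows[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


def to_xml(rows):
    """Return the rows as a small XML document (ad hoc element names)."""
    root = ET.Element("records")
    for row in rows:
        record = ET.SubElement(root, "record")
        for key, value in row.items():
            ET.SubElement(record, key).text = value
    return ET.tostring(root, encoding="unicode")


def to_json(rows):
    """Return the rows as pretty-printed JSON."""
    return json.dumps(rows, indent=2)


csv_out, xml_out, json_out = to_csv(ROWS), to_xml(ROWS), to_json(ROWS)
print(csv_out)
```

The point of keeping a single structured source is that each additional output format is a small function like these, whereas data published only as PDF or an image has to be re-keyed or scraped before it can be reused.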

A couple of other things to consider with the financial and workload aspects
in mind, is that technically (and correct me if I am wrong) each and every
PDF release of this data would be classed as a government publication and
will require not only ISBN numbers etc, but entry into the Library of
Congress, or State equivalent, catalogues as well. A single website, being
considered as an Integrated Resource, technically would require only a
single catalogue entry...

The Fiscal Note also reads "It is assumed that financial documents can be
electronically converted into a portable document format (PDF) or image file
(tiff, gif, jpg), and posted online at minimal cost, and that software to
convert documents and software to modify websites is readily available at
the district level."

Now that's an interesting assumption - and it is just that - an assumption,
considering publishing the information as HTML etc is effectively free.

These are only some initial thoughts, but you get the idea. Happy to
discuss.

David - would love to discuss your thoughts around the standards and
governance on PDF, but it'd probably be off topic in this thread. Drop me a
line and expand on things :)

Cheers

Chris




David Pullinger wrote: 

Both,
 
As well as separate data files, it is perfectly possible to embed RDF(a)
into PDF files, as other markup, and so provide access to Linked Data
thereby...
 
We're considering whether or not to issue standards in this area so that
those who insist on releasing information in PDF files nevertheless don't
put a block on Linked Data.
 
David
 
 
 
 
 



>>> "Joe Carmel" <joe.carmel@comcast.net> 26/01/2010 18:56 >>>
Brian,
One option to consider might be XForms (and XSLTForms in particular).
Although I'm not familiar with the school district financial data, it seems
like publishing an XForm on a central website and mandating that school
districts fill it out would be easy to create, maintain, and implement.  The
output files could then be posted centrally and/or locally.    
I'm working with Owen Ambur and several others on something like this for
StratML.  Check out http://www.xmldatasets.net/XF2/stratmlxform3.xml.   It's
still being developed but it might serve as an example.  The idea is to
provide a way to create, import, update, display, and finally catalog
StratML files across the web.
Joe

From: public-egov-ig-request@w3.org [mailto:public-egov-ig-request@w3.org]
On Behalf Of Novak, Kevin
Sent: Tuesday, January 26, 2010 12:13 PM
To: Brian Gryth; eGovIG IG
Subject: RE: Ed and Outreadch Opportunity
Brian,
I am here to help you.
I can provide input and opinion on the piece you are developing. I concur
with your assessment of PDF. Other options must also be considered.
Kevin
From: public-egov-ig-request@w3.org [mailto:public-egov-ig-request@w3.org]
On Behalf Of Brian Gryth
Sent: Tuesday, January 26, 2010 12:08 PM
To: eGovIG IG
Subject: Ed and Outreadch Opportunity

Good day all,

Members of the Colorado General Assembly introduced legislation recently
that would mandate school districts to publish certain financial data in a
downloadable format.  The bill is HB10-1036 and is available at
http://legislink.org/us-co?HB10-1036.  This is a good thing on the surface.
What concerns me is the fiscal impact statement associated with the
legislation.  The concerning part of the fiscal impact statement focuses on
the information being released in PDF or in an image format (e.g. JPEG,
TIFF, GIF), but does not talk about other formats.  The fiscal note is
available at http://bit.ly/80RBiu.  As has been discussed by this group and
in other places, PDF only publication is not the best method of publishing
government data.

Therefore, I saw this as a perfect opportunity for some education and
outreach.  I am planning on putting some summarized information together
that will discuss data publication methods to send to the bill sponsors and
other members of the Colorado legislature.  I also plan on speaking at the
Senate hearing for the bill as a concerned citizen.

I would appreciate the assistance of anyone wishing to help me out.  Please
feel free to e-mail me and I will share a Google Doc I will be using to
draft the materials.

Thanks
Brian

This communication is confidential and copyright.
Anyone coming into unauthorised possession of it should disregard its
content and erase it from their records.

The original of this email was scanned for viruses by Government Secure
Intranet (GSi) virus scanning service supplied exclusively by Cable &
Wireless in partnership with MessageLabs.
On leaving the GSI this email was certified virus free.
The MessageLabs Anti Virus Service is the first managed service to achieve
the CSIA Claims Tested Mark (CCTM Certificate Number 2006/04/0007), the UK
Government quality mark initiative for information security products and
services. For more information about this please visit www.cctmark.gov.uk
<http://www.cctmark.gov.uk> 



Received on Friday, 29 January 2010 20:51:26 GMT
