RE: Advancing the GCT Proposal from Dennis E. Hamilton on 2015-06-22 (public-change@w3.org from June 2015)

From: Dennis E. Hamilton <dennis.hamilton@acm.org>
Date: Mon, 22 Jun 2015 13:43:02 -0700
To: <public-change@w3.org>
Message-ID: <00e301d0ad2c$0993ecc0$1cbbc640$@acm.org>
In my reading (!) I want to stay out of the behavioral and participant issues and stick to the technical matters.

Point of clarification: I presume that "machine-readable," as used below, has to do with content discrimination, rather than format recognition.  

It is very clear that CSV, JSON, XML, and spreadsheets with header columns (and, of course, database-maintained tables) are considered machine-readable, while text files, PDF, etc., are considered not machine-readable.  (I happen to believe that there is a serious technical misunderstanding here, but that is probably beside the point in the context of CTMarkup.)

I don't want to go off the deep end about what "readable" means, especially for an XML file.  I do observe that having explicit syntactical structuring is not the same as material being inherently comprehensible.  I think the presumption of machine-readability may be unfortunate in that there seems to be some magical thinking around what that makes inherently available.  To that extent, I find that particular definition of "machine-readable" to be unhelpful.
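
The distinction between explicit syntax and comprehensibility can be sketched in a few lines of Python. This is only an illustration: the document string `OPAQUE` and its element names are hypothetical, not taken from any real data set. The parser accepts the document and can enumerate its structure, yet nothing in the document says what the names or values denote.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: well-formed XML whose element and attribute names
# carry no meaning a program could act on without outside knowledge.
OPAQUE = "<a><b c='7'>x19</b><b c='3'>q02</b></a>"

root = ET.fromstring(OPAQUE)  # parsing succeeds: the syntax is explicit

# A program can enumerate the structure mechanically...
structure = [(el.tag, dict(el.attrib), el.text) for el in root.iter("b")]
print(structure)

# ...but nothing here says what "b", "c", or "x19" stand for. That knowledge
# has to come from a schema, documentation, or shared convention.
```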

I presume that there is thought to be some crucial connection with embedded metadata (e.g., RDF and RDFa) and/or highly-structured organization.

I will examine the other references you have provided.

 - Dennis

-----Original Message-----
From: Owen Ambur [mailto:Owen.Ambur@verizon.net] 
Sent: Monday, June 22, 2015 12:45
To: public-change@w3.org
Subject: RE: Advancing the GCT Proposal

Dennis, see http://xml.fido.gov/stratml/drybridge/index.htm#M-13-13 --
particularly http://xml.fido.gov/stratml/carmel/M-13-13-IGwStyle.xml &
http://xml.fido.gov/stratml/carmel/iso/PODDwStyle.xml 

I am unaware of any explicit guidance to track changes in machine-readable
government data.  However, to the degree the objectives include openness,
interoperability, and data standardization, the generic case for change
tracking is equally applicable to more specific, valid XML documents
(records).  http://xml.fido.gov/stratml/carmel/M-13-13wStyle.xml#values_ 

Here is how OMB defines "machine-readable" in Circular A-11 (section
200-15):

 Machine Readable Format. Format in a standard computer language (not
 English text) that can be read automatically by a web browser or
 computer system. (e.g.; xml). Traditional word processing documents,
 hypertext markup language (HTML) and portable document format (PDF)
 files are easily read by humans but typically are difficult for
 machines to interpret. Other formats such as extensible markup
 language (XML), (JSON), or spreadsheets with header columns that can
 be exported as comma separated values (CSV) are machine readable
 formats. It is possible to make traditional word processing documents
 and other formats machine readable but the documents must include
 enhanced structural elements.
https://www.whitehouse.gov/sites/default/files/omb/assets/a11_current_year/s200.pdf
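
As a rough illustration of what the definition above means by "spreadsheets with header columns that can be exported as comma separated values," here is a minimal Python sketch. The agency names, column names, and figures are invented for the example. With a header row, a program can address values by column name; the same facts written as prose would need custom text parsing.

```python
import csv
import io

# Hypothetical data: the same two facts as CSV with a header row and as prose.
CSV_TEXT = (
    "agency,year,budget_usd\n"
    "ExampleAgency,2015,1200\n"
    "OtherAgency,2015,800\n"
)
PROSE = "In 2015 ExampleAgency was budgeted 1200 dollars and OtherAgency 800."

# The header row lets a program pick out values by column name.
rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))
total = sum(int(row["budget_usd"]) for row in rows)
print(total)

# PROSE states the same facts, but extracting them would take ad hoc parsing.
```

Even here, the program works only because its author already knew what "budget_usd" means; the header makes the values addressable, not self-explanatory.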

See also https://en.wikipedia.org/wiki/Electronic_discovery#Types_of_ESI &
https://en.wikipedia.org/wiki/Electronic_discovery#Emerging_trends  

The latter comes pretty close to documenting the generic use case for change
tracking but treating the requirement generically increases the cost of
discovery and, thus, the cost of litigation.  It's one thing if .com's want
to behave badly and assume the risk of not getting away with it, but it is
quite another for .gov agencies to waste the taxpayers' money doing so.
Enabling .gov officials to cover their "tracks" ENABLES waste, fraud, and
abuse.  Providing the means to effectively track and report the creation and
alteration of records is one small step to be taken in the right direction
to encourage better performance.  Another is creating and maintaining
records in valid XML format throughout their full life cycles.

While it would be unrealistic to think those things might be done in the
near future, to fail to aim in those directions is not only short-sighted
but also inexcusable. 

Owen

-----Original Message-----
From: Dennis E. Hamilton [mailto:dennis.hamilton@acm.org] 
Sent: Monday, June 22, 2015 12:44 AM
To: public-change@w3.org
Subject: RE: Advancing the GCT Proposal

Owen,

Thanks for the clarifying context.  

That's very useful.

Is this 2013-05-09
<https://www.whitehouse.gov/the-press-office/2013/05/09/executive-order-making-open-and-machine-readable-new-default-government->
a definitive source for XO 13642?

Do we know the stage that implementation is in?  Timelines in the order are
contingent on the issuance of an Open Data Policy by the OMB Director. 

I see Memorandum 13-13, <https://project-open-data.cio.gov/policy-memo/>,
evidently under active maintenance at
<https://github.com/project-open-data/project-open-data.github.io/commits/master>,
although that seems to be about more than the policy, extending to
tools and data formats for facilitating consistent satisfaction of the
policy.  I assume this supports various notions of preservation, access,
reuse and repurposing.  There is active development.

This seems to be an extensive and comprehensive effort that goes much
farther than generalized change-tracking of XML Documents.  I am not clear
how the work of CTMarkup fits in that picture and what any sort of intercept
might be.  (I have not discerned a connection with StratML either, but have
not looked closely.)

Can you point us to any place where change-tracking mechanisms are called
out?

 - Dennis





-----Original Message-----
From: Owen Ambur [mailto:Owen.Ambur@verizon.net]
Sent: Sunday, June 21, 2015 21:05
To: public-change@w3.org
Subject: RE: Advancing the GCT Proposal

Dennis, see my responses to your questions below in [brackets].

Owen Ambur
Chair, AIIM StratML Committee
Co-Chair Emeritus, xml.gov CoP
Webmaster, FIRM

-----Original Message-----
From: Dennis E. Hamilton [mailto:dennis.hamilton@acm.org]
Sent: Tuesday, June 16, 2015 5:26 PM
To: public-change@w3.org
Subject: RE: Advancing the GCT Proposal

Well, Toto, we are not in Kansas anymore.

I have no idea what this is all about and I hope that it is not intended to
be a meaningful conversation on the CTMarkup list.

The only thing I see in this particular screed is a claim that
change-tracking and XML documents that are change-tracked be subject to
domain semantics.  

In terms of something actionable, I am not certain what the recommendation
is.  Could it be one of,

 1. A generic mechanism is useless?

[I don't believe a generic mechanism would be useless.  However, I
personally am more interested in seeing reality made of guidance like that
issued in Executive Order 13642 by President Obama, making
machine-readability the avowed *default* for government information.
http://xml.fido.gov/stratml/carmel/EOOMRDwStyle.xml]

[I recognize that others have different opinions and believe that CSV, JSON,
and non-standard XML may be "good enough for government work," but I believe
an unrecognized implication of the President's guidance is that XML schemas
(XSDs) should be specified for all Federal records series, including, for
example, Executive Orders.
http://www.archives.gov/research/start/how-records-grouped.html  Indeed, I
take the failure to specify such XSDs as yet another case of bureaucratic
double-speak compounded by the hubris of feeling empowered to direct others
to "do as I say, not as I do."]

[It seems to me the President's directive sets forth good practice not only
for agencies at all levels of government, worldwide, but also all
organizations whose documents should be matters of public record.]

[Again, I recognize that others may have other opinions, but my opinion is
not exactly new.  It is closely related to the two proposals I made that
prompted the Federal CIO Council to charter the xml.gov CoP in 2000 --
http://xml.fido.gov/documents/completed/genesis.htm -- and to ask me to
chair the group for the last six years of my career.]  

 2. A generic mechanism is useless without an accommodation of domain
considerations?

[My sense is that you doth protest too much.  I haven't heard anyone suggest
that generic capabilities would be useless ... just that they fall short of
what is to be desired.  If such capabilities are the best that can be
provided now, I'd say full speed ahead.  Often it is impossible to move to
higher levels of maturity without first passing through the lower levels.]

 3. Domain considerations must be dealt with from the get-go? 

[No, not necessarily, but it would be nice.  Plus I suspect that
authoring/editing tools that are designed from the ground-up to deal with
valid XML instance documents may have the capability to track changes in
them anyway.  So to the degree that may be the case, the only question is
whether the documentation of such changes can be freely and accurately
shared across XML authoring/editing tools.  Toward that end, an XML change
tracking standard capable of dealing with valid XML instance documents would
be nice to have.]

 4. There is some domain of interest that folks have a shared interest in
seeing served?

[Again, I can't speak for anyone else, but I personally am intensely
interested in tools, apps, and services supporting the StratML standard (ISO
17469-1), whose vision is:  "A worldwide web of intentions, stakeholders,
and results."  If others do not share my interest in that vision, no harm,
no foul.]

[However, if they are willing to share their intentions with me, whatever
their intentions may be, I'll be happy to render them in StratML format for
inclusion in our collection at
http://xml.fido.gov/stratml/drybridge/index.htm#Other  Indeed, I'd love to
see the goals and objectives of all of the W3C's groups, as well as other
SDOs, rendered in open, standard, machine-readable StratML format ... so the
rest of us could understand what they are trying to accomplish... a novel
concept, don't you think ... sharing objectives more efficiently, in
machine-readable format, with others who may wish to help accomplish them?
http://xml.fido.gov/stratml/drybridge/index.htm#W3C]

I think looking at a generic mechanism that accomplishes what it
accomplishes is fine.  

[If anyone disagrees with you on that point, I don't care to hear from them
... because I don't disagree with you.]

To the extent that one needs to know an application domain to correctly
change an XML document (with or without tracked changes), GCT is not enough.


[Perhaps you are correct, but I will be surprised if the leading XML
authoring/editing tools are incapable of keeping track of changes to valid
XML instance documents.]

That's like saying programming language grammars are not context free, yet
providing context-free grammars for them, along with separately stated
semantic conditions/constraints is of great value.  I think one approach is
to view GCT the same way -- as a context-free treatment that is always
valid, but the semantic constraints that limit it beyond syntactic matters
have to be known to have produced it properly.
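
The analogy can be made concrete with a minimal Python sketch. Everything in it is an assumption made for illustration: the `plan`/`goal`/`name` vocabulary, the `delete_child` helper, and the rule that every goal must have a name are hypothetical, and none of it is the GCT proposal's actual mechanism. The generic edit always leaves a well-formed document and records what it changed, while the domain rule is stated and checked separately.

```python
import copy
import xml.etree.ElementTree as ET

# Hypothetical document and vocabulary, for illustration only.
DOC = (
    "<plan><goal><name>Openness</name></goal>"
    "<goal><name>Reuse</name></goal></plan>"
)

def delete_child(root, parent_path, child_tag):
    """Generic, domain-blind edit: remove the first matching child element.

    Returns an edited copy plus a record of what was removed.
    """
    edited = copy.deepcopy(root)
    parent = edited.find(parent_path)
    child = parent.find(child_tag)
    parent.remove(child)
    record = {
        "op": "delete",
        "parent": parent_path,
        "removed": ET.tostring(child, encoding="unicode"),
    }
    return edited, record

def every_goal_has_name(root):
    """Separately stated semantic constraint (an assumed domain rule)."""
    return all(goal.find("name") is not None for goal in root.iter("goal"))

original = ET.fromstring(DOC)
edited, change = delete_child(original, "goal", "name")

# The generic edit is syntactically safe: the result serializes and reparses.
reparsed = ET.fromstring(ET.tostring(edited, encoding="unicode"))

print(change)
print(every_goal_has_name(original))  # rule checked on the original document
print(every_goal_has_name(reparsed))  # same rule checked after the generic edit
```

The tracked change is recorded without any knowledge of the domain; whether the edit was a proper one can only be judged by the separately stated rule.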

[I'm not sure I completely understand your point here.  However, again, I'll
be surprised if the leading XML authoring/editing applications are incapable
of tracking changes to XML instance documents that validate against schemas
like those for the StratML standard, and I personally am not particularly
interested in "generic" (non-valid/non-machine-readable) documentation.  I'd
like to think we might be close to the point of being able to move beyond
such documentation to higher levels of maturity.]

 - Dennis

Received on Monday, 22 June 2015 20:43:29 UTC