OGD strategies: scenarios and design patterns [was: Re: charter and publication wrt W3C Process]

From: Jose M. Alonso <josema@w3.org>
Date: Sun, 31 May 2009 23:45:58 +0200
Cc: Joe Carmel <joe.carmel@comcast.net>, "Tumin, Zachary" <Zachary_Tumin@hks.harvard.edu>, Owen Ambur <Owen.Ambur@verizon.net>, eGov IG <public-egov-ig@w3.org>, Kevin Novak <kevinnovak@aia.org>, John Sheridan <John.Sheridan@nationalarchives.gsi.gov.uk>
Message-Id: <4DC82CE8-9203-4CB6-8685-970CC551A9EA@w3.org>
To: Pito Salas <rps@salas.com>
All, thanks for the interesting discussion so far. Some hints on how  
to publish OGD are in the Note itself, namely at:
http://www.w3.org/TR/2009/NOTE-egov-improving-20090512/#OGD.how

I'm not suggesting an ANSWER yet, but here are some more thoughts.

I don't think we need yet another format/protocol. I think some Semantic
Web (SW) technologies, properly combined, can achieve the same effect. In
fact, concepts such as "endpoint" already exist in SPARQL [1], and I would
prefer to use standard HTTP response codes, as in Cool URIs for the
Semantic Web [2].
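
As a rough sketch of what "standard HTTP response codes" could mean here,
the function below answers a request for a "thing" URI with a 303 See
Other redirect chosen by the Accept header. The /id/, /data/ and /doc/
path layout and the function name are assumptions made for illustration,
and the negotiation is deliberately naive.

```python
def see_other(path, accept):
    """Return (status, headers) for a request to a 'thing' URI under /id/.

    Hypothetical layout: /id/<name> identifies the thing itself,
    /data/<name> is its RDF description, /doc/<name> its HTML page.
    """
    if not path.startswith("/id/"):
        return 404, {}
    name = path[len("/id/"):]
    # Naive negotiation: q-values and wildcards are ignored in this sketch.
    media_types = [part.split(";")[0].strip() for part in accept.split(",")]
    if "application/rdf+xml" in media_types:
        target = "/data/" + name
    else:
        target = "/doc/" + name
    # 303 See Other: the URI names a thing, the Location describes it.
    return 303, {"Location": target, "Vary": "Accept"}
```

For example, see_other("/id/dataset16", "application/rdf+xml") returns
status 303 with Location set to /data/dataset16, while a browser sending
text/html is redirected to /doc/dataset16.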

I personally think that repository schemas can fill a gap anyway
(although I still need to learn more about all the work on URI schemes
out there).

Let me explain how I see it...

John mentioned a while ago the concept of "design patterns" and I  
think we also need to identify scenarios.

For me there are two very high level scenarios: document-centric and  
database-centric.

Then, there's a need to decide how to go from starting point A  
(original data) to ending point B (published dataset).

A scenario can be 20,000 documents in Word format or 350M in PDF or  
75,000 in XML or 15,000 records in a single table in a MS Access DB,  
just to name a few.

A design pattern would be a method that would allow one to go from one  
of those scenarios to a published (linkable) dataset. Maybe we could  
also talk about publishing scenarios here: CSV file, XML docs, RDF  
triples, a REST service?

More specific example: how can I publish those 15,000 records in a
single MS Access DB as RDF, making a SPARQL endpoint available so that
they can be added to the LOD cloud?

The answer could be to use something like the upcoming RDB2RDF, use
existing vocabularies where possible to describe the data, create
local ones where not, etc.
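
A minimal sketch of that idea, assuming the table has already been read
into Python dicts: known columns are mapped to an existing vocabulary
(Dublin Core terms here) and everything else falls back to a local one.
This is not RDB2RDF itself; the example.org namespace, the column names
and the function names are hypothetical.

```python
BASE = "http://example.org/"            # hypothetical publisher namespace
DCTERMS = "http://purl.org/dc/terms/"   # existing vocabulary
LOCAL = BASE + "vocab#"                 # local vocabulary for the rest

# Columns that an existing vocabulary already covers (assumed names).
KNOWN = {"title": DCTERMS + "title", "issued": DCTERMS + "issued"}


def escape(value):
    """Apply the basic N-Triples string escapes."""
    return (value.replace("\\", "\\\\").replace('"', '\\"')
                 .replace("\n", "\\n").replace("\r", "\\r"))


def row_to_ntriples(table, key, row):
    """Turn one table row into N-Triples lines, one per non-key column."""
    # Assumes key values are URI-safe (for example integers).
    subject = "<%s%s/%s>" % (BASE, table, row[key])
    lines = []
    for column, value in row.items():
        if column == key or value is None:
            continue
        predicate = KNOWN.get(column, LOCAL + column)
        lines.append('%s <%s> "%s" .' % (subject, predicate, escape(str(value))))
    return lines
```

Calling row_to_ntriples("dataset", "id", row) for each of the 15,000
rows and concatenating the lines gives a file that can be loaded into
any triple store that offers a SPARQL endpoint.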

This one, when made generic enough, would become a design pattern
applicable to other cases.
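
To make the end state of such a pattern concrete, here is a sketch of
how a client might address the resulting endpoint: under the SPARQL
protocol the query text travels in a "query" parameter of a GET request.
The endpoint URL and the query are made up for illustration, and no
request is actually sent.

```python
from urllib.parse import urlencode

ENDPOINT = "http://example.org/sparql"  # hypothetical endpoint

QUERY = """\
PREFIX dcterms: <http://purl.org/dc/terms/>
SELECT ?record ?title
WHERE { ?record dcterms:title ?title }
LIMIT 10"""


def query_url(endpoint, query):
    """Build the GET URL that carries a SPARQL query to an endpoint."""
    return endpoint + "?" + urlencode({"query": query})


if __name__ == "__main__":
    print(query_url(ENDPOINT, QUERY))
```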

So, I believe there is no single ANSWER, and that most of the effort  
of the IG should be spent on identifying the scenarios and building  
the design patterns. This might well be the ANSWER itself with  
applicability everywhere. Whether this should be normative or not, I
don't know. I don't think it matters much as long as they are useful
for people.
We could release these one by one, like the TAG findings [3].

We could also add to the mix some piloting, some "learning by example"  
as some suggested already, i.e. a bottom-up approach combined with the  
more generic top-down one.

Doing this alone would be huge; I'm sure you can all already envision
the amount of work that would be needed.

And I presume we will need to address more than OGD in the 2nd
charter, and there's the 2nd version of the Note some mentioned... too
many interesting things to pick from? ;)

-- Jose

[1] http://www.w3.org/TR/rdf-sparql-query/
[2] http://www.w3.org/TR/cooluris/
[3] http://www.w3.org/2001/tag/findings


On 27/05/2009, at 15:22, Pito Salas wrote:
> Hi
>
> Interesting thread; I will sign up to the list.. In the meanwhile:
>
> 1) I am no longer calling this thing "datarss" as that has created
> more confusion than clarity. I haven't updated the documents yet, but
> I think a better working name for this is "decentralized data
> discovery"
> 2) Here is a 'worked' example that might be of interest:
> http://www.scribd.com/doc/14136777/DataRss-Tech-Overview. Forgive that
> the examples are in YAML, but it's simple to translate that to XML, of
> course.
> 3) I am also not saying that this is THE ANSWER, but it suggests a
> direction.
> 4) I submitted this as an idea on the opengov list:
> http://opengov.ideascale.com/akira/dtd/3219-4049
>
> I am continuing to work on DDD as a side project. If anyone is
> interested in learning more, comparing notes, or helping...?
>
> On Wed, May 27, 2009 at 8:32 AM, Joe Carmel <joe.carmel@comcast.net> wrote:
>> As an experiment and first step, I've converted the non well-formed HTML at
>> data.gov (http://www.data.gov/catalog/category/0/agency/0/filter//type#raw)
>> to a well-formed data-centric XML file
>> (http://www.xmldatasets.net/data.gov/catalog.xml).
>>
>> This is pretty much the simplest XML version of the data I could come up
>> with and I'm sure different people will view this version both positively
>> (it is well formed and can be accessed with various tools) and negatively
>> (it doesn't go far enough).  I haven't "pulled in" the "details" data for
>> each dataset (e.g., http://www.data.gov/details/16) which includes category,
>> date released, etc. which could be used with faceted browsing, etc. but I
>> wanted to take a first stab at the data, get the conversation going, and
>> hopefully provide a file that others can use/manipulate/extend to provide
>> additional examples of best practice ideas.
>>
>> I definitely do not have the ANSWER, nor am I proposing that the XML file I
>> created is the ANSWER, but I think we likely need examples in order to
>> discuss the pros and cons of various approaches, and I thought having a
>> well-formed version of the data to create additional examples would make it
>> easier for other developers.
>>
>> Here are a couple of interesting links for further ideas.
>>
>> http://www.epsiplus.net/events/thematic_meetings/information_standards/standards_meeting_3/information_asset_registers_opsi_discussion_paper
>>
>> http://www.salas.com/2009/04/13/geeky-how-datarss-might-work/
>>
>> Thanks,
>>
>> Joe
>>
>>
>>
>> -----Original Message-----
>> From: public-egov-ig-request@w3.org [mailto:public-egov-ig-request@w3.org]
>> On Behalf Of Tumin, Zachary
>> Sent: Wednesday, May 27, 2009 7:59 AM
>> To: Jose M. Alonso; Owen Ambur
>> Cc: 'eGov IG'
>> Subject: RE: charter and publication wrt W3C Process
>>
>> For a standards discussion, this is riveting. As an IE I have had my eyes
>> opened to any number of important issues, not the least of which is the
>> challenge of making data.gov truly "open", what that means, and what it
>> still requires.
>>
>> Many thanks -
>>
>> Zach
>>
>> ============================================
>> Zachary Tumin
>> Executive Director
>> Leadership for a Networked World Program http://www.lnwprogram.org/
>> John F. Kennedy School of Government | Harvard University
>> 79 John F. Kennedy Street | Cambridge, MA |02138
>> voice: 617-495-3036 | fax: 617-495-8228 |
>>
>>
>> -----Original Message-----
>> From: public-egov-ig-request@w3.org [mailto:public-egov-ig-request@w3.org]
>> On Behalf Of Jose M. Alonso
>> Sent: Wednesday, May 27, 2009 7:00 AM
>> To: Owen Ambur
>> Cc: 'eGov IG'
>> Subject: Re: charter and publication wrt W3C Process
>>
>> On 20/05/2009, at 16:10, Owen Ambur wrote:
>>> While I wouldn't exactly call it a "small" document, I agree that the
>>> Web Accessibility Initiative's (WAI) Accessible Rich Internet
>>> Applications (ARIA) best practices are a good example of the kind of
>>> deliverable the eGov IG could produce that might actually be useful to
>>> stakeholders who are capable of using it.
>>> http://www.w3.org/TR/wai-aria-practices/#accessiblewidget
>>
>> Good example. I'm not sure we need to go down to code, but I would love
>> us to produce some of those "recipes" to help me go from Point A to
>> Point B in an OGD project.
>>
>>
>>> I also agree that a good topic of focus for the eGov IG would be open
>>> government data (OGD), such as:
>>>
>>> a) how agencies can make their data more readily discoverable and
>>> usable, and
>>>
>>> b) in turn, how stakeholders (including intermediary service
>>> providers) can measure and assess the degrees to which agencies have
>>> done so (recognizing that perfection is not the goal and progress
>>> generally occurs in many small steps).
>>>
>>> In the U.S. federal government, the Federal Enterprise Architecture
>>> (FEA) Data Reference Model (DRM) was supposed to serve that function.
>>> http://en.wikipedia.org/wiki/Federal_Enterprise_Architecture#Data_Reference_Model_.28DRM.29
>>> However, since agency DRM's themselves are not readily discoverable
>>> and usable, the FEA DRM as currently being "practiced" cannot
>>> possibly serve the function for which it was intended, at least not
>>> for external stakeholders (e.g., citizens).
>>>
>>> The draft XSD for the DRM, which would have made the DRM data itself
>>> "open" but was not finalized and implemented, is available at
>>> http://xml.gov/draft/drm20060105.xsd
>>>
>>> Other ways of viewing this potential initiative for the eGov IG are as:
>>>
>>> 1) an internationalized set of best practices for implementing
>>> President Obama's directive on transparency and open government, which
>>> is available in StratML format at http://xml.gov/stratml/DTOG.xml, and
>>>
>>> 2) providing practical proposals for prospective implementation in
>>> services like http://data.gov/
>>
>> I like this.
>>
>>> Of course, too, I believe it would be good to explicitly identify our
>>> stakeholders -- both performers (who are volunteering to do the
>>> work) as well as prospective beneficiaries, whom we should try to
>>> engage in providing feedback as well as eventually *using* our
>>> deliverable(s).
>>
>> +1
>>
>> -- Jose
>>
>>
>>>  Ideally, we
>>> would identify our stakeholders (together with our goals and
>>> objectives) in
>>> a readily shareable format like StratML and, thus, practice what we
>>> preach while demonstrating leadership by example.
>>>
>>> Owen
>>>
>>> -----Original Message-----
>>> From: public-egov-ig-request@w3.org [mailto:public-egov-ig-request@w3.org]
>>> On Behalf Of Jose M. Alonso
>>> Sent: Wednesday, May 20, 2009 7:50 AM
>>> To: Sharron Rush
>>> Cc: eGov IG
>>> Subject: Re: charter and publication wrt W3C Process
>>>
>>>> ...
>>>>> + a set of small docs with guidance?
>>>>>  (could be recs or not)
>>>>
>>>> I am not sure what these "small docs" would do that would not be
>>>> included in BP and the rewritten Note, but am open to suggestion.
>>>> Are you thinking of technical documents that would be more of a
>>>> how-to?  a series of case studies of particularly effective practices?
>>>
>>> I was thinking of small how-to like things, e.g. techniques to
>>> identify and expose OGD, but also identification of scenarios to do
>>> so. More how-to than case studies.
>>>
>>>> The suite of ARIA documents could be a model, I suppose.
>>>
>>> Maybe... I like this how-to piece:
>>> http://www.w3.org/TR/wai-aria-practices/#accessiblewidget
>>>
>>>> This one requires more consideration and could be decided after
>>>> being chartered, is that not so?  or do we need to state our entire
>>>> scope of work at the time of charter?
>>>
>>> As specific as possible is always welcome, but we can definitely leave
>>> some room as we did the first time. More on charters:
>>> http://www.w3.org/2005/10/Process-20051014/groups#WGCharter
>>>
>>>
>>>>> + a second version of the Note?
>>>>>  (no need to be a rec, as you know)
>>>>
>>>> Yes, the Note must be rewritten for coherence, narrative flow,
>>>> conclusions, etc.
>>>
>>> I have heard several people say this. I don't have an opinion yet,
>>> other than that this should be done if there are group members willing
>>> to take on this task.
>>>
>>>
>>>>> In summary: going normative is "stronger" but has more implications:
>>>>> patent policy matters, strongest coordination with other groups,
>>>>> more process-related stuff to deal with...
>>>>
>>>> If we are saying that we will produce normative standards and expect
>>>> eGov practitioners around the world to begin to claim "conformance"
>>>> to these standards,  that is a mighty undertaking.  Think of the
>>>> arduous processes around WCAG2 and HTML5.  Also, eGov is a bit less
>>>> easily defined because of cultural influences, history, forms of
>>>> government etc.  I would advise that we not commit to normative
>>>> output at this time, but as previously stated, happy to hear another
>>>> point of view.
>>>
>>> Ok, thanks. I think I'm more of a non-normative opinion so far.
>>>
>>>
>>>> Please let me know if this is the type of input needed and/or if I
>>>> have overlooked any questions.
>>>
>>> Very much so, thanks!
>>> If you have something more specific in mind about the content we
>>> should produce, please share it, too.
>>>
>>> Cheers,
>>> Jose.
>>>
>>>
>>>> Thanks,
>>>> Sharron
>>>>
>>>>> [1] http://www.w3.org/Consortium/Process/
>>>>> [2] http://www.w3.org/2005/10/Process-20051014/groups#GAGeneral
>>>>> [3] http://www.w3.org/2008/02/eGov/ig-charter
>>>>> [4] http://www.w3.org/2004/02/05-patentsummary
>>>>> [5] http://www.w3.org/2005/02/AboutW3CSlides/images/groupProcess.png
>>>>> [6] http://www.w3.org/2005/10/Process-20051014/tr#Reports
>>>>> [7] http://www.w3.org/Guide/Charter
>>>>> [8] http://www.w3.org/TR/mobile-bp/
>>>>>
>>>>> --
>>>>> Jose M. Alonso <josema@w3.org>    W3C/CTIC
>>>>> eGovernment Lead                  http://www.w3.org/2007/eGov/
>>>
>>>
>>>
>>>
>>>
>>
>>
>>
>>
Received on Sunday, 31 May 2009 21:47:33 GMT