Linked Data Dog-Catcher Scenario

From: Sandro Hawke <sandro@w3.org>
Date: Thu, 25 Feb 2010 23:07:48 -0500
To: "Cory Casanave" <cory-c@modeldriven.com>
cc: "W3C e-Gov IG" <public-egov-ig@w3.org>
Message-ID: <4978.1267157268@waldron>

> We have certainly run into almost all of these questions.  

Should we use them as a starting point for a FAQ and see where it goes?

> One of the big ones I run into is the confusion over RDF vs.
> schema-based XML - as these both come from W3C and can be used for
> overlapping purposes, users seem unclear why there are two standard
> "stacks", complete with their own schema, query language, etc.  After
> all, many have only just warmed up to XML!

Indeed.  I think I can share the fact that the W3C staff has had this
...  debate...  from time to time.  Each stack (XML and RDF) has a
large, growing, dedicated set of users.  We can't just abandon either
technology.  Some of us spent a two-day retreat once (summer of '08)
trying to come up with a bit of guidance to users, but we could not
reach any useful agreement.  It was very disappointing, but perhaps not
surprising.

There are some technical bridges: GRDDL, and RDF-influenced styles of
writing XML, that let you use both technology stacks from the same data,
but I've seen no interest from industry in supporting this kind of
standards work.  I've invested time, as have other individuals from time
to time, but it never seems to get much past the hobby level.
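
To make the bridging idea concrete, here is a rough sketch of pulling
RDF-style triples out of plain XML.  It is not GRDDL itself (GRDDL has the
XML document point at a transformation, typically XSLT); it is just a
hand-written mapping in Python, and the XML layout, the example.org
namespace, and the function name are all made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML layout and namespace, made up for illustration.
BASE = "http://example.org/strays/"
XML_DOC = """<incidents>
  <incident id="i1"><species>dog</species><shelter>East Side Shelter</shelter></incident>
  <incident id="i2"><species>wolf</species><shelter>North Pound</shelter></incident>
</incidents>"""


def xml_to_triples(xml_text):
    """Map each child of an <incident> to one (subject, predicate, object) triple."""
    triples = []
    for incident in ET.fromstring(xml_text).findall("incident"):
        # The id attribute becomes part of the subject URI.
        subject = BASE + "incident/" + incident.get("id")
        for child in incident:
            # The element name becomes the predicate, its text the object.
            triples.append((subject, BASE + child.tag, child.text))
    return triples


for triple in xml_to_triples(XML_DOC):
    print(triple)
```

The same XML stays usable by XML tools; the triples are simply a second
view of it, which is the general shape of what these bridges aim for.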

> The position we are taking is that RDF is the web data model (period) -
> publish data in other ways, but always have RDF as the normal form so
> tools can interoperate across data sets without a lot of unnecessary
> complexity.

"We" is modeldriven.com, yes?  Yeah, I think that's an excellent
approach.

> But, it then has to be very clear how this data is identified, linked,
> queried, browsed, etc.  The provenance has to be rock-solid.  As some of
> these things are not fully cooked, it makes adoption hard - the
> supporting community needs to complete this picture.  Perhaps we just
> need to say - for gov data, this is how it should be done and provide
> some open implementations that do it that way.

I wonder if we can identify a few use cases for linked data and write
down one or more recipes for satisfying each, maybe even with
testimonials from folks who have used the given solutions....

For example (this is rough, off the top of my head, and it's been a long
day):

  USE CASE: Allow for easy public reuse of data gathered and
  created in authoring reports.

  SCENARIO: Every year, Bob's department publishes a report of all the
  incidents involving stray animals in South Buckyvania.  The various
  police and animal control departments and animal shelters send him
  reports, sometimes on paper, sometimes in spreadsheets.  They're
  supposed to send him data every quarter, but in practice, near the end
  of year, he has to chase a lot of them down.  Sometimes he is not able
  to get data for some years.  Sometimes he settles for them telling him
  a few numbers over the phone.  When he gets what data he can, he puts
  it all in his own spreadsheet, crunches numbers for a while, and
  produces a 10-page summary for the Ministry of Education, which (for
  historical reasons) provides most of the funding for animal control.

  One day, Bob's manager tells him they need to open up his data, and
  publish it in RDF.  Bob is, at first, quite reluctant.  He doesn't
  want to embarrass the shelters who are too busy rescuing animals to
  send him their data on time.  He is concerned that the occasional
  sightings of wolves, detailed in his reports, will be used to drum up
  hysteria.  (He knows they pose less danger than rabid squirrels, but
  he's talked to enough people to know some folks won't see it that way.)
  But most of all, he has no idea what RDF is, and really has no
  interest in learning about it.  He loves animals, not computers.

  SOLUTION 1: Purchase a product which publishes RDF from his
  spreadsheet software.

  Known software for this solution:
     Anzo, from http://www.cambridgesemantics.com/

  [These products have not been evaluated for this purpose; they have
  merely been suggested by someone.  If you know of software which is
  applicable here, please contact some_list@w3.org.]
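
  For a sense of what such a product has to do at minimum, here is a
  small sketch that turns spreadsheet rows (exported as CSV) into
  N-Triples using only the Python standard library.  It is not how Anzo
  or any other product works; the column names, the example.org
  namespace, and the function names are assumptions made up for this
  illustration, and a real deployment would use an agreed vocabulary.

```python
import csv
import io

# Hypothetical namespace and column names; nothing here comes from a real
# vocabulary or from any product named above.
BASE = "http://example.org/strays/"
XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"


def escape_literal(text):
    """Escape the characters N-Triples requires inside a quoted literal."""
    return (text.replace("\\", "\\\\")
                .replace('"', '\\"')
                .replace("\n", "\\n")
                .replace("\r", "\\r"))


def rows_to_ntriples(csv_text):
    """Turn each spreadsheet row into three N-Triples statements."""
    lines = []
    for i, row in enumerate(csv.DictReader(io.StringIO(csv_text)), start=1):
        # One subject URI per row, numbered in spreadsheet order.
        subject = f"<{BASE}incident/{i}>"
        lines.append(f'{subject} <{BASE}shelter> "{escape_literal(row["shelter"])}" .')
        lines.append(f'{subject} <{BASE}species> "{escape_literal(row["species"])}" .')
        count = int(row["count"])  # raises ValueError on a non-numeric cell
        lines.append(f'{subject} <{BASE}count> "{count}"^^<{XSD_INTEGER}> .')
    return "\n".join(lines)


SAMPLE = "shelter,species,count\nEast Side Shelter,dog,12\nNorth Pound,wolf,1\n"
print(rows_to_ntriples(SAMPLE))
```

  Bob would never see this; the point is only that the mechanical step
  from spreadsheet to RDF is small compared with choosing the vocabulary
  and settling his non-technical concerns.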

  @@@ fill in details about how he discovers or develops ontologies

  @@@ fill in something about his non-technical concerns

  @@@ discuss the potential wins; eventually, maybe the shelters can
  update some system in real time, users can be notified if a stray
  animal is seen matching the kind of animal they have, or want.
  Statistics about breeds and trends across larger areas can be
  gathered, etc, etc.

> The other side is non-technical, governments need to STOP ACCEPTING
> DATA in non-standard and unstructured formats.  If the FAR required
> all data to be delivered in RDF, publishing it would be much less of
> an issue.  My 2c.

Oh, great, now we're going to have politicians running on a platform of
which technology they'll force everyone to use.  :-) :-)

     -- Sandro


> -Cory
> 
> -----Original Message-----
> From: Sandro Hawke [mailto:sandro@w3.org]
> Sent: Wednesday, February 24, 2010 10:04 PM
> To: Cory Casanave
> Cc: W3C e-Gov IG
> Subject: Re: [LD-Outreach] Meeting Reminder
> 
> 
> > The phone meeting for LD-Outreach will be Thursday @ 10AM EDT.
> >
> > Topic... : what are the pressing technical issues for government
> > linked data, and what guidance can we provide?
> 
> Seriously.  What do you need to know?  What do you think others need to
> know?   Especially, what can't be found from existing sources?
> 
> If you can't come to the meeting (and even if you can), spend a few
> minutes and send e-mail right now.
> 
> Some strawman ideas that come into my head:
> 
>  - Which tools can I rely on to build my systems?  Which are production
>    quality and here for the long term?
>  - Do we really need to understand the RDF Semantics?
>  - What about the RDF/XML syntax.... Can we just use Turtle?
>  - Can we use XSLT?
>  - Do we have to use OWL?
>  - Does SPARQL scale?
>  - What are quads good for?
>  - Is there a good way to get RDF out of our SQL database?
>  - What should our publication URLs look like?
>  - How do we make sure those URLs will be around, long term?
>  - Who will mint identifiers for things like other agencies, or
>    geographic locations, which we need to refer to in our data?
>  - How do we represent numbers with units (physical measures, dollar
>    amounts)?
> 
> etc, etc.  :-)
> 
> I can go on like this forever, but I don't know which questions actually
> matter to the folks doing this for a living.  And many of these
> questions -- maybe all of them -- are in no way government specific, so
> they're probably out of scope for us.
> 
>    -- Sandro
> 
> 
> 
> 
Received on Friday, 26 February 2010 04:07:50 GMT