Re: Data "on" the Web vs Data "in" the Web from Bart van Leeuwen on 2014-03-25 (public-dwbp-wg@w3.org from March 2014)

From: Bart van Leeuwen <bart_van_leeuwen@netage.nl>
Date: Tue, 25 Mar 2014 17:17:31 +0100
To: public-dwbp-wg <public-dwbp-wg@w3.org>
Message-ID: <OF395E2569.F8EEABE2-ONC1257CA6.00597939-C1257CA6.00597EBF@netage.nl>
+1

Phil Archer <phila@w3.org> wrote on 25-03-2014 16:54:31:

> From: Phil Archer <phila@w3.org>
> To: Steven Adler <adler1@us.ibm.com>
> Cc: public-dwbp-wg <public-dwbp-wg@w3.org>
> Date: 25-03-2014 16:54
> Subject: Re: Data "on" the Web vs Data "in" the Web
> 
> Steve, I cannot let this e-mail of yours go unchallenged.
> 
> On 25/03/2014 14:57, Steven Adler wrote:
> > Those are good comments.  The graph data market is pretty small today,
> > with interest pretty evenly split between RDF and Property Graph.
> 
> Really? I'm very interested to know about uses of Property Graphs as 
> there has been discussion about possibly standardising that at W3C. 
> We're having difficulty getting a sufficiently large community of 
> members together to do this.
> 
>    There
> > are some things Graph databases can do, especially in social networking
> > examples, that perform much better than traditional databases.
> >
> > But no one today is using RDF or Graph Data in any Open Data
> > implementation and no one has any plans to do so.
> 
> 
> This statement is grossly untrue. Off the top of my head examples of 
> public sector use of Linked Data (open or not):
> 
> UN FAO
> European Environment Agency
> The BBC, New York Times etc.
> The European Commission
> UK Government
> Deutsche Nationalbibliothek
> Ordnance Survey
> British Geological Survey
> The Italian Government
> The US EPA
> The US DoH
> 
> Away from the public sector it's used extensively in finance, and health
> care and life sciences are making extensive and growing use of it.
> Check out http://semanticweb.com/ for more news of the substantial
> and growing commercial use of the technology.
> 
>    Cities and State
> > governments have limited budgets and resources and very limited skills.
> 
> True of course. The point of this WG is to show those people how to make
> the best of what limited resources they have in this regard. They should
> be able to benefit from the power of LD even if they don't necessarily
> use it as a core tech themselves or even know they're using it.
> 
> >
> > And RDF and Graph are not the only ways to skin this cat...
> 
> No, but it is the best available method that can operate at Web scale 
> and that maximises the benefits of the network effect.
> 
> >
> > Does Data Quality need linked data vocabularies to offer value?
> 
> URIs as identifiers and LD vocabularies are how you share meaning across
> different datasets. Datasets that use external vocabularies are more
> valuable than ones that just use internal codes that are meaningless out
> of that specific context.
> 
>   Wouldn't
> > standardized lineage and certification suffice?
> 
> Not sure what you mean by these.
> 
>    I can see the value of
> > graph search for Data Comparability, but well defined metadata would also
> > take Open Data to the next level.
> 
> That's what the CSV on the Web WG is about, for example - well defined 
> metadata for tabular data that can, among other things, be used to 
> generate LD from the table.
> 
> 
>    If we only recommend RDF and Linked
> > Data as Best Practices for Data Publishing and only a small fraction of
> > the market can use them, what good have we done?
> 
> See previous. It's about making the benefits of LD available, i.e. using
> the Web as a data platform, not just a means for shifting PDFs and CSVs
> from one place to another. We want people to get the most from their
> efforts to bring efficiencies and transparency, as well as supporting the
> commercial knowledge economy.
> 
> >
> > I want to make sure that the work we do has maximum impact and so far use
> > case evidence does not convince me that RDF and Linked Data alone will get
> > us there.
> 
> No one is suggesting that it alone will get us there. But everything we 
> do should support the goal of making use of the power of the Web. That 
> means making sure people use URIs as identifiers, or that they publish 
> metadata such that their non-URI identifiers can be rendered as URIs; 
> that relationships are defined etc.
> 
> The Editor's Draft that Augusto began this thread by pointing to is 
> interesting [1]. That talks about how to put links into non-RDF and 
> non-HTML formats. +1 to that.
> 
> Phil.
> 
> [1] http://tools.ietf.org/html/draft-kelly-json-hal
> 
> 
> >
> >
> > From:
> > Christophe Guéret <christophe.gueret@dans.knaw.nl>
> > To:
> > Steven Adler/Somers/IBM@IBMUS
> > Cc:
> > Christophe Gueret <christophe.gueret@dans.knaw.nl>, Augusto Herrmann
> > <augusto.herrmann@gmail.com>, "hellmatic@gmail.com" <hellmatic@gmail.com>,
> > public-dwbp-wg <public-dwbp-wg@w3.org>
> > Date:
> > 03/25/2014 07:31 AM
> > Subject:
> > Re: Data "on" the Web vs Data "in" the Web
> >
> >
> >
> > Hi Steve,
> > Last year Facebook announced its graph search function, choosing the power
> > of semantic search without RDF.  What I have learned from this WG
> > experience so far is that W3C doesn't really create open standards. It
> > creates and enhances and promotes W3C standards.
> > I have some difficulty following you on that one: aren't W3C standards
> > open?
> > The rest of the world often thanks W3C for its ideas and then implements
> > those ideas in different ways.
> > I thought this was rather common in industry. People copy each other and
> > spend time re-branding the same ideas, also probably to get around patents
> > while re-using things that are indeed good ideas. E.g. "retina display"
> > versus "hd screen", "facetime" vs "hangout" vs "videoconference", "like"
> > vs "+1", Google's graph vs Facebook's graph, etc.
> > Can we, this WG, imagine creating or recommending standards that are
> > objective - that describe things to do that anyone can do with or without
> > RDF?
> > We'll see... :)
> >
> > Regards,
> > Christophe
> >
> > Regards,
> >
> > Steve
> >
> >    From: Christophe Guéret [christophe.gueret@dans.knaw.nl]
> >    Sent: 03/24/2014 04:01 PM CET
> >    To: Steven Adler
> >    Cc: Christophe Gueret <christophe.gueret@dans.knaw.nl>; Augusto Herrmann
> > <augusto.herrmann@gmail.com>; "hellmatic@gmail.com" <hellmatic@gmail.com>;
> > public-dwbp-wg <public-dwbp-wg@w3.org>
> >
> >    Subject: Re: Data "on" the Web vs Data "in" the Web
> >
> >
> > So now we are creating W3C standards for publishing data as unstructured
> > text on websites?
> > The message of this presentation is actually quite the opposite ;-)
> > Instead the idea is to use the Web as a platform to host the data. That is,
> > instead of publishing datasets as resources, use URIs and HTTP to gain
> > access to specific (structured!) elements of data sets which can be
> > linked and re-used. There has to be a structure and there has to be the
> > possibility of links, but this does not mean that RDF is the only model
> > that will work out and that RDF/XML is the only way to serialise data.
> >
> > Is that what's in the charter?  Honestly I have always found the charter
> > to be confusing. Maybe it was intended to be machine readable. ;-)
> >
> > :-)
> >
> > Cheers,
> > Christophe
> >
> >
> >
> > Regards,
> >
> > Steve
> >
> >    From: Christophe Guéret [christophe.gueret@dans.knaw.nl]
> >    Sent: 03/24/2014 03:42 PM CET
> >    To: Steven Adler
> >    Cc: Augusto Herrmann <augusto.herrmann@gmail.com>; "hellmatic@gmail.com"
> > <hellmatic@gmail.com>; DWBP Public List <public-dwbp-wg@w3.org>
> >
> >    Subject: Re: Data "on" the Web vs Data "in" the Web
> >
> > Hi,
> >
> > I think this (semantic!) discussion around data "on" and "in" can be a
> > good way to let people see the difference between putting a link to a
> > resource which is a data set dump ("on") and providing some kind of API
> > ("in") - whatever the technologies of the API are. Lately, I've been using
> > that argument to point people to the fact that downloading dumps of data
> > in various forms is like doing document sharing prior to the Web, which
> > then leads to the conclusion that we should publish our data as Web sites.
> > There is a bit of a focus on SemWeb technologies for that but, really, we
> > could think of many other ways to reach the same result. Here are the
> > slides, comments are most welcome ;-) :
> > http://www.slideshare.net/cgueret/linking-knowledge-spaces
> >
> > Cheers,
> > Christophe
> >
> >
> >
> > On 20 March 2014 15:17, Steven Adler <adler1@us.ibm.com> wrote:
> > Augusto,
> >
> > I am interested in learning about HAL and look forward to this discussion.
> >   But I am a bit concerned with the way you phrase these sentences:
> >
> > "There should be a way to at first publish open data resources that are
> > linked, but without rdf, such as in xml and json. Then, at a later date,
> > improve with a descriptive rdf vocabulary and expressed in rdf to become
> > linked open data (preferably, if possible, keeping compatibility with
> > clients that implemented reading the previous non-semantic version)."
> >
> > To me this reads that non-rdf methods like xml and json are accommodations
> > to constituents who "have not yet seen the light of RDF" and I want to
> > make sure we are providing best practices standards recommendations to the
> > world that exists rather than the "perfect world" we would like someday to
> > exist.
> >
> > At IBM, we make software that runs on many operating systems.  Of course
> > we employ people with preferences for OSX, Linux, System z, AIX, Unix, and
> > even Windows.  Heck, many ATMs around the world still run on OS/2...
> >
> > But because our customers run all of the above we supply them with all of
> > the above solutions.
> >
> > Can we agree on an "all of the above" approach to DWBP (without suggesting
> > that everything someday becomes RDF) too?
> >
> > Best Regards,
> >
> > Steve
> >
> > Motto: "Do First, Think, Do it Again"
> >
> >
> > From:
> > Augusto Herrmann <augusto.herrmann@gmail.com>
> > To:
> > DWBP Public List <public-dwbp-wg@w3.org>
> > Date:
> > 03/19/2014 01:16 PM
> > Subject:
> > Re: Data "on" the Web vs Data "in" the Web
> >
> >
> >
> >
> >
> > Hi,
> >
> > this is a very important point, Ig. My thoughts exactly when I suggested
> > we look at the Hypertext Application Language (HAL) proposal [1] in the
> > first meeting. It was in fact an invitation for us to think about data
> > "in" the web, as in "part of the web itself". We don't necessarily have to
> > follow HAL, but should look at it as a source of inspiration. The way
> > links are represented in resources in Subbu Allamaraju's RESTful
> > Web Services Cookbook [2] is another source of inspiration.
> >
> > We should think of standard ways to insert links to other data into many
> > common open data formats, such as xml, json and maybe even csv. Of course
> > this linking requirement is satisfied by linked open data and rdf, but
> > sometimes organizations have some data and are willing to publish, but
> > initially do not have the necessary resources (i.e. people, knowledge) to
> > develop vocabularies to describe the data. However, interlinking among
> > resources of a dataset, or even linking to resources in other datasets is
> > somewhat easier to do. There should be a way to at first publish open data
> > resources that are linked, but without rdf, such as in xml and json. Then,
> > at a later date, improve with a descriptive rdf vocabulary and expressed
> > in rdf to become linked open data (preferably, if possible, keeping
> > compatibility with clients that implemented reading the previous
> > non-semantic version).
> >
> > Perhaps this could become a use case for the Best Practices document.
> >
> > [1] http://tools.ietf.org/html/draft-kelly-json-hal
> > [2] http://books.google.com.br/books?id=LDuzpQlVuG4C
> >
> > All the best,
> > Augusto Herrmann
> > Open Data Team - Ministry of Planning - Brazil
> >
> >
> > On Mon, Mar 17, 2014 at 9:28 AM, Ig Ibert Bittencourt <ig.ibert@gmail.com>
> > wrote:
> > Hello DWBP,
> >
> > I was reading again about 5 Star Open Data and I saw the statement
> > below about 3-star Web data [1], which I think would be
> > interesting to share with this WG.
> >
> > Excellent! The data is not only available via the Web but now everyone can
> > use the data easily. On the other hand, it's still data on the Web and not
> > data in the Web.
> >
> >
> > Regarding this statement, you can find more details in [2] and [3],
> > but not that much.
> >
> >
> > [1] http://5stardata.info/
> > [2] http://webofdata.wordpress.com/2010/03/01/data-and-the-web-choices/
> > [3] http://lists.xml.org/archives/xml-dev/200211/msg01290.html
> >
> >
> > Best,
> >
> > Ig Ibert Bittencourt
> > Adjunct Professor III - Universidade Federal de Alagoas (UFAL)
> > Vice-Coordinator of the Special Committee on Informatics in Education
> > Leader of the Center of Excellence in Social Technologies
> > Co-founder of the startup MeuTutor Soluções Educacionais LTDA.
> >
> >
> >
> >
> >
> 
> -- 
> 
> 
> Phil Archer
> W3C Data Activity Lead
> http://www.w3.org/2013/data/
> 
> http://philarcher.org
> +44 (0)7887 767755
> @philarcher1
>
Received on Tuesday, 25 March 2014 16:18:04 UTC
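
As a rough illustration of the point raised in the thread about putting links into non-RDF formats such as JSON, the sketch below builds a record in the style of the HAL draft cited as [1], where a reserved `_links` property maps link relations to objects carrying an `href`. The record, the example.org URIs and the helper names (`make_record`, `extract_hrefs`) are hypothetical and invented for illustration only; none of this comes from a participant in the thread.

```python
import json


def make_record(record_id, name, dataset_uri, related=None):
    """Build a JSON-serialisable dict with HAL-style `_links`.

    All URIs are derived from `dataset_uri`, a placeholder supplied by the
    caller; nothing here refers to a real dataset.
    """
    links = {
        "self": {"href": f"{dataset_uri}/records/{record_id}"},
        "collection": {"href": dataset_uri},
    }
    if related:
        # A link relation may map to a list of link objects.
        links["related"] = [{"href": uri} for uri in related]
    return {"id": record_id, "name": name, "_links": links}


def extract_hrefs(document):
    """Collect every href found under `_links`, in insertion order."""
    hrefs = []
    for target in document.get("_links", {}).values():
        targets = target if isinstance(target, list) else [target]
        hrefs.extend(item["href"] for item in targets)
    return hrefs


if __name__ == "__main__":
    record = make_record(
        7,
        "Station 7",
        "http://example.org/stations",
        related=["http://example.org/regions/north"],
    )
    print(json.dumps(record, indent=2))
    print(extract_hrefs(record))
```

A client that understands nothing about the record's vocabulary can still follow the links, which is roughly the "linked first, described later" path suggested in the thread; a later RDF version could presumably reuse the same URIs as identifiers.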