
Re: Data "on" the Web vs Data "in" the Web

From: Steven Adler <adler1@us.ibm.com>
Date: Tue, 25 Mar 2014 14:06:40 -0400
To: "Phil Archer" <phila@w3.org>
Cc: "public-dwbp-wg" <public-dwbp-wg@w3.org>
Message-ID: <OF29058366.1DC43DAE-ON85257CA6.00637CCC@us.ibm.com>
I understand your point of view and you've written a passionate defense of RDF and Linked Data. It's not my intent to challenge your beliefs, but I would like to point out some facts and my interpretation.

1. According to IDC the RDF market was $10M in 2012 and is estimated to grow to $170M in 2017. That might sound like a lot, but in comparison to the overall database market of about $30B it's peanuts.

2.  There might be government implementations of RDF but there are no Open Data implementations. 

3.  Using this WG to sell RDF and Linked Data to Open Data customers who are not yet using it is not an Open Standards activity. It's a commercial activity.

4.  There are many other issues that Open Data creates around data quality and data comparability, with use cases and documented needs, that need standards.

All of the Above to me means including the RDF and Linked Data approaches with other approaches that solve problems for current implementations too. 

And. Also. 

Not or. 

Not only. 


----- Original Message -----
From: Phil Archer [phila@w3.org]
Sent: 03/25/2014 03:54 PM GMT
To: Steven Adler
Cc: public-dwbp-wg <public-dwbp-wg@w3.org>
Subject: Re: Data "on" the Web vs Data "in" the Web

Steve, I cannot let this e-mail of yours go unchallenged.

On 25/03/2014 14:57, Steven Adler wrote:
> Those are good comments.  The graph data market is pretty small today,
> with interest pretty evenly split between RDF and Property Graph.

Really? I'm very interested to know about uses of Property Graphs as 
there has been discussion about possibly standardising that at W3C. 
We're having difficulty getting a sufficiently large community of 
members together to do this.

> There are some things Graph databases can do, especially in social networking
> examples, that perform much better than traditional databases.
> But no one today is using RDF or Graph Data in any Open Data
> implementation and no one has any plans to do so.

This statement is grossly untrue. Off the top of my head examples of 
public sector use of Linked Data (open or not):

European Environment Agency
The BBC, New York Times etc.
The European Commission
UK Government
Deutsche Nationalbibliothek
Ordnance Survey
British Geological Survey
The Italian Government
The US DoH

Away from the public sector it's used extensively in finance, and health 
care and life sciences are making extensive and growing use of it, etc. 
Check out http://semanticweb.com/ for more news of the substantial 
and growing commercial use of the technology.

> Cities and State
> governments have limited budgets and resources and very limited skills.

True of course. The point of this WG is to show those people how to make 
the best of what limited resources they have in this regard. They should 
be able to benefit from the power of LD even if they don't necessarily 
use it as a core tech themselves or even know they're using it.

> And RDF and Graph are not the only ways to skin this cat...

No, but it is the best available method that can operate at Web scale 
and that maximises the benefits of the network effect.

> Does Data Quality need linked data vocabularies to offer value?

URIs as identifiers and LD vocabularies are how you share meaning across 
different datasets. Datasets that use external vocabularies are more 
valuable than ones that just use internal codes that are meaningless out 
of that specific context.
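
As a rough illustration of the point about shared identifiers, here is a minimal sketch. The two datasets, their internal codes and the example.org URI are all made up for the purpose. It only shows that records keyed by a common URI can be combined directly, while records keyed by each publisher's internal codes cannot be matched without extra knowledge.

```python
# Hypothetical data: two publishers describe the same region.
# With internal codes, nothing says that "R07" and "17" are the same thing.
internal_a = {"R07": {"population": 120000}}
internal_b = {"17": {"area_km2": 310}}

# With a shared URI as the identifier (an assumed example.org URI),
# both publishers point at the same thing.
REGION = "http://example.org/id/region/north"
shared_a = {REGION: {"population": 120000}}
shared_b = {REGION: {"area_km2": 310}}


def merge(left, right):
    """Combine records from two datasets that carry the same identifier."""
    merged = {}
    for key in left.keys() | right.keys():
        record = {}
        record.update(left.get(key, {}))
        record.update(right.get(key, {}))
        merged[key] = record
    return merged


merged_internal = merge(internal_a, internal_b)  # two unrelated records
merged_shared = merge(shared_a, shared_b)        # one combined record
```

With the internal codes the merge yields two separate records; with the shared URI it yields a single record holding both the population and the area.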

> standardized lineage and certification suffice?

Not sure what you mean by these.

> I can see the value of
> graph search for Data Comparability, but well defined metadata would also
> take Open Data to the next level.

That's what the CSV on the Web WG is about, for example - well defined 
metadata for tabular data that can, among other things, be used to 
generate LD from the table.
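
To make the idea of metadata-driven conversion concrete, the sketch below turns a small table into subject/predicate/object triples. It is not the CSV on the Web WG's format: the metadata keys (`subject_template`, `properties`), the sample rows and the example.org URIs are assumptions chosen only for illustration.

```python
import csv
import io

# Hypothetical tabular data as it might be published today.
CSV_TEXT = "code,name,population\nN1,Northtown,120000\nS2,Southville,45000\n"

# Hypothetical metadata published alongside the table. It says how to build
# a URI for each row and which URI names the meaning of each column.
METADATA = {
    "subject_template": "http://example.org/id/place/{code}",
    "properties": {
        "name": "http://example.org/def/name",
        "population": "http://example.org/def/population",
    },
}


def rows_to_triples(csv_text, metadata):
    """Return (subject, predicate, object) tuples for every described cell."""
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # The row's local code is rendered as a URI via the template.
        subject = metadata["subject_template"].format(**row)
        for column, predicate in metadata["properties"].items():
            triples.append((subject, predicate, row[column]))
    return triples


triples = rows_to_triples(CSV_TEXT, METADATA)
for triple in triples:
    print(triple)
```

The publisher keeps shipping a plain CSV file; only the small metadata description is new, and a consumer that wants Linked Data can derive it.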

> If we only recommend RDF and Linked
> Data as Best Practices for Data Publishing and only a small fraction of
> the market can use them, what good have we done?

See previous. It's about making the benefits of LD available, i.e. using 
the Web as a data platform, not just as a means for shifting PDFs and 
CSVs from one place to another. We want people to get the most from 
their efforts to bring efficiencies and transparency, as well as 
supporting the commercial knowledge economy.

> I want to make sure that the work we do has maximum impact and so far use
> case evidence does not convince me that RDF and Linked Data alone will get
> us there.

No one is suggesting that it alone will get us there. But everything we 
do should support the goal of making use of the power of the Web. That 
means making sure people use URIs as identifiers, or that they publish 
metadata such that their non-URI identifiers can be rendered as URIs; 
that relationships are defined etc.

The Editor's Draft that Augusto began this thread by pointing to is 
interesting [1]. That talks about how to put links into non-RDF and 
non-HTML formats. +1 to that.


[1] http://tools.ietf.org/html/draft-kelly-json-hal
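
For readers who have not looked at [1], here is a minimal sketch of what putting links into plain JSON can look like. It follows the general shape used in the HAL draft (a `_links` member whose entries carry an `href`), but the record, the URLs and the `region` relation name are invented for illustration, and the helper `with_links` is hypothetical.

```python
import json


def with_links(record, self_href, related):
    """Return a copy of a plain JSON record with HAL-style links added."""
    doc = dict(record)
    links = {"self": {"href": self_href}}
    for rel, href in related.items():
        links[rel] = {"href": href}
    doc["_links"] = links
    return doc


# Hypothetical record and URLs.
doc = with_links(
    {"name": "Northtown", "population": 120000},
    "http://example.org/places/N1",
    {"region": "http://example.org/regions/north"},
)

print(json.dumps(doc, indent=2))
```

The original fields are untouched, so a client that ignores `_links` still reads the data as before, while a client that understands the links can follow them to related resources.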

> From:
> Christophe Guéret <christophe.gueret@dans.knaw.nl>
> To:
> Steven Adler/Somers/IBM@IBMUS
> Cc:
> Christophe Gueret <christophe.gueret@dans.knaw.nl>, Augusto Herrmann
> <augusto.herrmann@gmail.com>, "hellmatic@gmail.com" <hellmatic@gmail.com>,
> public-dwbp-wg <public-dwbp-wg@w3.org>
> Date:
> 03/25/2014 07:31 AM
> Subject:
> Re: Data "on" the Web vs Data "in" the Web
> Hoi Steve,
> Last year Facebook announced its graph search function, choosing the power
> of semantic search without RDF.  What I have learned from this WG
> experience so far is that W3C doesn't really create open standards. It
> creates and enhances and promotes W3C standards.
> I have some difficulty following you on that one; aren't W3C standards
> open?
> The rest of the world often thanks W3C for its ideas and then implements
> those ideas in different ways.
> I thought this was rather common in industry. People copy each other and
> spend time re-branding the same ideas, also probably to go around patents
> while re-using things that are indeed good ideas. E.g. "retina display"
> versus "hd screen", "facetime" VS "hangout" VS "videoconference", "like"
> VS "+1", google's graph VS facebook's graph, etc ...
> Can we, this WG, imagine creating or recommending standards that are
> objective - that describe things to do that anyone can do with or without
> RDF?
> We'll see... :)
> Regards,
> Christophe
> Regards,
> Steve
>    From: Christophe Guéret [christophe.gueret@dans.knaw.nl]
>    Sent: 03/24/2014 04:01 PM CET
>    To: Steven Adler
>    Cc: Christophe Gueret <christophe.gueret@dans.knaw.nl>; Augusto Herrmann
> <augusto.herrmann@gmail.com>; "hellmatic@gmail.com" <hellmatic@gmail.com>;
> public-dwbp-wg <public-dwbp-wg@w3.org>
>    Subject: Re: Data "on" the Web vs Data "in" the Web
> So now we are creating W3C standards for publishing data as unstructured
> text on websites?
> The message of this presentation is actually quite the opposite ;-)
> Instead the idea is to use the Web as platform to host the data. That is,
> instead of publishing datasets as resources use URIs and HTTP to gain
> access to specific (structured !) elements of data sets which can be
> linked and re-used. There has to be a structure and there has to be links
> possibility but this does not mean that RDF is the only model that will
> work out and that RDF/XML is the only way to serialise data.
> Is that what's in the charter?  Honestly I have always found the charter
> to be confusing. Maybe it was intended to be machine readable. ;-
> :-)
> Cheers,
> Christophe
> Regards,
> Steve
>    From: Christophe Guéret [christophe.gueret@dans.knaw.nl]
>    Sent: 03/24/2014 03:42 PM CET
>    To: Steven Adler
>    Cc: Augusto Herrmann <augusto.herrmann@gmail.com>; "hellmatic@gmail.com"
> <hellmatic@gmail.com>; DWBP Public List <public-dwbp-wg@w3.org>
>    Subject: Re: Data "on" the Web vs Data "in" the Web
> Hoi,
> I think this (semantic !) discussion around data "on" and "in" can be a
> good way to let people see the difference between putting a link to a
> resource which is a data set dump ("on") and providing some kind of API
> ("in") - whatever the technologies of the API are. Lately, I've been using
> that argument to point people to the fact that downloading dumps of data
> in various forms is like doing document sharing prior to the Web. That
> leads to the conclusion that we should publish our data as Web sites. There
> is a bit of a focus set on SemWeb technologies for that but, really, we
> could think of many other ways to reach the same result. Here are the
> slides, comments are most welcome ;-) :
> http://www.slideshare.net/cgueret/linking-knowledge-spaces
> Cheers,
> Christophe
> On 20 March 2014 15:17, Steven Adler <adler1@us.ibm.com> wrote:
> Augusto,
> I am interested in learning about HAL and look forward to this discussion.
>   But I am a bit concerned with the way you phrase these sentences:
> "There should be a way to at first publish open data resources that are
> linked, but without rdf, such as in xml and json. Then, at a later date,
> improve with a descriptive rdf vocabulary and expressed in rdf to become
> linked open data (preferably, if possible, keeping compatibility with
> clients that implemented reading the previous non-semantic version)."
> To me this reads that non-rdf methods like xml and json are accommodations
> to constituents who "have not yet seen the light of RDF" and I want to
> make sure we are providing best practices standards recommendations to the
> world that exists rather than the "perfect world" we would like someday to
> exist.
> At IBM, we make software that runs on many operating systems.  Of course
> we employ people with preferences for OSX, Linux, System z, AIX, Unix, and
> even Windows.  Heck, many ATMs around the world still run on OS/2...
> But because our customers run all of the above we supply them with all of
> the above solutions.
> Can we agree on an "all of the above" approach to DWBP (without suggesting
> that everything someday becomes RDF) too?
> Best Regards,
> Steve
> Motto: "Do First, Think, Do it Again"
> From:
> Augusto Herrmann <augusto.herrmann@gmail.com>
> To:
> DWBP Public List <public-dwbp-wg@w3.org>
> Date:
> 03/19/2014 01:16 PM
> Subject:
> Re: Data "on" the Web vs Data "in" the Web
> Hi,
> this is a very important point, Ig. My thoughts exactly when I suggested
> we look at the Hypertext Application Language (HAL) proposal [1] in the
> first meeting. It was in fact an invitation for us to think about data
> "in" the web, as in "part of the web itself". We don't necessarily have to
> follow HAL, but should look at it as a source of inspiration. The way
> links are represented in resources in Subbu Allamaraju's RESTful
> Webservices Cookbook [2] is another source of inspiration.
> We should think of standard ways to insert links to other data into many
> common open data formats, such as xml, json and maybe even csv. Of course
> this linking requirement is satisfied by linked open data and rdf, but
> sometimes organizations have some data and are willing to publish, but
> initially do not have the necessary resources (i.e. people, knowledge) to
> develop vocabularies to describe the data. However, interlinking among
> resources of a dataset, or even linking to resources in other datasets is
> somewhat easier to do. There should be a way to at first publish open data
> resources that are linked, but without rdf, such as in xml and json. Then,
> at a later date, improve with a descriptive rdf vocabulary and expressed
> in rdf to become linked open data (preferably, if possible, keeping
> compatibility with clients that implemented reading the previous
> non-semantic version).
> Perhaps this could become a use case for the Best Practices document.
> [1] http://tools.ietf.org/html/draft-kelly-json-hal
> [2] http://books.google.com.br/books?id=LDuzpQlVuG4C
> All the best,
> Augusto Herrmann
> Open Data Team - Ministry of Planning - Brazil
> On Mon, Mar 17, 2014 at 9:28 AM, Ig Ibert Bittencourt <ig.ibert@gmail.com>
> wrote:
> Hello DWBP,
> I was reading again about 5 Star Open Data and I saw the
> statement below about 3-star Web data [1], which I think would be
> interesting to share with this WG.
> Excellent! The data is not only available via the Web but now everyone can
> use the data easily. On the other hand, it's still data on the Web and not
> data in the Web.
> With regard to this statement, you can see more details in [2] and [3],
> but not that much.
> [1] http://5stardata.info/
> [2] http://webofdata.wordpress.com/2010/03/01/data-and-the-web-choices/
> [3] http://lists.xml.org/archives/xml-dev/200211/msg01290.html
> Best,
> Ig Ibert Bittencourt
> Professor Adjunto III - Universidade Federal de Alagoas (UFAL)
> Vice-Coordenador da Comissão Especial de Informática na Educação
> Líder do Centro de Excelência em Tecnologias Sociais
> Co-fundador da Startup MeuTutor Soluções Educacionais LTDA.


Phil Archer
W3C Data Activity Lead

+44 (0)7887 767755
Received on Tuesday, 25 March 2014 18:07:11 UTC
