Re: Data "on" the Web vs Data "in" the Web from Timothy Lebo on 2014-03-25 (public-dwbp-wg@w3.org from March 2014)

From: Timothy Lebo <lebot@rpi.edu>
Date: Tue, 25 Mar 2014 14:18:32 -0400
To: Steven Adler <adler1@us.ibm.com>
Cc: Phil Archer <phila@w3.org>, public-dwbp-wg <public-dwbp-wg@w3.org>
Message-Id: <132B0624-DBC3-4E6A-95FA-1F53F06D78FA@rpi.edu>
Hi, Steve,

I’m not an official member of this WG, but am watching from the sidelines, licking my wounds from previous WG standardization activities :-)

I’d like to respond to your #3 below.

On Mar 25, 2014, at 2:06 PM, Steven Adler <adler1@us.ibm.com> wrote:

> I understand your point of view and you've written a passionate defense of RDF and Linked Data. It's not my intent to challenge your beliefs, but I would like to point out some facts and my interpretation.
> 
> 1. According to IDC the RDF market was $10M in 2012 and is estimated to grow to $170M in 2017. That might sound like a lot but in comparison to the overall database market of about $30B it's peanuts.
> 
> 2.  There might be government implementations of RDF but there are no Open Data implementations.
> 
> 3.  Using this WG to sell RDF and Linked Data to Open Data customers who are not yet using it is not an Open Standards activity. It's a commercial activity.


I would think that it *is* the purpose of the World Wide Web Consortium to “sell” the approaches that it deems to be most worthwhile.
The W3C has spent significant effort developing a particular flavor of information technologies that are well suited to the Web.
RDF and Linked Data are the W3C’s version of open data standards.
W3C doesn’t sell for money, it sells for concept. If its “customers” aren’t buying, then they’re free to go adopt non-Web standards.
But it certainly isn’t the W3C’s job to provide any and all information technology to any and all, only those centered on the Web and to those who wish to be on the Web.

There are plenty of other standards bodies. They can sell their perspective, too.

:: scurries back to the bleachers ::

Regards,
Tim Lebo



> 
> 4.  There are many other issues Open Data creates around data quality and data comparability that have use cases and documented needs, and that need standards.
> 
> All of the Above to me means including the RDF and Linked Data approaches with other approaches that solve problems for current implementations too.
> 
> And. Also.
> 
> Not or.
> 
> Not only.
> 
> Regards,
> 
> Steve
> 
> 
> ----- Original Message -----
> From: Phil Archer [phila@w3.org]
> Sent: 03/25/2014 03:54 PM GMT
> To: Steven Adler
> Cc: public-dwbp-wg <public-dwbp-wg@w3.org>
> Subject: Re: Data "on" the Web vs Data "in" the Web
> 
> 
> 
> Steve, I cannot let this e-mail of yours go unchallenged.
> 
> On 25/03/2014 14:57, Steven Adler wrote:
>> Those are good comments.  The graph data market is pretty small today,
>> with interest pretty evenly split between RDF and Property Graph.
> 
> Really? I'm very interested to know about uses of Property Graphs as
> there has been discussion about possibly standardising that at W3C.
> We're having difficulty getting a sufficiently large community of
> members together to do this.
> 
>  There
>> are some things Graph databases can do, especially in social networking
>> examples, that perform much better than traditional databases.
>> 
>> But no one today is using RDF or Graph Data in any Open Data
>> implementation and no one has any plans to do so.
> 
> 
> This statement is grossly untrue. Off the top of my head examples of
> public sector use of Linked Data (open or not):
> 
> UN FAO
> European Environment Agency
> The BBC, New York Times etc.
> The European Commission
> UK Government
> Deutsche Nationalbibliothek
> Ordnance Survey
> British Geological Survey
> The Italian Government
> The US EPA
> The US DoH
> 
> Away from the public sector it's used extensively in finance, and health
> care and life sciences are making extensive and growing use of it etc.
> etc. Check out http://semanticweb.com/ for more news of the substantial
> and growing commercial use of the technology.
> 
>  Cities and State
>> governments have limited budgets and resources and very limited skills.
> 
> True of course. The point of this WG is to show those people how to make
> the best of what limited resources they have in this regard. They should
> be able to benefit from the power of LD even if they don't necessarily
> use it as a core tech themselves or even know they're using it.
> 
>> 
>> And RDF and Graph are not the only ways to skin this cat...
> 
> No, but it is the best available method that can operate at Web scale
> and that maximises the benefits of the network effect.
> 
>> 
>> Does Data Quality need linked data vocabularies to offer value?
> 
> URIs as identifiers and LD vocabularies are how you share meaning across
> different datasets. Datasets that use external vocabularies are more
> valuable than ones that just use internal codes that are meaningless out
> of that specific context.
> 
> Wouldn't
>> standardized lineage and certification suffice?
> 
> Not sure what you mean by these.
> 
>  I can see the value of
>> graph search for Data Comparability, but well defined metadata would also
>> take Open Data to the next level.
> 
> That's what the CSV on the Web WG is about, for example - well defined
> metadata for tabular data that can, among other things, be used to
> generate LD from the table.
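A rough sketch of the "metadata plus table gives you LD" idea, under loud assumptions: the METADATA dictionary below is a made-up structure, not the CSV on the Web WG's format, and the example.org URIs, the column names, and the helper rows_to_ntriples are all hypothetical. It only shows how a small description of the columns can be enough to turn rows into N-Triples.

```python
import csv
import io

# Hypothetical metadata: NOT a W3C format, just an illustration of the idea.
METADATA = {
    # How to mint a URI for each row from its (non-URI) identifier column.
    "subject_template": "http://example.org/id/station/{code}",
    # Which property URI each remaining column maps to.
    "columns": {
        "name": "http://www.w3.org/2000/01/rdf-schema#label",
        "river": "http://example.org/def/river",
    },
}

# Hypothetical table contents.
CSV_TEXT = "code,name,river\nS1,Upper Weir,Avon\nS2,Lower Lock,Severn\n"


def escape(text):
    """Escape the characters that may not appear raw in an N-Triples literal."""
    return (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def rows_to_ntriples(csv_text, metadata):
    """Return one N-Triples line per mapped cell of the table."""
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        subject = metadata["subject_template"].format(**row)
        for column, predicate in metadata["columns"].items():
            triples.append(
                '<%s> <%s> "%s" .' % (subject, predicate, escape(row[column]))
            )
    return triples


for line in rows_to_ntriples(CSV_TEXT, METADATA):
    print(line)
```

The publisher keeps shipping the CSV unchanged; only the small metadata description is new, which is the low-cost route being argued for here.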
> 
> 
>  If we only recommend RDF and Linked
>> Data as Best Practices for Data Publishing and only a small fraction of
>> the market can use them, what good have we done?
> 
> See previous. It's about realising the benefits of LD, i.e. using the Web
> as a data platform, not just as a means for shifting PDFs and CSVs from one
> place to another. We want people to get the most from their efforts to
> bring efficiencies and transparency, as well as supporting the
> commercial knowledge economy.
> 
>> 
>> I want to make sure that the work we do has maximum impact and so far use
>> case evidence does not convince me that RDF and Linked Data alone will get
>> us there.
> 
> No one is suggesting that it alone will get us there. But everything we
> do should support the goal of making use of the power of the Web. That
> means making sure people use URIs as identifiers, or that they publish
> metadata such that their non-URI identifiers can be rendered as URIs;
> that relationships are defined etc.
> 
> The Editor's Draft that Augusto began this thread by pointing to is
> interesting [1]. That talks about how to put links into non-RDF and
> non-HTML formats. +1 to that.
> 
> Phil.
> 
> [1] http://tools.ietf.org/html/draft-kelly-json-hal
> 
> 
>> 
>> 
>> From:
>> Christophe Guéret <christophe.gueret@dans.knaw.nl>
>> To:
>> Steven Adler/Somers/IBM@IBMUS
>> Cc:
>> Christophe Gueret <christophe.gueret@dans.knaw.nl>, Augusto Herrmann
>> <augusto.herrmann@gmail.com>, "hellmatic@gmail.com" <hellmatic@gmail.com>,
>> public-dwbp-wg <public-dwbp-wg@w3.org>
>> Date:
>> 03/25/2014 07:31 AM
>> Subject:
>> Re: Data "on" the Web vs Data "in" the Web
>> 
>> 
>> 
>> Hoi Steve,
>> Last year Facebook announced its graph search function, choosing the power
>> of semantic search without RDF.  What I have learned from this WG
>> experience so far is that W3C doesn't really create open standards. It
>> creates and enhances and promotes W3C standards.
>> I've some difficulties to follow you on that one, aren't W3C standards
>> open ?
>> The rest of the world often thanks W3C for its ideas and then implements
>> those ideas in different ways.
>> I thought this was rather common in industry. People copy each other and
>> spend time re-branding the same ideas, also probably to go around patents
>> while re-using things that are indeed good ideas. E.g. "retina display"
>> versus "hd screen", "facetime" VS "hangout" VS "videoconference", "like"
>> VS "+1", google's graph VS facebook's graph, etc ...
>> Can we, this WG, imagine creating or recommending standards that are
>> objective - that describe things to do that anyone can do with or without
>> RDF?
>> We'll see... :)
>> 
>> Regards,
>> Christophe
>> 
>> Regards,
>> 
>> Steve
>> 
>>   From: Christophe Guéret [christophe.gueret@dans.knaw.nl]
>>   Sent: 03/24/2014 04:01 PM CET
>>   To: Steven Adler
>>   Cc: Christophe Gueret <christophe.gueret@dans.knaw.nl>; Augusto Herrmann
>> <augusto.herrmann@gmail.com>; "hellmatic@gmail.com" <hellmatic@gmail.com>;
>> public-dwbp-wg <public-dwbp-wg@w3.org>
>> 
>>   Subject: Re: Data "on" the Web vs Data "in" the Web
>> 
>> 
>> So now we are creating W3C standards for publishing data as unstructured
>> text on websites?
>> The message of this presentation is actually quite the opposite ;-)
>> Instead the idea is to use the Web as platform to host the data. That is,
>> instead of publishing datasets as resources use URIs and HTTP to gain
>> access to specific (structured !) elements of data sets which can be
>> linked and re-used. There has to be a structure and there has to be links
>> possibility but this does not mean that RDF is the only model that will
>> work out and that RDF/XML is the only way to serialise data.
>> 
>> Is that what's in the charter?  Honestly I have always found the charter
>> to be confusing. Maybe it was intended to be machine readable. ;-)
>> 
>> :-)
>> 
>> Cheers,
>> Christophe
>> 
>> 
>> 
>> Regards,
>> 
>> Steve
>> 
>>   From: Christophe Guéret [christophe.gueret@dans.knaw.nl]
>>   Sent: 03/24/2014 03:42 PM CET
>>   To: Steven Adler
>>   Cc: Augusto Herrmann <augusto.herrmann@gmail.com>; "hellmatic@gmail.com"
>> <hellmatic@gmail.com>; DWBP Public List <public-dwbp-wg@w3.org>
>> 
>>   Subject: Re: Data "on" the Web vs Data "in" the Web
>> 
>> Hoi,
>> 
>> I think this (semantic !) discussion around data "on" and "in" can be a
>> good way to let people see the difference between putting a link to a
>> resource which is a data set dump ("on") and providing some kind of API
>> ("in") - whatever the technologies of the API are. Lately, I've been using
>> that argument to point people to the fact that downloading dumps of data
>> in various forms is like doing document sharing prior to the Web. Coming
>> then to the conclusion that we should publish our data as Web sites. There
>> is a bit of a focus set on SemWeb technologies for that but, really, we
>> could think of many other ways to reach the same result. Here are the
>> slides, comments are most welcome ;-) :
>> http://www.slideshare.net/cgueret/linking-knowledge-spaces
>> 
>> Cheers,
>> Christophe
>> 
>> 
>> 
>> On 20 March 2014 15:17, Steven Adler <adler1@us.ibm.com> wrote:
>> Augusto,
>> 
>> I am interested in learning about HAL and look forward to this discussion.
>>  But I am a bit concerned with the way you phrase these sentences:
>> 
>> "There should be a way to at first publish open data resources that are
>> linked, but without rdf, such as in xml and json. Then, at a later date,
>> improve with a descriptive rdf vocabulary and expressed in rdf to become
>> linked open data (preferably, if possible, keeping compatibility with
>> clients that implemented reading the previous non-semantic version)."
>> 
>> To me this reads that non-rdf methods like xml and json are accommodations
>> to constituents who "have not yet seen the light of RDF" and I want to
>> make sure we are providing best practices standards recommendations to the
>> world that exists rather than the "perfect world" we would like someday to
>> exist.
>> 
>> At IBM, we make software that runs on many operating systems.  Of course
>> we employ people with preferences for OSX, Linux, Systemz, AIX, Unix, and
>> even Windows.  Heck, many ATMs around the world still run on OS/2...
>> 
>> But because our customers run all of the above we supply them with all of
>> the above solutions.
>> 
>> Can we agree on an "all of the above" approach to DWBP (without suggesting
>> that everything someday becomes RDF) too?
>> 
>> Best Regards,
>> 
>> Steve
>> 
>> Motto: "Do First, Think, Do it Again"
>> 
>> 
>> From:
>> Augusto Herrmann <augusto.herrmann@gmail.com>
>> To:
>> DWBP Public List <public-dwbp-wg@w3.org>
>> Date:
>> 03/19/2014 01:16 PM
>> Subject:
>> Re: Data "on" the Web vs Data "in" the Web
>> 
>> 
>> 
>> 
>> 
>> Hi,
>> 
>> this is a very important point, Ig. My thoughts exactly when I suggested
>> we look at the Hypertext Application Language (HAL) proposal [1] in the
>> first meeting. It was in fact an invitation for us to think about data
>> "in" the web, as in "part of the web itself". We don't necessarily have to
>> follow HAL, but should look at it as a source of inspiration. The way
>> links are represented in resources in Subbu Allamaraju's RESTful
>> Webservices Cookbook [2] is another source of inspiration.
>> 
>> We should think of standard ways to insert links to other data into many
>> common open data formats, such as xml, json and maybe even csv. Of course
>> this linking requirement is satisfied by linked open data and rdf, but
>> sometimes organizations have some data and are willing to publish, but
>> initially do not have the necessary resources (i.e. people, knowledge) to
>> develop vocabularies to describe the data. However, interlinking among
>> resources of a dataset, or even linking to resources in other datasets is
>> somewhat easier to do. There should be a way to at first publish open data
>> resources that are linked, but without rdf, such as in xml and json. Then,
>> at a later date, improve with a descriptive rdf vocabulary and expressed
>> in rdf to become linked open data (preferably, if possible, keeping
>> compatibility with clients that implemented reading the previous
>> non-semantic version).
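A minimal sketch of the "linked, but without rdf" step described above, assuming a HAL-style `_links` object as in the draft cited as [1]. The record, the example.org URIs, and the helper name with_links are hypothetical; the point is only that an existing JSON record can gain links without any vocabulary work.

```python
import json


def with_links(record, self_uri, related=None):
    """Return a copy of record with a HAL-style _links object added.

    Each link relation maps to an object carrying an "href", which is the
    shape used by the HAL draft; the original record is left untouched.
    """
    links = {"self": {"href": self_uri}}
    for rel, href in (related or {}).items():
        links[rel] = {"href": href}
    linked = dict(record)
    linked["_links"] = links
    return linked


# Hypothetical record as it might already exist in a JSON dump.
school = {"id": "1234", "name": "Example School", "city_code": "2704302"}

doc = with_links(
    school,
    "http://example.org/schools/1234",
    {"city": "http://example.org/cities/2704302"},
)

print(json.dumps(doc, indent=2))
```

Clients that ignore `_links` keep reading the plain fields, so a later RDF version could be offered alongside this one without breaking them.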
>> 
>> Perhaps this could become a use case for the Best Practices document.
>> 
>> [1] http://tools.ietf.org/html/draft-kelly-json-hal
>> [2] http://books.google.com.br/books?id=LDuzpQlVuG4C
>> 
>> All the best,
>> Augusto Herrmann
>> Open Data Team - Ministry of Planning - Brazil
>> 
>> 
>> On Mon, Mar 17, 2014 at 9:28 AM, Ig Ibert Bittencourt <ig.ibert@gmail.com>
>> wrote:
>> Hello DWBP,
>> 
>> I was reading again about the 5 Star Open Data scheme and I saw this
>> statement below about 3-star Web data [1] that I think would be
>> interesting to share with this WG.
>> 
>> Excellent! The data is not only available via the Web but now everyone can
>> use the data easily. On the other hand, it's still data on the Web and not
>> data in the Web.
>> 
>> 
>> With regard to this statement, you can see more details in [2] and [3],
>> but not that much.
>> 
>> 
>> [1] http://5stardata.info/
>> [2] http://webofdata.wordpress.com/2010/03/01/data-and-the-web-choices/
>> [3] http://lists.xml.org/archives/xml-dev/200211/msg01290.html
>> 
>> 
>> Best,
>> 
>> Ig Ibert Bittencourt
>> Professor Adjunto III - Universidade Federal de Alagoas (UFAL)
>> Vice-Coordenador da Comissão Especial de Informática na Educação
>> Líder do Centro de Excelência em Tecnologias Sociais
>> Co-fundador da Startup MeuTutor Soluções Educacionais LTDA.
>> 
>> 
>> 
>> 
>> 
> 
> --
> 
> 
> Phil Archer
> W3C Data Activity Lead
> http://www.w3.org/2013/data/
> 
> http://philarcher.org
> +44 (0)7887 767755
> @philarcher1
> 
> 
>
Received on Tuesday, 25 March 2014 18:19:04 UTC