Re: Linked Data Demand & Discussion Culture on this List, WAS: Introducing Semgel, a semantic database app for gathering & analyzing data from websites

From: Sebastian Schaffert <sebastian.schaffert@salzburgresearch.at>
Date: Sat, 21 Jul 2012 00:56:19 +0200
Cc: Linking Open Data <public-lod@w3.org>
Message-Id: <48668DEA-2F63-4086-A6F8-5D0A65B83980@salzburgresearch.at>
To: Martynas Jusevičius <martynas@graphity.org>

Dear Martynas,

Thanks for your constructive answer. I completely agree with all your points, and I am looking forward to your software (already checked the README ;-) ). We will surely try it out (maybe as a client for our Linked Media Framework).

The problem I am facing is that part of my (and my group's) current job is to bring the technologies we are developing in research into ordinary industry. Not the Microsofts, Facebooks or Oracles (who are all highly innovative in Web and database technologies), but small and big companies from the (traditional) media sector and manufacturing industry who have big IT departments and infrastructures and could benefit greatly from Linked Data and related technologies. They often still live in the world of CORBA, ERP and file systems, and not necessarily on the Web.

With the partners we have we "silently" follow the Linked Data approach by trying to solve their immediate problems and using Linked Data in the background. While in the media sector this is quite successful (see e.g. http://search.salzburg.com, 1.1 million news articles all as Linked Data but the interface is faceted search), it is significantly more difficult to explain the advantages to e.g. manufacturing industries. Some typical problems I already mentioned in my previous post (lack of trust, lack of relevant data, lack of quality). Some others - indirectly related to Linked Data:
- they have proven and working infrastructures, and they have experienced IT engineers knowing "their" stuff; why should they adopt a new technology? They don't have a "Linked Data problem" per se
- IT in such companies is typically a central department and not a business division; they have only limited resources for technology innovation, why invest in Linked Data and not in some other technology where they can say "it will save us X million Euros"? 

Maybe we are targeting the wrong or too difficult a sector, true. But I am convinced that the technology is useful especially in such settings, so I want to prove it by building applications that would not be possible otherwise. Unfortunately, I am lacking convincing business cases that show THEM that the technology is superior. No one needs to convince ME about the virtues of Linked Data, or otherwise I would not develop software or publish scientific articles related to it. ;-)

If we could collect even a small set of convincing business cases and describe what problems they are solving and how, and also what problems they encountered, I think it would help many of us.

Greetings,

Sebastian

Am 21.07.2012 um 00:16 schrieb Martynas Jusevičius:

> Sebastian, all,
> 
> I'm on your side here. But regarding Linked Data, consider the
> following points that slow down its adoption:
> - data-heavy players such as Facebook and Google might not be
> interested in adopting a new open, even if superior, data approach,
> since it is in their interest to keep as much control over the data as
> possible
> - in the corporate world, big vendors like Microsoft and Oracle have
> created a lock-in, and big companies and organizations are hesitating
> to invest in new long-term solutions
> - the long term is where Linked Data really shines, because while the
> global data interconnectedness increases, it provides linear
> integration costs instead of exponential as in the Web 2.0 API-to-API
> approach
> - RDF and Linked Data are quietly doing their job at research
> institutes and innovative organizations like BBC and are not receiving
> the marketing dollars thrown at NoSQL solutions such as MongoDB.
> However when it comes to production use, NoSQL is no less problematic
> than triplestores (I have some experience in the startup world), while
> RDF is the only standardized NoSQL/graph data model, which even has a
> query language and quite a few tools.
> - RDF and Linked Data are taught at very few schools. Even in computer
> science studies, web application development is often stuck at
> PHP+MySQL level, or Web 2.0 and RESTful APIs at best.
> 
> So I would say Linked Data is like electric vehicles -- most who
> understand the technology would find it superior, but there are a lot
> of different agendas and interests that do not necessarily result in what
> is better for the public. And then there is ignorance as well.
> 
> When it comes to Linked Data applications, I'm about to release
> open-source code which I hope will make it easier.
> 
> Martynas
> graphity.org
> 
> On Fri, Jul 20, 2012 at 5:48 PM, Sebastian Schaffert
> <sebastian.schaffert@salzburgresearch.at> wrote:
>> Kingsley,
>> 
>> I am trying to respond to your factual arguments inline. But let me first point out that the central problem for me is exactly what Mike pointed out: "In your enthusiasm and cheerleading you as often turn people off as inspire them. You too frequently take it upon yourself to "speak for the community". Semgel is a nice contribution being contributed by a new, enthusiastic contributor. I think this is to be applauded, not lectured or scolded. Semgel is certainly as much on topic as most of the posts to this forum."
>> 
>> The message you should hear is that many people are frustrated by the way the discussions in this forum are carried out and have already stopped contributing or even reading. And this is a very bad development for a community. The topic we are discussing right now is only a symptom. Please think about it.
>> 
>> Am 20.07.2012 um 16:43 schrieb Kingsley Idehen:
>> 
>>> On 7/20/12 4:06 AM, Sebastian Schaffert wrote:
>>>> Am 19.07.2012 um 20:50 schrieb Kingsley Idehen:
>>>> 
>>>>>> I completely understand and appreciate your desire (which I share) to see a mature landscape with a range of linked data sources. I can also understand how a database or spreadsheet can potentially offer fine-grained data access - your examples do illustrate the point very well indeed!
>>>>>> 
>>>>>> However, if we want to build a sustainable business, the decision to build these features needs to be demand driven.
>>>>> I disagree.
>>>>> Note, I responded because I assumed this was a new Linked Data service. But it clearly isn't. Thus, I don't want to open up a debate about Linked Data virtues if you incorrectly assume they should be *demand driven*.
>>>>> 
>>>>> Remember, this is the Linked Open Data (LOD) forum. We've long passed the issue of *demand driven* over here, re. Linked Data.
>>>> But I agree. A technology that is not able to fire proof its usefulness in a demand driven / problem driven environment is maybe interesting from an academic standpoint but otherwise not really useful.
>>> 
>>> So are you claiming that Linked Data hasn't fire proofed its usefulness in a demand driven / problem driven environment?
>> 
>> 
>> Indeed. This is my right as much as yours is to claim the opposite.
>> 
>> My claim is founded in the many discussions I have when going to the CTOs of *real* companies (big ones, outside the research business) out there and trying to convince them that they should build on Semantic Web technologies (because I believe they are superior). Believe me, even though I strongly believe in the technology, this is a very tough job without a good reference example that convinces them they will save X millions of Euros or improve the life of their employees or the society in the short to medium term.
>> 
>> Random sample answer from this week (I could bring many): "So this Linked Data is a possibility for data integration. Tell me, why should I convince my engineers to throw away their proven integration solutions? Why is Linked Data so superior to existing solutions? Where is it already in enterprise use?".
>> 
>> The big datasets always sold as a success story in the Linked Data Cloud are largely irrelevant to businesses:
>> - they are mostly dealing with internal data (projects, people, CRM, ERP, documents, CMS, …) where you won't find information in the LD cloud anyways
>> - they do not trust "just some" data from the Internet to build multi-million business decisions on top
>> - they find the data in the cloud too messy (as an example: try finding country codes on DBpedia …) and too unreliable (most servers do not respond in sufficient time)
>> 
>> Mike has actually assembled some very nice blog posts on related topics:
>> - http://www.mkbergman.com/917/practical-p-p-p-problems-with-linked-data/
>> - http://www.mkbergman.com/859/seven-pillars-of-the-open-semantic-enterprise/
>> 
>>> 
>>>> And if you look at the recent troubles with Semantic Web business models you see the consequences.
>>> 
>>> Please clarify what you mean as that statement is quite unclear. What "recent troubles" are you speaking (so definitively) about re. the business model scalability and viability of Linked Data and/or the broader Semantic Web vision?
>> 
>> I was referring to the recent bankruptcy of Ontoprise and the fact that Talis is reducing its Linked Data involvement, essentially shutting down their "we help you publish Linked Data" service. I thought you might have guessed.
>> 
>>> 
>>>> 
>>>> You are not the only one in "the community", so please don't say "we've passed the issue".
>>> 
>>> Of course I am not the only one in the community. But, I think you are missing a critical point: this forum/list/community is about Linked Data. Thus, I would expect product announcements to be related to Linked Data, at the very least. What's really confusing to me, right now, is the fact that I simply sought an actual Linked Data connection from Harish (assuming there had to be one somewhere), received push-back about "demand" and a string of replies that are responding to something else inferred from my response.
>> 
>> The problem is that in most of your replies *you* claim authority over defining what is Linked Data related and what is not, while other people here in the forum might have a completely different opinion. I found Harish's announcement sufficiently related to Linked Data so as to be one of the most interesting posts for me for some time.
>> 
>>> 
>>> 
>>>> I'd say we have not even really started with the issue, we've just pushed some technology out there, not knowing yet whether it is really useful.
>>> 
>>> I disagree, and here are some very basic examples of proof that the utility (usefulness) and demand (need) for Linked Data are yesterday's topic:
>>> 
>>> 1. Facebook -- every data object in this data space has a Linked Data URI, and by that I mean all 850 million+ profiles alongside other data objects that represent other aspects of Facebook profiles
>> 
>> Where is the convincing business application (that I could not realise without URIs, especially since Facebook is anyways a "closed universe" with unique IDs)?
>> 
>>> 
>>> 2. Various Govts. worldwide -- led by US and UK govt efforts enhancing Open Data by adhering to the principles espoused in TimBL's Linked Data meme
>> 
>> Where is the convincing business application? Since most of the data is statistics anyways, where is Linked Data superior to, say, CSV?
>> 
>>> 
>>> 3. Rest of the LOD cloud which now tops 55+ billion triples and growing every second.
>> 
>> Where is the convincing business application? http://km.aifb.kit.edu/projects/numbers/ has also billions of triples.
>> 
>> 
>> You are showing me datasets. Show me applications!
>> 
>> 
>>> 
>>> 
>>>> On the other hand Harish is giving us one example of where at least part of the technology *might* be useful and I appreciate this very much. In general, I also prefer acting over talking. ;-)
>>> 
>>> Useful, of course. But useful in a manner that has relevance to Linked Data is what I sought from my questions. There is no Linked Data in that solution, and all I wanted to do was foster dialog that would encourage production of Linked Data as others have already done -- for years -- re. data from Crunchbase.
>> 
>> Harish mentions in his original post: "The core application is generic - it can consume any kind of rdf/owl data. However, to showcase the technology, we decided to pick one source (crunchbase) to demo the capabilities of the product."
>> 
>> In his follow-up he says: "Also, if you are wondering how we are leveraging semweb concepts, here are a few pointers
>> - all entities like companies, investors etc are given uri's
>> - when companies have common investors or are competitors to each other, they are automatically "merged" to create a seamless database
>> 
>> The long term objective is to do this across websites - ie harness the web of data that backs the web of pages. Right now for instance, we can automatically merge data grabbed from crunchbase and data grabbed from linkedin public profiles if the crunchbase person profile has a link to the linkedin profile. Full support for RDFa and LinkedData is part of the roadmap."
>> 
>> 
>> So what I read is that he builds an application that uses Linked Data (since it can consume any kind of rdf/owl data). Sufficient for me. Actually I even forwarded the mail to my group to have a look at the demo, because I found it very relevant.
>> 
>>> 
>>> My response included examples of what's been achieved with Crunchbase data for a very long time, so I hoped he would see the virtues in doing something similar such that in classic Linked Data fashion you end up with a richer Web of Linked Data.
>> 
>> So there is already a "classic" Linked Data fashion?
>> 
>> What you send (as always) are links to the URI burner of your own company. Which *in my opinion* is much less Linked Data than the application we are discussing. It is just a wrapper around a 3rd party API that converts proprietary data into RDF. Also, accessing the URI burner just for the Facebook example you sent takes much longer than Semgel's complete data analysis process, so why should he rely on the RDF data you provide instead of directly accessing the JSON API (which is about as much Linked Data as Facebook's Open Graph, btw) and mapping it to RDF himself?
>> 
>>> 
>>>> 
>>>> Considering comments like yours, I really fear that the community will lose its openness and acceptance of differing opinions.
>>> 
>>> What is the differing opinion?
>> 
>> True, so I wonder why do we have a discussion at all!?
>> 
>>> 
>>>> I had already given up really following the discussions here for exactly that reason (and I am not the only one), but this message appeared on my phone before the mail client could sort it away and simply made me upset.
>>> 
>>> Sorry for upsetting you, and I hope you become less upset when you understand my point. A simple route to that destination starts by you responding to my questions.
>> 
>> Which I did. Unfortunately my original state did not really improve much, because I have the impression that you also did not understand my point.
>> 
>>> I strongly believe you've misunderstood my response, as measured as it was, initially. Thus, let's reconcile all of this, and I am quite confident that my fundamental point will be resurrected and then clearly understood.
>> 
>> So help me :)
>> 
>> Greetings,
>> 
>> Sebastian
>> --
>> | Dr. Sebastian Schaffert          sebastian.schaffert@salzburgresearch.at
>> | Salzburg Research Forschungsgesellschaft  http://www.salzburgresearch.at
>> | Head of Knowledge and Media Technologies Group          +43 662 2288 423
>> | Jakob-Haringer Strasse 5/II
>> | A-5020 Salzburg
>> 
> 

Received on Friday, 20 July 2012 22:56:52 UTC