Re: Linked Data Demand & Discussion Culture on this List, WAS: Introducing Semgel, a semantic database app for gathering & analyzing data from websites from Kingsley Idehen on 2012-07-20 (public-lod@w3.org from July 2012)

From: Kingsley Idehen <kidehen@openlinksw.com>
Date: Fri, 20 Jul 2012 14:20:41 -0400
To: public-lod@w3.org, Sebastian Schaffert <sebastian.schaffert@salzburgresearch.at>
Message-ID: <5009A179.6020502@openlinksw.com>
On 7/20/12 11:48 AM, Sebastian Schaffert wrote:
> [SNIP] -- so that we can focus on the key non personal points.
>
> My claim is founded in the many discussions I have when going to the CTOs of *real* companies (big ones, outside the research business) out there and trying to convince them that they should build on Semantic Web technologies (because I believe they are superior). Believe me, even though I strongly believe in the technology, this is a very tough job without a good reference example that convinces them they will save X millions of Euros or improve the life of their employees or society in the short to medium term.

Why do you assume that others (like myself) who don't share your views
don't talk to CTOs? BTW - there are a number of companies that
actually have paying customers using Linked Data effectively; these
companies may not necessarily believe in announcing every customer
closure related to Linked Data.

>
> Random sample answer from this week (I could bring many): "So this Linked Data is a possibility for data integration. Tell me, why should I convince my engineers to throw away their proven integration solutions? Why is Linked Data so superior to existing solutions? Where is it already in enterprise use?".

I don't know how you've concluded that Linked Data is a "rip and
replace" approach to technology adoption. It's quite the contrary.
Linked Data's most powerful virtue is its ability to enhance what 
already exists re:

1. data object identity
2. data object representation
3. data object access
4. data object serialization
5. data object access control lists and policies.

Please read some of the older threads on this mailing list. Do you think 
Facebook publishes Linked Data for no good reason? Ditto the U.S. and UK 
governments amongst many other contributors to the LOD cloud? Likewise 
any other enterprise that's already effectively using Linked Data as a 
conceptual-model-oriented virtualization atop disparate data sources, etc.?

>
> The big datasets always sold as a success story in the Linked Data Cloud are largely irrelevant to businesses:
> - they are mostly dealing with internal data (projects, people, CRM, ERP, documents, CMS, …) where you won't find information in the LD cloud anyways

There is a difference between the Linked Open Data (LOD) Cloud and 
Linked Data. There's also a subtle difference between Linked Open Data 
and the LOD Cloud.

Linked Open Data is about standards based structured data representation 
and access, based on a specific use of de-referencable URIs to augment 
said data representation and access.

LOD Cloud is about publicly accessible application of the above, with 
contributions from a plethora of sources, across a variety of subject 
matter domains.

> - they do not trust "just some" data from the Internet to build multi-million business decisions on top

See my comment above. That isn't what I am talking about.
> - they find the data in the cloud too messy (as an example: try finding country codes on DBPedia …) and too unreliable (most servers do not respond in sufficient time)

Ditto, not my point. The LOD cloud is a distributed lookup table and 
that's about it.
>
> Mike has actually assembled some very nice blog posts on related topics:
> - http://www.mkbergman.com/917/practical-p-p-p-problems-with-linked-data/
> - http://www.mkbergman.com/859/seven-pillars-of-the-open-semantic-enterprise/

I am no stranger to Mike. Sometimes it helps if you do a few lookups to 
provide context for your responses.

>
>> >
>>> >>And if you look at the recent troubles with Semantic Web business models you see the consequences.
>> >
>> >Please clarify what you mean as that statement is quite unclear. What "recent troubles" are you speaking  (so definitively) about re., the business model scalability and viability of Linked Data and/or the broader Semantic Web vision?
> I was referring to the recent bankruptcy of Ontoprise and the fact that Talis is reducing its Linked Data involvement, essentially shutting down their "we help you publish Linked Data" service. I thought you might have guessed.

Why should I guess? You oversimplify those items and I am not in the
business of speaking about other companies. Talking about markets,
technologies, and business models is fine with me, but it stops right
there.

>
>> >
>>> >>
>>> >>You are not the only one in "the community", so please don't say "we've passed the issue".
>> >
>> >Of course I am not the only one in the community. But, I think you are missing a critical point: this forum/list/community is about Linked Data. Thus, I would expect product announcements to be related to Linked Data, at the very least. What's really confusing to me, right now, is the fact that I simply sought an actual Linked Data connection from Harish (assuming there had to be one somewhere), received push-back about "demand" and a string of replies that are responding to something else inferred from my response.
> The problem is that in most of your replies *you* claim authority over defining what is Linked Data related and what is not, while other people here in the forum might have a completely different opinion.

Asking someone to produce a Linked Data URI in a Linked Data forum/list 
!= claiming authority. It's the most basic instinct for anyone that 
appreciates the virtues of Linked Data.

Linked Data is about URIs.

URIs make Hyperlinks tick, at Internet scale.

It's fundamental to expect a Link to something that basically extends the
Web.

Imagine posting a PDF to a forum dedicated to hypermedia documents, e.g.
HTML. Why would you push back if a subscriber asked you for a functional
Hyperlink?

>   I found Harish's announcement sufficiently related to Linked Data so as to be one of the most interesting posts for me for some time.

In what way? This is the point I desperately seek. Believe me, I have no 
problem with being proven wrong. It means I have something to learn. 
Thus, please connect Harish's product to Linked Data since it will 
actually address my initial response to him.

I also encourage you to reread the thread, calmly. Doing so will reveal
the following to you:

1. I made it crystal clear I wasn't knocking his effort -- he pushed
back on that point with MS Access and Excel anecdotes
2. I made it crystal clear I sought a Linked Data URI -- he pushed back
with his "demand driven" comment.


Re. #2, I might not have been clear enough about the fact that a
functioning private de-referencable URI was okay, even if I wouldn't be
able to do anything with it from a live Linked Data demo standpoint.

>
>> >
>> >
>>> >>I'd say we have not even really started with the issue, we've just pushed some technology out there, not knowing yet whether it is really useful.
>> >
>> >I disagree, and here are some very basic examples of proof that the utility (usefulness) and demand (need) for Linked Data are yesterday's topic:
>> >
>> >1. Facebook -- every data object in this data space has a Linked Data URI, and by that I mean all 850 million+ profiles alongside other data objects that represent other aspects of Facebook profiles
> Where is the convincing business application (that I could not realise without URIs, especially since Facebook is anyways a "closed universe" with unique IDs)?

The fact that you and anyone else can look up and access fine-grained
structured data from Facebook. Basically, they've opened up their
massive database for Facebook-specific and generic applications, via
Linked Data. Do you need more details about the business case and model
implications for that?

>
>> >
>> >2. Various Govts. worldwide -- led by US and UK govt efforts enhancing Open Data by adhering to the principles espoused in TimBL's Linked Data meme
> Where is the convincing business application? Since most of the data is statistics anyways, where is Linked Data superior to say CSV?

Again, how have you arrived at the Linked Data vs CSV scenario? 
Secondly, if you'd done some background lookup, you would have stumbled 
across comments I've made about CSV and Linked Data.

CSV is a great foundation from which to drive Linked Data. It removes 
chunks of tedium from the process while also broadening participation.
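
As a rough illustration of CSV serving as a starting point for Linked Data, here is a minimal Python sketch that turns CSV rows into N-Triples. The `http://example.org/` namespace, the `id/` and `schema/` path segments, the column names, and the sample rows are all assumptions made up for this example; they do not come from any dataset discussed in this thread.

```python
import csv
import io
from urllib.parse import quote

BASE = "http://example.org/"  # hypothetical namespace, for illustration only


def csv_to_ntriples(text, key_column, base=BASE):
    """Map each CSV row to N-Triples lines.

    The key column becomes the subject URI, every other column becomes a
    predicate URI, and each non-empty cell becomes a plain string literal.
    """
    lines = []
    for row in csv.DictReader(io.StringIO(text)):
        key = quote(row[key_column], safe="")
        subject = "<" + base + "id/" + key + ">"
        for column, value in row.items():
            if column == key_column or not value:
                continue
            predicate = "<" + base + "schema/" + quote(column, safe="") + ">"
            # Escape backslashes, quotes, and newlines for an N-Triples literal.
            literal = (
                value.replace("\\", "\\\\")
                .replace('"', '\\"')
                .replace("\n", "\\n")
            )
            lines.append(subject + " " + predicate + ' "' + literal + '" .')
    return lines


sample = "code,name,capital\nAT,Austria,Vienna\nUK,United Kingdom,London\n"
triples = csv_to_ntriples(sample, "code")
for line in triples:
    print(line)
```

The only step CSV itself does not cover is naming: the row key and the column headers have to be lifted into URIs before the rows can be linked to anything else.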

>
>> >
>> >3. Rest of the LOD cloud which now tops 55+ billion triples and growing every second.
> Where is the convincing business application? http://km.aifb.kit.edu/projects/numbers/ has also billions of triples.

My point is that you have a massive lookup table and source of 
structured data for a plethora of uses. One that didn't exist before the 
emergence of the LOD Cloud. Today, I can look up a Zip Code, Country,
Capital, Unit of Measurement, Company, etc. by de-referencing a URI or 
simply having a query walk a Webby graph starting with a single URI.
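
To make "walking a Webby graph starting with a single URI" concrete, here is a small Python sketch. It uses an in-memory dictionary as a stand-in for the Web, so no network access is involved; every URI, attribute name, and value in it is invented for illustration, and the helper names `dereference` and `walk` are likewise hypothetical.

```python
from collections import deque

# Toy stand-in for the Web: each URI "dereferences" to a list of
# (attribute, value) pairs. All URIs and data are made up for this sketch.
WEB = {
    "http://example.org/id/Austria": [
        ("name", "Austria"),
        ("capital", "http://example.org/id/Vienna"),
        ("currency", "http://example.org/id/Euro"),
    ],
    "http://example.org/id/Vienna": [("name", "Vienna")],
    "http://example.org/id/Euro": [("name", "Euro"), ("code", "EUR")],
}


def dereference(uri):
    """Return the description found at a URI (empty if nothing is there)."""
    return WEB.get(uri, [])


def walk(start, max_hops=2):
    """Breadth-first walk: look up a URI, then follow any URIs in its values."""
    seen = {start}
    queue = deque([(start, 0)])
    triples = []
    while queue:
        uri, hops = queue.popleft()
        for attribute, value in dereference(uri):
            triples.append((uri, attribute, value))
            is_link = value.startswith("http://")
            if is_link and value not in seen and hops < max_hops:
                seen.add(value)
                queue.append((value, hops + 1))
    return triples


facts = walk("http://example.org/id/Austria")
for subject, attribute, value in facts:
    print(subject, attribute, value)
```

Starting from one identifier, the walk collects the country's own attributes and then those of its capital and currency, which is the "lookup table" behaviour described above in miniature.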

>
>
> You are showing me datasets. Show me applications!

I am not showing you datasets. I am referring to structured data and 
lookup capability that facilitates the development of powerful 
applications. In a nutshell, these applications require less code per 
delivered insight. In addition, less code means less decay for 
enterprises and users since applications come and go while data is forever.

>
>
>> >
>> >
>>> >>  On the other hand Harish is giving us one example of where at least part of the technology *might* be useful and I appreciate this very much. In general, I also prefer acting over talking. ;-)
>> >
>> >Useful, of course. But useful in a manner that has relevance to Linked Data is what I sought from my questions. There is no Linked Data in that solution, and all I wanted to do was foster dialog that would encourage production of Linked Data as others have already done -- for years -- re. data from Crunchbase.
> Harish mentions in his original post: "The core application is generic - it can consume any kind of rdf/owl data. However, to show case the technology, we dicided to pick one source (crunchbase) to demo the capabilities of the product."

So you are saying, the application ingests structured data but emits 
HTML pages (reports) where the actual data keys (URIs) for the data are 
now dislocated from the value chain? If you consume Linked Data there's 
no reason to obscure access to those data sources in a solution. There 
are a number of best practice patterns for keeping URIs accessible and 
discoverable to user agents. Something I actually attempted to 
demonstrate to Harish in one of my demo links re. URI Debugger.

>
> In his follow-up he says: "Also, if you are wondering how we are leveraging semweb concepts, here are a few pointers

I didn't ask about how he was leveraging Semantic Web concepts. I asked
him for a Linked Data URI. A URI that would resolve to structured data
constrained by an EAV/RDF-based data model.

> - all entities like companies, investors etc are given uri's

And I sought a single URI, as I already went through his application
without success. Thus, as I've already requested, if I somehow missed
those URIs with my tools, please send me a sample. That's all I want.

> - when companies have common investors or are competitors to each other, they are automatically "merged" to create a seamless database

I know that. So do millions of other folks.

I just want to de-reference a Linked Data URI that delivers that 
information via a Linked Data graph .

>
> The long term objective is to do this across websites - ie harness the web of data that backs the web of pages. Right now for instance, we me can automatically merge data grabbed from crunchbase and data grabbed from linkedin public profiles if the crunchbase person profile has a link to the the linkedin profile. Full support for RDFa and LinkedData is part of the roadmap."
>
>
> So what I read is that he builds an application that uses Linked Data (since it can consume any kind of rdf/owl data). Sufficient for me. Actually I even forwarded the mail to my group to have a look at the demo, because I found it very relevant.

Linked Data ingested and something else published != the kind of 
virtuous cycle many of us expect. If there's utility in ingesting Linked 
Data then there's equal utility in keeping the chain intact by ensuring
the data source URIs remain discoverable to other user agents on the Web.

>
>> >
>> >My response included examples of what's been achieved with Crunchbase data for a very long time, so I hoped he would see the virtues in doing something similar such that in classic Linked Data fashion you end up with a richer Web of Linked Data.
> So there is already a "classic" Linked Data fashion?

Yes, there are clearly established principles for what constitutes 
Linked Data in line with TimBL's meme. I paraphrase the key points as:

1. denote (name) things using de-referencable URIs
2. URIs should resolve to useful information
3. construct useful information from structured data via standards for 
structured data representation
4. refer to other things using their URIs.

Linked Data is about a very specific URI behavior combined with 
structured data representation.
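
The "specific URI behavior" in points 1 and 2 can be sketched as an HTTP lookup that asks for a structured representation of the named thing. The Python below is only an illustration under stated assumptions: the URI is a hypothetical `example.org` name, the choice and ordering of media types in the `Accept` header is one reasonable preference rather than a requirement, and `build_request` and `dereference` are names invented for this sketch.

```python
from urllib.request import Request, urlopen

# RDF media types to ask for, in order of preference (an assumption of this
# sketch; a server is free to offer other serializations).
RDF_TYPES = "text/turtle, application/rdf+xml;q=0.9, application/ld+json;q=0.8"


def build_request(uri):
    """Prepare an HTTP GET that asks for an RDF description of the named thing."""
    return Request(uri, headers={"Accept": RDF_TYPES})


def dereference(uri, timeout=10):
    """Fetch the description at a URI and return (content_type, body_text).

    This performs network I/O, so it is defined here but not called.
    """
    with urlopen(build_request(uri), timeout=timeout) as response:
        content_type = response.headers.get("Content-Type")
        body = response.read().decode("utf-8", "replace")
    return content_type, body


# Hypothetical URI, used only to show what the prepared request looks like.
request = build_request("http://example.org/id/Austria")
print(request.get_method(), request.full_url)
print("Accept:", request.get_header("Accept"))
```

If the response body is RDF that itself mentions further URIs, points 3 and 4 are satisfied as well, and the same lookup can be repeated on each of those names.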

>
> What you send (as always) are links to the URI burner of your own company.

Again, you missed the point. I sent an example of what I sought. A 
Linked Data URI for Facebook derived from Crunchbase. I was hoping that 
Harish would reply with another Linked Data URI and then I could proceed
to mesh both data sources en route to demonstrating the implicit 
virtuosity of Linked Data.

Believe it or not, I am happy to demonstrate anyone's Linked Data 
offerings. There should be ample proof of that if you (once again) do a 
little background research.

Linked Data isn't a competitive boundary for me. Thus, please don't make 
inaccurate comments about my intentions.


>   Which *in my opinion* is much less Linked Data than the application we are discussing.

Yes, because you clearly don't really understand the subject matter 
(Linked Data) or my fundamental points. That statement says it all.

>   It is just a wrapper around a 3rd party API allowing to convert proprietary data into RDF.

Yes, and what else?

>   Also, accessing the URI burner just for the Facebook example you send takes much longer than Semgel's complete data analysis process, so why should he rely on the RDF data you provide instead of directly accessing the JSON API (which is about as much Linked Data as Facebook's Open Graph, btw) and mapping it to RDF himself?

Is that what you do with a DBMS key, off the bat? Do you even have any
idea how many other services are hitting URIBurner at any given time?
Ditto DBpedia and many other LOD cloud spaces? They all exist as
contributions to what we hope would be a virtuous enclave of loosely
coupled Linked Data sources. No more, no less.

>
>> >
>>> >>
>>> >>Considering comments like yours, I really fear for the community to lose its openness and acceptance of differing opinions.
>> >
>> >What is the differing opinion?
> True, so I wonder why do we have a discussion at all!?
>
>> >
>>> >>  I had already given up really following the discussions here for exactly that reason (and I am not the only one), but this message appeared on my phone before the mail client could sort it away and simply made me upset.
>> >
>> >Sorry for upsetting you, and I hope you become less upset when you understand my point. A simple route to that destination starts by you responding to my questions.
> Which I did. Unfortunately my original state did not really improve much, because I have the impression that you also did not understand my point.

I absolutely do not understand most of your points. All I get from your 
comments are gut reactions and presumptions that you are trying to weave 
into a broken narrative.

>> >I strongly believe you've misunderstood my response, as measured as it was, initially. Thus, let's reconcile all of this, and I am quite confident that my fundamental point will be resurrected and then clearly understood.
> So help me:)

I have, more than I need to, and I am absolutely done with this 
conversation, online. Of course you can pick it up with me offline.

BTW -- take time to digest this presentation:
http://www.slideshare.net/timoreilly/the-clothesline-paradox-and-the-sharing-economy-keynote-file


Kingsley
>
> Greetings,
>
> Sebastian
> -- | Dr. Sebastian Schaffert sebastian.schaffert@salzburgresearch.at | 
> Salzburg Research Forschungsgesellschaft 
> http://www.salzburgresearch.at | Head of Knowledge and Media 
> Technologies Group +43 662 2288 423 | Jakob-Haringer Strasse 5/II | 
> A-5020 Salzburg


-- 

Regards,

Kingsley Idehen	
Founder & CEO
OpenLink Software
Company Web: http://www.openlinksw.com
Personal Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter/Identi.ca handle: @kidehen
Google+ Profile: https://plus.google.com/112399767740508618350/about
LinkedIn Profile: http://www.linkedin.com/in/kidehen
Received on Friday, 20 July 2012 18:20:30 UTC