Re: Can we afford to offer SPARQL endpoints when we are successful? (Was "linked data hosted somewhere") from Juan Sequeda on 2008-11-27 (public-lod@w3.org from November 2008)

From: Juan Sequeda <juanfederico@gmail.com>
Date: Thu, 27 Nov 2008 11:06:56 -0600
To: "Aldo Bucchi" <aldo.bucchi@gmail.com>
Cc: "Richard Cyganiak" <richard@cyganiak.de>, "Hugh Glaser" <hg@ecs.soton.ac.uk>, "public-lod@w3.org" <public-lod@w3.org>, "Kingsley Idehen" <kidehen@openlinksw.com>
Message-ID: <f914914c0811270906l7bc09b0wbaf8bfe643780a59@mail.gmail.com>
Hugh

I agree, it's all about critical mass.

One of the things that I realized is that we are now in the "creating LOD"
stage. However, who has been creating or "opening" this data? The SW
community has been doing it by itself, and we cannot expect it to do so
forever. Obviously, the reason is that it is not a simple task.

So there is an obvious chicken-and-egg problem. Why would people want to
open their data? What is their benefit? But in a way, it is no longer a
chicken-and-egg problem, because LOD already exists!

In my American mentality of getting things done now... I think that we
should start getting into the "consuming LOD" stage. How? Well, that is why
we came up with SQUIN. Is it the best solution? I don't know. Is it
scalable? Not yet. Is it ambitious? Yes. But we need to start with
something now because there is sufficient data out there that can be
queried, consumed and used! Once developers realize the potential of LOD,
the cycle continues both ways: creating and consuming.

One of the important things is that we need to make the "creating LOD"
stage as easy as possible! That is what I am interested in (my research) and
what the W3C XG RDB2RDF is doing.

My two cents, while I am thinking of the turkey I will be having tonight.
Happy Thanksgiving to all


Juan Sequeda, Ph.D Student

Research Assistant
Dept. of Computer Sciences
The University of Texas at Austin
http://www.cs.utexas.edu/~jsequeda
jsequeda@cs.utexas.edu

http://www.juansequeda.com/

Semantic Web in Austin: http://juansequeda.blogspot.com/


On Thu, Nov 27, 2008 at 10:47 AM, Aldo Bucchi <aldo.bucchi@gmail.com> wrote:

>
> All,
>
> Some simple metadata on this thread, from my POV.
> I am obviously missing some more thorough analysis.
> What I intend to show is just how these discussions usually tend to be
> too broad and get dispersed.
>
> I asked a colleague ( he is a PhD, very smart guy ) to tell me what he
> thinks. He's pretty much in line with me.
>
> * Hugh is worried about *why* anyone would go through the trouble of
> publishing their data and points out some problems
> * Juan talks about how this benefits the end developers
> * Peter provides his experience as a provider of a large dataset
> * Kingsley provides some tech background to alleviate the concerns and
> cross-references it with a broader background, drawing from his vast
> experience
> * I tried to cut the SQL vs SPARQL link and factor in the "complex
> system" scenario
>
> Etc...
>
> Maybe the problem with the SW group is that it has drawn people that
> are far too smart and given them too much power. We should, at this
> point, be taking some things for granted instead of continuing to
> touch the elephant.
>
> We might need a big corporate like M$ to tell us that we "need to open
> up our datasets" and period.
>
> Not to say that there is no room for free discussion and that many
> observations might be fair, but there should be a place where people (
> the rest of the world ) can go and just find a simple, comforting
> answer: "aaaaah, I can open my data. it is good."
>
> ( perhaps we could create a new list for "hindsight discussions" ).
> LOD list should be reassuring.
>
> Just like GMail did it for Ajax. Before that authoritative example,
> there was too much room for debate. Heck, I was using ajax back in
> 1999 and I didn't even know it. But it was a mess. Now it is an
> organized industry.
>
> It's all about critical mass, we socially act like sheep. We look for
> reassurance.
>
> Please, if my analysis of this thread was too shallow it is because I
> am having lunch... do your own and notice that we are all just
> throwing in our own comments and not really going anywhere concrete,
> giving the idea of too much academia.
>
> I have been selling SW projects for many years, so I have some say in
> this... we need to balance the concrete/abstract equation for PR.
>
> Best,
> A
>
> On Thu, Nov 27, 2008 at 9:12 AM, Richard Cyganiak <richard@cyganiak.de> wrote:
> > Hugh,
> >
> > Here's what I think we will see in the area of RDF publishing in a few
> > years:
> >
> > - few public SPARQL endpoints over popular datasets (for obvious reasons)
> >
> > - linked data sites offer limited query capabilities (e.g. a scientific
> > bibliography site could offer "search paper by title", "search author by
> > name", "search paper by category and/or date range") (think the "advanced
> > search" form on a website, clad into a REST-style API that returns RDF)
> >
> > - those query capabilities are described in RDF and hence can be invoked by
> > tools such as SQUIN/SemWebClient to answer certain queries efficiently
> >
> > - everyone who wants more advanced query capabilities, will crawl the site
> > and run their own local SPARQL store
> >
> > At the moment we don't have the technology for describing non-SPARQL query
> > interfaces in RDF, and crawling linked data is still a fairly complex
> > business. As long as these problems are not solved, we pretty much are stuck
> > with SPARQL endpoints.
> >
> > Best,
> > Richard
> >
> >
> > On 27 Nov 2008, at 00:18, Hugh Glaser wrote:
> >
> >>
> >> Prompted by the thread on "linked data hosted somewhere" I would like to ask
> >> the above question that has been bothering me for a while.
> >>
> >> The only reason anyone can afford to offer a SPARQL endpoint is because it
> >> doesn't get used too much?
> >>
> >> As abstract components for studying interaction, performance, etc.:
> >> DB=KB, SQL=SPARQL.
> >> In fact, I often consider the components themselves interchangeable; that
> >> is, the first step of the migration to SW technologies for an application is
> >> to take an SQL-based back end and simply replace it with a SPARQL/RDF back
> >> end and then carry on.
> >>
> >> However.
> >> No serious DB publisher gives direct SQL access to their DB (I think).
> >> There are often commercial reasons, of course.
> >> But even when there are not (the Open in LOD), there are only search options
> >> and possibly download facilities.
> >> Even government organisations that have a remit to publish their data don't
> >> offer SQL access.
> >>
> >> Will we not have to do the same?
> >> Or perhaps there is a subset of SPARQL that I could offer that will allow me
> >> to offer a "safer" service that conforms to others' safer services (so it is
> >> well-understood)?
> >> Is this defined, or is anyone working on it?
> >>
> >> And I am not referring to any particular software - it seems to me that this
> >> is something that LODers need to worry about.
> >> We aim to take over the world; and if SPARQL endpoints are part of that
> >> (maybe they aren't - just resolvable URIs?), then we should make damn sure
> >> that we think they can be delivered.
> >>
> >> My answer to my subject question?
> >> No, not as it stands. And we need to have a story to replace it.
> >>
> >> Best
> >> Hugh
> >>
> >> =======================
> >> Sorry if this is a second copy, but the first, sent as a new post, seemed to
> >> only elicit a message from <list-help@frink.w3.org> and I can't work out or
> >> find out whether it means the message was rejected or something else, such
> >> as awaiting moderation.
> >> So I've done this as a reply.
> >> =======================
> >> And now a response to the message from Aldo, done here to reduce traffic:
> >>
> >> Very generous of you to write in this way.
> >> And yes, humour is good.
> >> And sorry to all for the traffic.
> >>
> >> On 27/11/2008 00:02, "Aldo Bucchi" <aldo.bucchi@gmail.com> wrote:
> >>
> >>> OK Hugh,
> >>>
> >>> I see what you mean and I understand you being upset. Just re-read the
> >>> conversation word by word because I felt something was not right.
> >>> I did say "wacky"... is that it?
> >>>
> >>> In that case, and if this caused the confusion, I am really sorry.
> >>>
> >>> I was not talking about your software, this was just a joke. Talking in
> >>> general.
> >>> You replied to my joke with an absurd reply.
> >>>
> >>> My point was simply that, if you want to push things over the edge,
> >>> why not get your own box. We all take care of our infrastructure and
> >>> know its limitations.
> >>>
> >>> So, I formally apologize.
> >>> I am by no means endorsing one piece of software over another ( save
> >>> for mine, but it doesn't exist yet ;).
> >>> My preferences for virtuoso come from experiential bias.
> >>>
> >>> I hope this clears things up.
> >>> I apologize for the traffic.
> >>>
> >>> However, I do make a formal request for some sense of humor.
> >>> This list tends to get into these kinds of discussions, and we will
> >>> start getting more and more visits from outsiders who are not used to
> >>> this sort of "sharpness".
> >>>
> >>> Best,
> >>> A
> >>>
> >>
> >>
> >
>
>
>
> --
> Aldo Bucchi
> U N I V R Z
> Office: +56 2 795 4532
> Mobile:+56 9 7623 8653
> skype:aldo.bucchi
> http://www.univrz.com/
> http://aldobucchi.com
>
> PRIVILEGED AND CONFIDENTIAL INFORMATION
> This message is only for the use of the individual or entity to which it is
> addressed and may contain information that is privileged and confidential.
> If you are not the intended recipient, please do not distribute or copy this
> communication, by e-mail or otherwise. Instead, please notify us immediately
> by return e-mail.
>
>
Received on Thursday, 27 November 2008 17:07:35 UTC