Re: Exercise: LOD questions (R) was ( Do we need another list(s)? ) from Kingsley Idehen on 2008-12-06 (public-lod@w3.org from December 2008)

From: Kingsley Idehen <kidehen@openlinksw.com>
Date: Sat, 06 Dec 2008 11:49:07 -0500
CC: "public-lod@w3.org" <public-lod@w3.org>
Message-ID: <493AAD03.5090806@openlinksw.com>
Juan Sequeda wrote:
> This is a series of questions that I received from a member in our 
> Semantic Web Austin group. I can't seem to answer all of them 
> thoroughly. Hope we can get more answers on this and we can start 
> creating material for the FAQ I am proposing.
>
> ----------------------
>
> These are some things I'd like to see in an FAQ. Here are a few 
> questions from the
> point of view of a business person who is vaguely aware of LOD but not
> clear on its use:
>
> - My company has recently released an API for access to structured
> (database) data about 55 million companies and 35 million people. Do
> you think I should release this in an LOD format? How would my
> customers benefit?
Your customers will benefit from the granular access that Hyperdata 
oriented linking accords.

LOD is a community moniker for a collection of structured data sets 
that are navigable using Hyperdata [1] linking.

From a high-level perspective, Hyperdata linking is a Hypertext linking 
mechanism that connects entities to entities within and across their 
host containers (documents). It is distinguished from Hypertext linking 
by way of its entity level focus as opposed to container level focus.

Relationships between the people, places, and other things in your 
database will be accessible to structured querying by your customers, 
without the need for them to programmatically derive the same from the 
output of your Web Services.

Your API is a DBMS API; the DBMS in this case is where your value lies. 
Thus, the more granular the access to the data, the more valuable your 
service will be to your target audience.
>
> - Can you give a use case for mixing LOD with privately supplied data
> (from my company's own data sources or from user-generated content) to
> produce a useful application?
>
Yes, think about the demographic data in the public domain which is now 
in structured form courtesy of LOD. For instance, basic market research 
is enhanced if you can mesh internal data from your CRM and related 
systems with data about countries, competitors, etc.

The CIA Factbook, Freebase, Wikipedia, World Bank, etc. are examples of 
structured data sources that can be easily meshed with your internal 
data sources courtesy of Linked Data.
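
As a rough illustration of that kind of meshing, here is a minimal Python 
sketch. Everything in it is an assumption made up for the example: the 
example.org URIs, the CRM rows, the field names, and the population 
figures (which are not real statistics). The only point is that once 
internal records refer to countries by the same URIs that a public data 
set uses, combining the two becomes a simple lookup.

```python
# Hypothetical internal CRM rows. The "country" field holds a shared URI
# rather than a local code, so it can be matched against public data.
crm_accounts = [
    {"account": "Acme GmbH", "revenue_usd": 120000,
     "country": "http://example.org/country/DE"},
    {"account": "Globex SA", "revenue_usd": 80000,
     "country": "http://example.org/country/FR"},
    {"account": "Initech AG", "revenue_usd": 50000,
     "country": "http://example.org/country/DE"},
]

# Hypothetical public facts keyed by the same URIs.
# The numbers are illustrative only, not real statistics.
public_country_data = {
    "http://example.org/country/DE": {"name": "Germany", "population_millions": 100},
    "http://example.org/country/FR": {"name": "France", "population_millions": 50},
}


def revenue_per_million_people(accounts, countries):
    """Sum internal revenue per country URI, then divide by public population."""
    totals = {}
    for row in accounts:
        totals[row["country"]] = totals.get(row["country"], 0) + row["revenue_usd"]

    result = {}
    for uri, total in totals.items():
        facts = countries.get(uri)
        if facts is None:
            continue  # no public description available for this URI
        result[facts["name"]] = total / facts["population_millions"]
    return result


report = revenue_per_million_people(crm_accounts, public_country_data)
```

In a real deployment the public side would be fetched or queried rather 
than hard-coded; the join on shared identifiers is the part that carries 
over.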

> - What commercial applications are there that use LOD?
>
OpenLink Software provides a broad range of products that natively 
incorporate Linked Data. The range goes from a DBMS (Virtuoso) platform 
all the way up to a Distributed Collaborative Application Suite (ODS) 
with many other pieces in between.  Linked Data is fundamentally a 
feature of all OpenLink products, because when all is said and done 
"Linked Data" is a feature (re. Enterprise or Web solutions).

> - What are some of the major limitations of today's software that
> would be improved upon by using LOD?
Inaccessibility of data in granular form.

Data Access technologies have existed in various forms since the advent 
of the modern IT era. What hasn't been achieved until the emergence of 
the Linked Data meme is the ability to Name and Dereference structured 
data in a totally platform independent manner. You can name a record and 
dereference [2] a representation of the description of that record via 
HTTP; something that has simply not been achieved with such openness in 
any other computing realm until now.
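
To make "name and dereference" a little more concrete, here is a minimal 
Python sketch using only the standard library. The record URI 
(http://example.org/company/42) and the helper name 
build_dereference_request are hypothetical, and the choice of 
application/rdf+xml as the requested representation is an assumption; a 
given server may offer other formats. The sketch only builds the request 
and does not send it, since the example host serves no such record.

```python
import urllib.request


def build_dereference_request(uri, media_type="application/rdf+xml"):
    """Build (but do not send) an HTTP GET asking for a description of `uri`.

    The Accept header tells the server which representation of the
    record's description the client would like to receive.
    """
    return urllib.request.Request(uri, headers={"Accept": media_type}, method="GET")


# Hypothetical record name; any HTTP URI that identifies a record would do.
request = build_dereference_request("http://example.org/company/42")

# To actually dereference the name against a real server:
#     with urllib.request.urlopen(request) as response:
#         description = response.read()
```

The client needs nothing beyond the name itself and HTTP, which is the 
platform independence referred to above.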

>
> - How does LOD fit into the bigger Semantic Web picture?
LOD is a moniker for a community and its output.

Linked Data is the foundation of the Semantic Web vision. Without the 
granular access to structured data that it accords, the vision remains 
mercurial and somewhat incomprehensible.

> Write a
> separate sentence or two for each of the following terms that states
> how that topic relates to LOD:
> RDF
RDF is the framework for describing resources (data objects or entities) 
in structured form.
> OWL
Is the data dictionary language for Linked Data.

> Resource
An Object or Entity.
> URI
Object or Entity Identifier [3]

> Ontologies
Data Dictionary.
> Agent
Client that consumes a service on behalf of someone or something else.

> Service discovery

> Triples
How records are created in Entity-Attribute-Value oriented databases.

RDF is basically an implementation of Entity-Attribute-Value [4] DBMS 
technology.
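
The Entity-Attribute-Value reading of triples can be sketched in a few 
lines of Python. This is an illustration under assumptions, not how any 
particular RDF store works: the Triple type, the match and describe 
helpers, the example.org URIs, and the attribute names are all 
hypothetical.

```python
from collections import namedtuple

# One triple is one (entity, attribute, value) row.
Triple = namedtuple("Triple", ["entity", "attribute", "value"])

# Hypothetical data: a record is simply every row sharing the same entity.
store = [
    Triple("http://example.org/person/1", "name", "Alice"),
    Triple("http://example.org/person/1", "worksFor", "http://example.org/company/42"),
    Triple("http://example.org/company/42", "name", "Acme"),
]


def match(triples, entity=None, attribute=None, value=None):
    """Return the triples that agree with every field that is not None."""
    return [
        t for t in triples
        if (entity is None or t.entity == entity)
        and (attribute is None or t.attribute == attribute)
        and (value is None or t.value == value)
    ]


def describe(triples, entity):
    """Assemble the record for one entity as attribute -> list of values."""
    record = {}
    for t in match(triples, entity=entity):
        record.setdefault(t.attribute, []).append(t.value)
    return record


alice = describe(store, "http://example.org/person/1")
```

Note that the value of worksFor is itself an entity identifier, which is 
how one record links to another.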

>
> Example: "RDF is a standard format that can be used to publish LOD
> datasets."
>
> - Okay, I'd like to use LOD for a pilot of a commercial project. I'm
> going to include 1 million triples. What production-environment
> resources will I need to set up. What will my architecture include?
> Will there just be a giant RDF file or a big set of them? Will they
> just be front-ended by a web server? Will a database be needed?
>
You publish data in Linked Data form via a Linked Data aware / capable 
server, just as you serve up Documents on the Web via a Web Server, or 
serve up RDBMS records via an RDBMS server, etc.
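
One thing such a server typically does is return different 
representations of the same resource depending on what the client asks 
for. The Python sketch below shows that idea only; it is not the 
behaviour of any specific product. The respond function, the sample 
bodies, and the example.org URIs are hypothetical, and the negotiation is 
deliberately naive (it ignores q-values and wildcards).

```python
# Two hypothetical representations of one resource's description.
TURTLE_BODY = '<http://example.org/company/42> <http://example.org/vocab/name> "Acme" .\n'
HTML_BODY = "<html><body><h1>Acme</h1></body></html>\n"


def respond(accept_header):
    """Pick a (content_type, body) pair from the client's Accept header.

    Naive negotiation: if the client lists text/turtle at all, send the
    structured form; otherwise fall back to the HTML document.
    """
    accepted = [
        part.split(";")[0].strip().lower()
        for part in accept_header.split(",")
    ]
    if "text/turtle" in accepted:
        return ("text/turtle", TURTLE_BODY)
    return ("text/html", HTML_BODY)


content_type, body = respond("text/turtle")
```

A function like this can sit behind any ordinary web server; whether the 
data behind it lives in files or in a database is a separate choice.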

> - Can I build a proprietary closed source application that
> incorporates LOD? How would I combine free and fee-based data? I know
> how to do it with an API. How would I do it with linked data?
>
A proprietary application can consume Linked Data and do whatever it 
chooses subject to the licensing associated with the data corpus it is 
consuming.

To conclude, Linked Data is HTTP based Data Access by Reference.
Data Access by Reference has existed from day one in the computer realm. 
The only difference is that it has never existed at the Wide Area 
Network level in the manner delivered by HTTP via the Linked Data meme.


Links:

1. http://en.wikipedia.org/wiki/Hyperdata
2. http://en.wikipedia.org/wiki/Dereferencable_URI
3. http://en.wikipedia.org/wiki/Object_identifier
4. http://en.wikipedia.org/wiki/Entity-Attribute-Value_model
5. http://tinyurl.com/568y9g - My Linked Overview Presentation given at 
Linked Data Planet (remix of bits from my session & TimBL's)



Kingsley
> Juan Sequeda, Ph.D Student
> Dept. of Computer Sciences
> The University of Texas at Austin
> www.juansequeda.com <http://www.juansequeda.com>
> www.semanticwebaustin.org <http://www.semanticwebaustin.org>
>
>
> On Fri, Dec 5, 2008 at 10:32 PM, Juan Sequeda <juanfederico@gmail.com 
> <mailto:juanfederico@gmail.com>> wrote:
>
>     Giovanni
>
>     Great answers! I really hope other people will start commenting on
>     these questions, giving answers, or making more answers.
>
>     You are right, maybe education is not the right word. However I do
>     think we need to do outreach. With respect to the brainstorming
>     that you suggest; I think this is what we are doing now. I truly
>     believe that this community should brainstorm more about how we
>     should do the outreach :)
>
>     I propose to create a FAQ about Linked Data (hopefully on the
>     official linked data web site). But to do so, we need the
>     frequently asked questions! Hopefully we can start putting this
>     together.
>
>     Juan Sequeda, Ph.D Student
>     Dept. of Computer Sciences
>     The University of Texas at Austin
>     www.juansequeda.com <http://www.juansequeda.com>
>     www.semanticwebaustin.org <http://www.semanticwebaustin.org>
>
>
>     On Fri, Dec 5, 2008 at 6:01 PM, Giovanni Tummarello
>     <giovanni.tummarello@deri.org
>     <mailto:giovanni.tummarello@deri.org>> wrote:
>
>         i agree on all your comments and believe me by talking to
>         actual web
>         2.0 people you're way ahead.
>         i'll try to answer some of your questions
>
>         > I then asked if they knew the value of Linked Data. The
>         answer I got was
>         > "well, i would think that my site would be easier to find
>         right? i mean, i
>         > would link stuff on my site better"
>
>         lets see the key point here:
>
>         * There is a site
>         * There are human visitors as HUMANS Bring money/business not
>         machines
>         * There is a perception that metadata can help to find things
>         better.
>
>         >
>         > Question 2: Would my site be easier to find then using
>         Linked Data?
>
>         Answer: no, matter of fact you open your data to being used
>         without
>         getting any visitors.
>
>         >
>         > Question 3: So are microformats in my pages doing Linked Data?
>
>         they are not doing "linked data" but in practice they do answer the
>         questions above or practically go well in that direction. see
>         next one
>
>         >
>         > Question 4: By what method are these things linked?
>         >
>
>
>         2 pages have the same vcards = you can link them. They have a
>         me link
>         = they are linked they have 2 events on the same date, same city =
>         they should be grouped they might be interesting to show
>         together to
>         the user.
>
>         They are in practice linked by simple, practical use cases which
>         involve finding/related pages (real sites which want to get
>         traffic)
>         for users (real people who want to get pages)
>
>
>         > After explaining somebody what linked data was, and giving
>         them the existing
>         > links about it, question 5 came up:
>         > Question 5: "I see value in the data and the data being
>         linked together but
>         > i don't see practically how i would use it"
>
>         big technical barrier in using it with the Lod model.
>
>         on the other hand querying Freebase is infinitely simpler
>         solving :
>
>         * the access problem . a single language accesses all the datasets
>         they have integrated, no hopping around, very fast
>         * the data homogeneity and quality problem, they care about the
>         dataset
>         and import only clean stuff
>         * identifier homogeneity, big efforts are made to smush things
>         together
>         * Ontology issues: both a clear taxonomy is defined AND all the
>         sources that are integrated are harmonized to it.
>         * the multiple points of failure problem
>
>         So since i believe querying large datasets of structured,
>         matched data
>         is in fact very useful once one gets a slightly bit creative
>         i think
>         they'll have success. Could i buy some of their shares i would
>         do it
>         :).
>
>         I dont think its a coincidence that some of the smartest
>         people who
>         worked on semantic web now work for them.  (but of course there is
>         much more than a good idea for a successful business so they
>         might go
>         bust anyway obviously)
>
>         >
>         > A final quote "people like me don't a) know about this and
>         b) don't
>         > understand how to use it once they do? I would say some
>         additional education
>         > is necessary to make this understood... i would also say
>         that in a broader
>         > sense the semantic web message has gotten lost under a mass
>         of acronyms and
>         > theory"
>
>         for a more articulate attempt at an explanation of what happened i
>         agree a lot with this post
>          http://inamidst.com/whits/2008/technobunkum by Sean Palmer
>
>         I dont think "more education" is needed Juan, one really
>         should teach
>         something if .. the answer is known else its called
>         brainstorming or
>         handwaving (according to whether you're in good faith or not)
>
>         note that this is all but a bashing on the power of handling
>         loosely
>         structured data and RDF. I think on the other hand RDFa will
>         triumph
>         and so people will be probably making their own little
>         vocabularies..
>         but starting from the web 2.0 approach and practical "how do i
>         bring
>         visitors, how to do simple site to site integration" use cases.
>
>         Giovanni
>
>
>


-- 


Regards,

Kingsley Idehen	      Weblog: http://www.openlinksw.com/blog/~kidehen
President & CEO 
OpenLink Software     Web: http://www.openlinksw.com
Received on Saturday, 6 December 2008 16:49:48 UTC