
Re: How to use LOD for practical purposes?

From: Christopher Gutteridge <totl@soton.ac.uk>
Date: Mon, 19 Nov 2018 11:05:48 +0000
To: Laura Morales <lauretas@mail.com>, "public-lod@w3.org" <public-lod@w3.org>
Message-ID: <7239619d-471f-cae8-fd97-e36d6e44732c@soton.ac.uk>

I find RDF a reasonably useful tool in my toolbox, like JSON, XML etc.

I like the idea of reusing URIs so that two RDF datasets can be "mashed 
up" with no (well, less) effort.
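
A minimal sketch of what URI reuse buys you, assuming two tiny made-up datasets held as Python sets of (subject, predicate, object) tuples. The example.org URIs and the data are hypothetical, and blank nodes are ignored:

```python
from collections import defaultdict

# Hypothetical URI that both publishers chose to reuse for the same university.
UNI = "http://example.org/id/university"

# Dataset A (made up): knows the label.
dataset_a = {
    (UNI, "http://www.w3.org/2000/01/rdf-schema#label", "Example University"),
}

# Dataset B (made up): knows the homepage.
dataset_b = {
    (UNI, "http://xmlns.com/foaf/0.1/homepage", "http://example.org/"),
}

# An RDF graph is a set of triples, so (blank nodes aside) merging is set union.
merged = dataset_a | dataset_b


def describe(graph, subject):
    """Group every (predicate, object) pair known about one subject."""
    description = defaultdict(set)
    for s, p, o in graph:
        if s == subject:
            description[p].add(o)
    return dict(description)


if __name__ == "__main__":
    for predicate, objects in sorted(describe(merged, UNI).items()):
        print(predicate, sorted(objects))
```

Because both sources use the same subject URI, the union needs no mapping step. Had they minted different URIs for the university, the two descriptions would stay separate until someone asserted that they refer to the same thing.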

I see a problem, for several reasons, with the idea that a document 
retrieved via a URI is identified by that URI, and would prefer that the 
URI for a document returned on the web was given in the HTTP response 
header, or in the document itself.
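
One way a client could act on that preference, sketched under an assumption: the server names the returned document in the standard `Content-Location` response header. The message does not say which header it has in mind, so that choice, the function name `document_uri` and the example.org URIs are all illustrative:

```python
from urllib.parse import urljoin


def document_uri(requested_uri, response_headers):
    """Pick the URI that names the returned document.

    Assumption: the server states it in Content-Location. If it does not,
    fall back to the requested URI.
    """
    # HTTP header names are case-insensitive, so normalise before lookup.
    headers = {name.lower(): value for name, value in response_headers.items()}
    location = headers.get("content-location")
    if location is None:
        return requested_uri
    # Content-Location may be relative; resolve it against the request URI.
    return urljoin(requested_uri, location)


if __name__ == "__main__":
    print(document_uri(
        "http://example.org/id/library",
        {"Content-Location": "/doc/library.ttl"},
    ))
```

With this, the URI that was requested (the thing) and the URI of the document that came back no longer have to be the same string.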

I find the idea of resolving a URI to get more information about it has 
largely failed. There's no "contract" about what is to be returned, so 
you have to make the request to see what you get. This is not a good API 
for automated systems. You might get something useful. You might get 
100MB of crud. It's handy for humans who can cope with whatever they 
get. I've tried to address this in our OPD system by saying that the 
document that describes an organisation's profile may contain anything, 
but certain things should be described in a specific way. I think that's 
a form of "application profile" which states a restricted way to use a 
vocabulary to make it possible to generate, validate and consume by 
automated systems. <http://opd.data.ac.uk/>
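
A minimal sketch of how a consumer might check a document against such a profile. The required predicates below are hypothetical and are not the actual OPD rules; the point is only that extra content is tolerated while the agreed parts are checked:

```python
# Hypothetical profile: the real OPD requirements are NOT reproduced here.
REQUIRED_PREDICATES = {
    "http://xmlns.com/foaf/0.1/name",
    "http://xmlns.com/foaf/0.1/homepage",
}

# Hypothetical organisation URI and profile document, as (s, p, o) tuples.
ORG = "http://example.org/id/org"
profile_doc = [
    (ORG, "http://xmlns.com/foaf/0.1/name", "Example University"),
    # Extra, unanticipated content is allowed and simply ignored.
    (ORG, "http://example.org/vocab/mascot", "A heron"),
]


def missing_predicates(triples, subject, required=REQUIRED_PREDICATES):
    """Return the required predicates that the document lacks for `subject`."""
    present = {p for s, p, _ in triples if s == subject}
    return sorted(required - present)


if __name__ == "__main__":
    print(missing_predicates(profile_doc, ORG))
```

A consumer that gets back an empty list knows exactly where to find the fields it relies on, which is the "contract" that plain URI resolution lacks.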

Finally, mashing up datasets from different sources will encounter many 
issues around assumptions. Statements can be contradictory, but still 
true, because their contexts differ. What matters to one dataset doesn't 
matter to another. E.g. the location of my University can be shown as a 
single lat/long of our main postal & admin hub, but it's not the *whole* 
truth, and if you wanted to find all university-owned property in the UK 
then that's useless and misleading. Also, did you mean "owned" or 
"occupied"? Because we rent some buildings... but call them "our" 
buildings, etc. People also conflate meanings. The university Library 
is: a building, an organisation, a point-of-service and a collection of 
media. All these are true, but these are disjoint things: an organisation 
is not a collection of media! I feel that the Marvel movies are a good 
example of what happens when you try to make too many logically 
inconsistent things exist in one place (I don't like comic book 
crossovers!). See also a blog post I wrote on this issue 
<https://blog.soton.ac.uk/webteam/2010/09/02/the-modeler/>
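
The Library conflation can be detected mechanically if the classes are declared disjoint. A rough sketch, assuming hypothetical class URIs and an assumed disjointness table (neither comes from any real vocabulary):

```python
from itertools import combinations

# Hypothetical class URIs; a real dataset would use its own vocabulary.
BUILDING = "http://example.org/vocab/Building"
ORGANISATION = "http://example.org/vocab/Organisation"
COLLECTION = "http://example.org/vocab/Collection"

# Pairs of classes that no single thing may belong to at once (assumed).
DISJOINT = {
    frozenset({BUILDING, ORGANISATION}),
    frozenset({BUILDING, COLLECTION}),
    frozenset({ORGANISATION, COLLECTION}),
}

# One URI used for two different things, versus one URI per thing.
conflated = [
    ("http://example.org/id/library", BUILDING),
    ("http://example.org/id/library", ORGANISATION),
]
separated = [
    ("http://example.org/id/library-building", BUILDING),
    ("http://example.org/id/library-org", ORGANISATION),
]


def conflated_subjects(type_statements):
    """Map each subject to the disjoint class pairs it was typed with."""
    types_by_subject = {}
    for subject, cls in type_statements:
        types_by_subject.setdefault(subject, set()).add(cls)

    conflicts = {}
    for subject, classes in types_by_subject.items():
        clashes = [
            pair for pair in combinations(sorted(classes), 2)
            if frozenset(pair) in DISJOINT
        ]
        if clashes:
            conflicts[subject] = clashes
    return conflicts


if __name__ == "__main__":
    print(conflated_subjects(conflated))
    print(conflated_subjects(separated))
```

This only catches conflation within the merged data; it says nothing about whether two sources meant "owned" or "occupied", which still needs a human to read the context.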


On 19/11/2018 10:28, Laura Morales wrote:
> As a newcomer to LOD, I find using LOD very very very confusing and impractical. To be clear, I think that by now I understand the model pretty well; the theory behind it. My problem is that most *practical* uses of LOD have been a really bad experience, in particular when linking different data sources coming from different places. It's pretty easy to reason about a single graph, that is a consistent graph that was built in one piece, since everywhere around the graph the structure is usually fairly constant and predictable. But when I want to get information from two or more linked graphs... oh boy... they can be using different types, different ontologies for the same thing (or even custom ones!), different ways of linking, different conventions, ... As a human, I can more or less navigate through the graphs: I start somewhere and I follow the links, and I find something that makes sense to me. But I can't see a computer doing this kind of work; it's a hard problem that has got to require some kind of intelligence.
> So my question really is: what am I, the user, supposed to do to get information out of linked graphs? I download 2, 3, or 4 graphs from various sources, then what? Am I supposed to make my own graph by inferring/reasoning/extracting data from those sources in order to make them more reasonable? In a perfect world all those graphs would be all perfectly linked and plug-and-play but this clearly is not how people publish their data. Or is there a magical way to make all this information *practical* to use, something that a computer can use without requiring an ultimate AI?
>

-- 
Christopher Gutteridge <totl@soton.ac.uk>
You should read our team blog at http://blog.soton.ac.uk/webteam/



Received on Monday, 19 November 2018 11:06:12 UTC
