Re: A VoCamp Galway 2008 success story

* On Dec 1, 2008, at 09:31 AM, Richard Cyganiak wrote:
> We chose the current model (ds1 -> containsLinks -> ls -> target ->
> ds2) because we want to record which dataset contains the links. We
> have some use cases that require this. Your proposal (ds1 <- target
> <- ls -> target -> ds2) doesn't capture that bit of information.

It seems to me that François' proposal *does* capture that bit of
information, because the data set which contains the links is distinct
from both ds1 and ds2 -- it is an invisible and enclosing *ds3*.


> Note that *all* the links in the LOD cloud are published as part of
> one of the datasets.

This seems to me to be an error of early practice.

Consider -- I have a data set, ds1, and I *think* that my entities
are owl:sameAs entities in *your* data set, ds2.  So I create a lot
of owl:sameAs triples.  But I'm wrong.

How do you easily and cheaply exclude those triples from your queries,
when the *rest* of the data in my data set is valid and useful?

Consider the next step in the sequence -- *you* have a bunch of
owl:sameAs triples in *your* data set, pointing to entities in ds3.
*Your* owl:sameAs statements are correct -- but now ds1 entities are
incorrectly inferred to be owl:sameAs ds3 entities.  And so on.
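
To make the propagation concrete, here is a minimal Python sketch of how a reasoner that treats owl:sameAs as symmetric and transitive would merge entities. The identifiers (ds1:Paris and so on) are hypothetical, invented purely for illustration, and the union-find closure is only one simple way to model the inference, not a description of any particular reasoner.

```python
from collections import defaultdict

def same_as_closure(links):
    """Group identifiers into equivalence classes under owl:sameAs.

    `links` is an iterable of (subject, object) pairs. sameAs is treated
    as symmetric and transitive, so linked identifiers end up in one class.
    """
    parent = {}

    def find(x):
        # Return the representative of x's class, registering x if new.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for s, o in links:
        parent[find(s)] = find(o)

    classes = defaultdict(set)
    for node in list(parent):
        classes[find(node)].add(node)
    return list(classes.values())

# Hypothetical identifiers, for illustration only.
my_links = [("ds1:Paris", "ds2:Paris_Texas")]        # my mistaken assertion
your_links = [("ds2:Paris_Texas", "ds3:Paris_TX")]   # your correct assertion

merged = same_as_closure(my_links + your_links)
```

With both sets of links loaded, `merged` holds a single class containing the ds1, ds2, and ds3 identifiers, even though nobody ever asserted a ds1-to-ds3 link.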

This is just as troublesome -- if not more so -- in ontology mapping
as in instance data mapping.

It seems clear to me that interlink data sets (or "link sets" in what
is becoming common parlance) should be entirely distinct from instance
data sets (or "data sets" in now-common parlance).
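
As a rough illustration of what that separation buys, the Python sketch below keeps each data set and each link set as its own named collection of triples. The names ("ls-ted") and identifiers are assumptions made up for the example, and plain tuples stand in for whatever store would really hold the data.

```python
# Each data set and link set is a named collection of (s, p, o) triples.
# All names and identifiers here are hypothetical.
SAME_AS = "owl:sameAs"

graphs = {
    "ds1": {("ds1:Paris", "rdfs:label", "Paris")},
    "ds2": {("ds2:Paris_Texas", "rdfs:label", "Paris, Texas")},
    "ls-ted": {("ds1:Paris", SAME_AS, "ds2:Paris_Texas")},  # suspect links
}

def triples_from(graphs, include):
    """Union of the triples in the named graphs the caller chose to include."""
    result = set()
    for name in include:
        result |= graphs[name]
    return result

# Use the instance data from both data sets, but leave the suspect link set out.
trusted = triples_from(graphs, ["ds1", "ds2"])
```

Excluding the suspect links is then just a matter of not naming "ls-ted". Had the same triples been published inside ds1, the only cheap filter would be on the predicate, which would also discard any correct owl:sameAs statements in ds1.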


> I'm also not sure if there is a clear understanding about how to
> publish linksets independently from the datasets on the Web. I don't
> see it being done in practice.

Surprisingly enough, we're still in early days of doing such things --
and the lack of implementation is not an argument in either direction
about the validity of such practice.

How do you publish a link set independently of a data set?

You create a new data set, which consists entirely of link
statements.  Best case, a distinct link set would be created for each
ds1-to-ds2 pairing, but it might be sufficient to create
ted's-ds1-to-DBpediaLOD (which could then be ignored when/if a more
accurate joe's-ds1-to-DBpediaLOD is released).


> François, can you point us to some examples of linksets that are
> published independently from any of the linked datasets?

There may well be none, at this moment.  However, I say again, that
is not evidence of whether there *should* be any.


> Also, can you present us with your use case that requires exchanging
> descriptions of such linksets? If there is enough interest, we will
> consider a modelling that can be used for both scenarios.


See above.  More details below...

I publish my link set (ds3, also known as ls1) today, based on my
incorrect understanding of ds2's entities relative to ds1's entities.

Tomorrow, I get hit by a bus, and cannot change my link set based on
the explanations sent to me by both data set creators which would have
corrected my understanding.

Next week, someone publishes a new link set (ds4 or ls2), with correct
linkages (say, rather than owl:sameAs, valid rdfs:subPropertyOf).

When someone wants to work with these two data sets (ds1 and ds2),
how do they know which link set (ds3 or ds4) is more valid?

One hopes, voiD allows description of the new link set, which can say
"ds4 was created after ds3, to correct incorrect assertions made in
ds3" or similar.

Now ... the creator of ds4 might have their own misunderstandings.
They might be creating a link set without consultation with either ds1
or ds2 creators -- and even without knowing about ds3.  Perhaps ds3 *is*
correct, and it's the newer ds4 which is incorrect.

Perhaps I know the creators of ds3 and ds4, and I know that the latter
tends to go off half-cocked, while the former carefully researches and
considers what they publish.  Perhaps I want to trust ds3 -- without
regard for whatever anyone else may say about it -- and disregard ds4.

Does voiD allow for this?

Apparently not if the links which make up ds3 or ds4 are included in
ds1 or ds2 -- and therein lies a problem.
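
For what it's worth, the kind of choice I have in mind might look like the Python sketch below. The record fields (creator, created, corrects), the publisher names, and the dates are all hypothetical; they are not voiD terms, only a stand-in for whatever description a vocabulary might offer once link sets are separately identifiable.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

@dataclass(frozen=True)
class LinkSetDescription:
    """Hypothetical description of a separately published link set."""
    name: str
    creator: str
    created: date
    corrects: Optional[str] = None  # name of the link set this one claims to fix

def choose_link_sets(descriptions, trusted_creators):
    """Keep only link sets from creators the consumer trusts.

    Creation dates and 'corrects' claims are deliberately ignored:
    the consumer's own trust policy wins over what publishers say.
    """
    return [d.name for d in descriptions if d.creator in trusted_creators]

descriptions = [
    LinkSetDescription("ds3", "careful-publisher", date(2008, 12, 3)),
    LinkSetDescription("ds4", "hasty-publisher", date(2008, 12, 10), corrects="ds3"),
]

chosen = choose_link_sets(descriptions, {"careful-publisher"})
```

Here `chosen` contains only ds3, despite ds4 being newer and claiming to correct it. None of this is possible if the links cannot be told apart from the data sets that carry them.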

Be seeing you,

Ted



-- 
A: Yes.                      http://www.guckes.net/faq/attribution.html
| Q: Are you sure?
| | A: Because it reverses the logical flow of conversation.
| | | Q: Why is top posting frowned upon?

Ted Thibodeau, Jr.           //               voice +1-781-273-0900 x32
Evangelism & Support         //        mailto:tthibodeau@openlinksw.com
OpenLink Software, Inc.      //              http://www.openlinksw.com/
                                  http://www.openlinksw.com/weblogs/uda/
OpenLink Blogs              http://www.openlinksw.com/weblogs/virtuoso/
                                http://www.openlinksw.com/blog/~kidehen/
     Universal Data Access and Virtual Database Technology Providers

Received on Wednesday, 3 December 2008 15:00:08 UTC