Re: The Data Portal use cases (was: Shapes vs Classes (in LDOM)) from Jose Emilio Labra Gayo on 2015-01-26 (public-data-shapes-wg@w3.org from January 2015)

From: Jose Emilio Labra Gayo <jelabra@gmail.com>
Date: Mon, 26 Jan 2015 09:32:50 +0100
To: Holger Knublauch <holger@topquadrant.com>
Cc: RDF Data Shapes Working Group <public-data-shapes-wg@w3.org>
Message-ID: <CAJadXXJyDLNhmD5sw70ps5PvCBBrsq31VkTR4tG7vJps7--_tQ@mail.gmail.com>
>
> In that case, I did the data model and I added rdf types, but there may
> be plenty of situations and other models where one would not need to add
> those rdf:type declarations and it would not mean that they are not right.
>
>
> This did not strengthen the case for separating classes and shapes yet.
> Your example uses classes too, and looking at your shapes, they all start
> by checking for the presence of an rdf:type triple. So the shapes basically
> just mirror the classes.
>
> Can you provide an example that illustrates your point better?
>
As I said, in those examples I did the modeling and I tried to add
rdf:type to all the data that was published in the portal. But I think that
example illustrates the point, because if you look at the shape of
observations in the WebIndex and the shape of observations in the
LandPortal, they look similar but contain differences. So there are two
different concepts (the shape of observations in WebIndex and the shape of
observations in LandPortal). Although in this case they are both related to
the type qb:Observation, there could be other cases where one would not have
such a class at hand and could just omit the rdf:type declaration, or
where a node could even have more than one rdf:type.

You can say that this is hypothetical, but that can occur in practice and
we should handle it.

When you look at data portals you will find lots of data that doesn't have
an rdf:type, or that has more than one rdf:type.

What I was trying to illustrate is that forcing the constraints to be
always attached to a particular class is too narrow and we should have
other mechanisms to associate the constraints with the nodes that we want
to validate.
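
A minimal sketch of that idea, in plain Python rather than any shapes
language. The sample triples, the shape names, the ex: prefix and the
assignment of ces:ref-year and lb:time to a particular portal are all
invented for illustration; only the two property names come from the thread.
Nodes are paired with shapes explicitly, so a node with no rdf:type can still
be validated.

```python
# Triples as (subject, predicate, object) tuples; all data here is invented.
RDF_TYPE = "rdf:type"

triples = [
    ("ex:obs1", RDF_TYPE, "qb:Observation"),
    ("ex:obs1", "ces:ref-year", "2013"),
    ("ex:obs1", "ex:value", "42"),
    # ex:obs2 deliberately carries no rdf:type triple at all.
    ("ex:obs2", "lb:time", "2012"),
    ("ex:obs2", "ex:value", "17"),
]

# A "shape" in this sketch is just the set of properties a node must have.
# Which portal uses which time property is an assumption.
shapes = {
    "WebIndexObservation": {"ces:ref-year", "ex:value"},
    "LandPortalObservation": {"lb:time", "ex:value"},
}


def properties_of(node, triples):
    """Return the set of predicates used with `node` as subject."""
    return {p for s, p, _ in triples if s == node}


def conforms(node, shape_name, triples):
    """True if `node` has every property required by the named shape."""
    return shapes[shape_name] <= properties_of(node, triples)


# The node-to-shape association is stated explicitly instead of being
# derived from rdf:type.
to_validate = {
    "ex:obs1": "WebIndexObservation",
    "ex:obs2": "LandPortalObservation",
}


def validate_all(assignments, triples):
    """Check every (node, shape) pair and report the result per node."""
    return {node: conforms(node, shape, triples)
            for node, shape in assignments.items()}
```

Here ex:obs2 can be checked against its shape although it has no rdf:type,
and ex:obs1 is checked against the WebIndex shape only, even though both
nodes would plausibly be instances of qb:Observation.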

> Notice that although the shapes of qb:Observation in both portals look
> quite similar (the data model is in fact very similar and was made by the
> same person), in practice that need not be the case. For
> example, the property used to associate time to observations varied from
> one portal to the other.
>
> This again sounds rather abstract and hypothetical. Can you provide
> specific definitions and/or sample data? A relevant slide seems to be slide
> 19 on http://www.slideshare.net/jelabra/linked-dataquality-2014 which
> shows two uses of the Observation class. One is using a property
> ces:ref-year, the other is using lb:time. Why would it be a problem to
> attach both to the same class Observation? To keep them separate, you
> could save their ldom:property declarations in different files/graphs for
> each portal. Those files would import the base schema. Do you have examples
> for conflicting definitions that would cause issues if those graphs were
> merged? Also, why can't you just create two subclasses of Observation, for
> each of the two portals?
>


Yes, this is hypothetical. The use case was inspired by this story, but
once we published the data portals, our work was done. However, one of the
main reasons to publish a data portal is that the data can be easily
reused and consumed by third parties.

And I think that is one of the missing pieces of linked data in general:
it is not easy for third parties to know what data is available behind
data portals and SPARQL endpoints. One motivating scenario, and probably
another user story, is when a third party wants to consume data from other
data portals.

In this case, the fact that both portals contain statistical data from a
very similar domain following the same RDF Data Cube vocabulary could
easily lead some company to create a visualization tool aggregating the
data from those data portals. That tool could take the shape descriptions
and adapt its behavior according to them.
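
One way such a tool might adapt is sketched below, again in plain Python.
The per-portal descriptions, the "time_property" and "value_property" keys,
the ex:value property and the sample observations are hypothetical; they
stand in for whatever a real shape description would declare.

```python
# Hypothetical shape descriptions: each one names the property that carries
# the time and the value of an observation in that portal.
SHAPES = {
    "webindex": {"time_property": "ces:ref-year", "value_property": "ex:value"},
    "landportal": {"time_property": "lb:time", "value_property": "ex:value"},
}


def normalize(portal, observation):
    """Map one portal-specific observation onto a common record layout."""
    shape = SHAPES[portal]
    return {
        "portal": portal,
        "time": observation[shape["time_property"]],
        "value": observation[shape["value_property"]],
    }


def aggregate(sources):
    """Merge observations from several portals, ordered by time then portal."""
    rows = [
        normalize(portal, obs)
        for portal, observations in sources.items()
        for obs in observations
    ]
    return sorted(rows, key=lambda row: (row["time"], row["portal"]))
```

The aggregator never looks at rdf:type; it reads the time of each
observation from whichever property the portal's description names.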

I proposed this as a user story because it covers an important aspect of
linked data applications and is inspired by a real example. You may say
that it is hypothetical, but I think user stories are hypothetical by
nature. They can be inspired by practical examples, like this one, but
they describe some hypothetical scenario, which this one does, and it also
poses some challenges that I think will benefit the WG.

Best regards, Jose Labra

>
> Thanks
> Holger
>
>
>
> [1] Validating and Describing Linked Data Portals using RDF Shape
> Expressions, Jose Emilio Labra Gayo, Eric Prud'hommeaux, Harold Solbrig,
> 1st Workshop on Linked Data Quality, Sept. 2014, Leipzig, Germany
> PDF: http://labra.github.io/ShExcala/papers/ldq2014.pdf
> Slides: http://www.slideshare.net/jelabra/linked-dataquality-2014
>
> [2] http://weso.github.io/wiDoc/
> [3] http://weso.github.io/landportalDoc/data/
>
>
>


-- 
Saludos, Labra
Received on Monday, 26 January 2015 08:33:38 UTC