Re: In RDF what is the best practice to represent data provenance (source)?

Michael,

I will just address one point; I think my answers to the questions at  
the end of your post are clear.

You say that there are indeed two different kinds of things in a
domain: entity-like things and relationship-like things. You cite the
example of natural numbers, which clearly are entity-like.

You are right in the case of mathematics, a domain that comes nicely  
pre-packaged in elements and sets. But many domains are not like  
this. Take, for example, the concept of a "married couple". Is this a  
relationship-like thing that should be modelled as an RDF triple  
connecting two resources? Or perhaps as a distinct resource that has  
connections to the two person resources? Or perhaps as a two-element  
class? The answer is that all these options can be reasonable; which one
fits depends on the concrete use case. Thus my claim that entity-likeness
or relationship-likeness is an artifact of modelling and not inherent  
to the world.
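
As a rough sketch of these three options in Turtle (the ex: namespace
and every name in it are made up for illustration, and each option
could certainly be written in other ways):

```turtle
# All ex: terms below are hypothetical, invented for this example.
@prefix ex:  <http://example.org/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .

# Option 1: the marriage as a single triple connecting two resources.
ex:alice ex:marriedTo ex:bob .

# Option 2: the couple as a distinct resource with connections
# to the two person resources.
ex:couple1 rdf:type ex:MarriedCouple ;
    ex:partner ex:alice , ex:bob .

# Option 3: the couple as a class whose only two members are the persons.
ex:alice rdf:type ex:AliceAndBob .
ex:bob   rdf:type ex:AliceAndBob .
```

Only option 2 gives you a node to hang further statements on (a wedding
date, say), which is why the use case decides between them.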

Yours,
Richard


On 22 Jan 2007, at 15:10, Michael Schneider wrote:

> Richard Cyganiak wrote on Sun, 21 Jan 2007:
>
>> Michael,
>> You say there is a distinction between "atomic resources" in a
>> domain, and relationships between them. Such a distinction is
>> artificial. The "atomic resources" in reality are quite literally
>> not atomic, and if you squint the right way, any relationship can
>> be seen as a resource of its own. The distinction is just an
>> artifact of your modelling.
>
> I am not sure whether I correctly understand what you mean by
> "artificial" here, so please correct me if I miss the point.
>
> I have used the term "resource" here in the same sense as it is used in
> the RDF Semantics spec, which calls everything a "resource" that
> "exists" within the domain currently under consideration: things, sets,
> relations, relationships, etc. So, in this regard, any relationship can,
> of course, be seen as a resource, simply by definition.
>
> This does not mean, however, that a clear categorical distinction
> between things, relationships, classes, etc. does not exist. Such
> a distinction is an inherent property of the respective domain by
> which I interpret my RDF graph. Say I have the following RDF graph
>
>   G := { nat:three rdf:type nat:PrimeNumbers . }
>
> and use the natural numbers as the interpreting domain. If 'nat:three'
> denotes the natural number "3", and 'nat:PrimeNumbers' denotes the set
> of prime numbers, I use a meaningful interpretation of G (based on the
> semantics of the URI 'rdf:type', which is defined by the RDF spec). But
> if 'nat:three' denotes the primes, and 'nat:PrimeNumbers' denotes "3",
> then the interpretation of this graph becomes /meaningless/!
>
> This is so because there is a distinction in the domain of natural
> numbers between thing-like resources and set-like resources. It is
> /not me/ who introduces such a categorical distinction
> when making assertions about some domain. The best I can do is to
> capture and exploit those distinctions by the means that the language I
> use (RDF here) provides. The more means my language provides (the
> more expressive it is), the more accurately I am able to model aspects
> of the given domain.
>
>> So, I agree with ChrisR: If you feel the need to make statements
>> about relationships, then maybe the modelling is not adequate to
>> your use case, and the relationship ought to be turned into a
>> resource of its own.
>
> The relationship already /is/ a resource of its own. But I want the
> language I use, RDF, to provide me with some means to talk about a
> relationship not just as a /general/ resource, but more specifically as
> a /relationship-like/ resource, so that I can refer to its further
> structural aspects (its subject, predicate and object)
> whenever I want.
>
> The proposal of ChrisR is probably the best one available in
> the current situation: If I know that I need to make assertions about
> some relationships, I create a dedicated class R, which is meant to
> contain all those relationships that I am interested in. This works
> as long as I remember that the semantics of such a construct is up to
> me, or up to my application, which processes RDF containing such
> constructs.
>
> Then it all comes down to the question of whether we always want to
> create our specialized treatment of relationships on a case-by-case
> basis, or whether we want to have a general, reusable way to talk about
> such relationships (perhaps supported by reasoners and tools). I would
> opt for the latter.
>
> To illustrate the problem, let's swap the roles and assume for a moment
> that the following strange situation holds: There is no explicit support
> for /classes/ within RDF: no special vocabulary like 'rdf:type', no
> special syntactic constructs, and, most importantly, no special
> semantics. Let's forget about RDFS and OWL for now; we just consider
> basic RDF here. Lacking all this would have the following immediate
> consequences:
>
>   * no RDF containers (rdf:Bag, etc.)
>   * no RDF reification
>
> What would remain is the ability to create all kinds of triple sets,
> where the subject of each triple would always denote some resource,
> the predicate would denote some relation or attribute, and the object
> would denote some resource or data value. So, not much is lost from a
> pure RDF point of view!
>
> Now, one day, I, Michael, read a post by you, Richard, where you
> complain that you do not just want to describe resources by their
> attributes and by their relationships to other resources, but would
> also like to make assertions about what classes a given resource
> belongs to. "For example", you ask, "how can I express that the
> natural number '3' is an instance of the set of all prime numbers?"
>
> I think a little about this point, and then I answer: "Well, because
> numbers and sets of numbers are both uniformly regarded as resources
> in RDF, you can just put the following triple in your RDF graph:
>
>   nat:three :instanceOfNaturalNumberSet :PrimeNumbers .
>
> There will be no problem finding a meaningful interpretation for this
> statement (one which also makes it true). You, and everyone else who is
> going to use this RDF graph, just have to always remember how the
> property ':instanceOfNaturalNumberSet' is meant to be interpreted: that
> its subject always has to denote some natural number, and that its
> object always has to denote some set of numbers."
>
> Now, how would you feel about such a reply? Is there a formal error in
> my argumentation? I do not see any. Nevertheless, this proposal would
> probably sound pretty peculiar, I suppose. Your answer would certainly
> be: why does RDF not just directly support this categorical distinction
> between classes and instances, so that everyone can reuse it, instead of
> creating custom properties on a case-by-case basis? Your reasoning
> would be:
>
>   1) this is a very fundamental distinction
>
>   2) there are tons of use cases where this distinction is needed
>
> But I would, of course, answer that this would make RDF more complex,
> would probably lead to unforeseeable problems, and so on... ;-)
>
> A similar discussion could also be had for
>
>   * n-ary relationships (currently no direct support,
>     so we need a convention like the one given in the n-ary BP note)
>
>   * Named Graphs
>
> For representing the content of an RDF graph G, for example, why not
> just take some dedicated class C_G (now we have classes back again in
> RDF!), and interpret all its instances as the triples contained in G?
> For this, we would of course have to assume in each case that triples
> and graphs are part of the respective interpreting domain. The "Named
> Graph" proposal just makes this assumption explicit by providing an
> extension of RDF semantics: The syntactic elements are now always part
> of the interpreting domain (the name of a named graph just denotes
> that named graph)! One could argue that this would not be such a
> big win... But you would probably oppose such an argument, and so
> would I.
>
>> Some related advice is found in [1].
>> [1] http://www.w3.org/TR/2004/WD-swbp-n-aryRelations-20040721/
>
> (Current version is: http://www.w3.org/TR/swbp-n-aryRelations/)
>
> You probably mean this "General Issue":
>
>   Issue 1: If property instances can link only two individuals,
>   how do we deal with cases where we need to describe the
>   instances of relations, such as its certainty, strength, etc?
>
> I will not go any deeper into this, because my point of view is
> hopefully clear now: I would prefer to see direct support in RDF
> for making assertions about relationships, rather than using any
> conventions about how to interpret specially formed RDF expressions.
>
> So I would like to ask the community the following questions:
>
>    * Do you think that we need direct support in RDF for
>      making assertions about relationships, in the form of
>      extra vocabulary, language constructs and semantics?
>      Or do you think that for the Semantic Web it will suffice
>      to define specifically interpreted relationship classes
>      on a case-by-case basis?
>
>    * Are there any serious general problems with this whole
>      idea of direct language support for referencing relationships?
>      (as a comparison: I do not easily see how one could directly
>      support n-ary relationships within the RDF framework,
>      in whatever way)
>
>    * Do you see any serious problems in re-interpreting /reification/
>      for this purpose?
>
>
> Best regards,
> Michael
>
>

Received on Monday, 22 January 2007 20:47:08 UTC