Re: RDF vocabulary scope guidelines from Harry Halpin on 2009-02-07 (semantic-web@w3.org from February 2009)

From: Harry Halpin <hhalpin@ibiblio.org>
Date: Sat, 07 Feb 2009 03:22:51 +0000
Cc: "semantic-web@w3.org" <semantic-web@w3.org>
Message-ID: <498CFE8B.3060600@ibiblio.org>

Richard Newman wrote:
>
> Hi Jiri,
>
> As the author of that ontology, I am in the unique position of being
> able to explain my modelling choices!
>
> I took that approach for two reasons:
>
> 1: precision. By creating my own term, I can define precisely what is
> meant by (for example) "creation" --  is it the moment I choose to add
> a tag, or the time that tag reached some server? Another way of
> phrasing this is that coining a new property or class allows for
> "minimal enforced ambiguity".

In favor of the other opinion, I might add that creating a new term when
a well-known term already exists creates semantic islands rather than
linked data, IMHO. Also, by not re-using URIs, you lose the ability to
do URI-directed graph merges, which is the real advantage of RDF over
XML- or JSON-based formats. Otherwise, to be honest, I'd rather work in
JSON or my favorite programming language than RDF.
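
To make the merge point concrete, here is a minimal sketch in plain
Python (no RDF library), with graphs modeled as sets of
(subject, predicate, object) tuples. The dc:creator and foaf:name URIs
are the real Dublin Core and FOAF terms; the myvocab#madeBy property is
a made-up stand-in for a freshly coined term, and the sketch ignores
blank nodes, which a real RDF merge would have to keep apart.

```python
# Illustrative sketch: RDF graphs as plain sets of (s, p, o) tuples.
DC_CREATOR = "http://purl.org/dc/elements/1.1/creator"  # Dublin Core term
FOAF_NAME = "http://xmlns.com/foaf/0.1/name"            # FOAF term
MADE_BY = "http://example.org/myvocab#madeBy"           # hypothetical new term
POST = "http://example.com/blog/post/1"
JIM = "http://example.com/People/Jim"

graph_a = {(POST, DC_CREATOR, JIM)}   # published by one party
graph_b = {(JIM, FOAF_NAME, "Jim")}   # published by another party
graph_island = {(POST, MADE_BY, JIM)} # same fact, privately coined property


def merge(*graphs):
    """Merge graphs by set union; shared URIs become shared nodes."""
    merged = set()
    for graph in graphs:
        merged |= graph
    return merged


def creator_names(graph):
    """Return (document, creator name) pairs reachable via dc:creator."""
    creators = {(s, o) for s, p, o in graph if p == DC_CREATOR}
    names = {s: o for s, p, o in graph if p == FOAF_NAME}
    return {(doc, names[who]) for doc, who in creators if who in names}


merged = merge(graph_a, graph_b)
print(creator_names(merged))
print(creator_names(merge(graph_island, graph_b)))
```

The two graphs that share the URI for Jim and use dc:creator join with
nothing more than a set union, while the graph that coins its own
property stays invisible to the same lookup until someone maps the terms.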

Now, it's possible that sub-classing/sub-propertying can get you there
with a one-two step: infer first, then merge. In theory, that sounds
great. In practice, I've rarely seen it done.
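
For what it's worth, the one-two step could look roughly like the
following plain-Python sketch. It applies the RDFS sub-property rule
(rdfs7: if p is a sub-property of q and "s p o" holds, then "s q o"
holds) until nothing new appears. The tags#taggedOn URI and the tagging
node are hypothetical, and picking dc:date as the "generic date
property" is an assumption for illustration only.

```python
# Illustrative sketch: forward-chain rdfs:subPropertyOf over a set of triples.
RDFS_SUBPROPERTY_OF = "http://www.w3.org/2000/01/rdf-schema#subPropertyOf"
DC_DATE = "http://purl.org/dc/elements/1.1/date"   # Dublin Core term
TAGGED_ON = "http://example.org/tags#taggedOn"     # hypothetical URI
TAGGING = "http://example.com/tagging/1"           # hypothetical node

data = {(TAGGING, TAGGED_ON, "2005-03-29T15:24:10Z")}
schema = {(TAGGED_ON, RDFS_SUBPROPERTY_OF, DC_DATE)}


def infer_subproperties(graph):
    """Return graph plus every triple entailed by rule rdfs7."""
    inferred = set(graph)
    changed = True
    while changed:
        # Collect the (sub-property, super-property) pairs known so far.
        pairs = {(s, o) for s, p, o in inferred if p == RDFS_SUBPROPERTY_OF}
        new = {
            (s, sup, o)
            for s, p, o in inferred
            for sub, sup in pairs
            if p == sub
        }
        changed = not new <= inferred
        inferred |= new
    return inferred


# Step one: infer. Step two: query (or merge) on the well-known property.
closed = infer_subproperties(data | schema)
print([(s, o) for s, p, o in closed if p == DC_DATE])
```

Only after the inference step does the data answer a query phrased in
terms of the shared property, which is exactly the extra work that
rarely seems to get done in practice.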

The *real* problem is that, short of using Sindice and poking around, it's
virtually impossible to find URIs for "similar" things on the Semantic
Web, so people have no idea whether they are duplicating URIs or not. So
I can't blame anyone for creating new URIs. However, I would recommend
at least looking for well-deployed URIs before creating your own.
DBPedia, SKOS, Dublin Core, and FOAF come to mind as virtually the only
widely deployed RDF vocabularies, at least for general-purpose
subject matter, although poking around Sindice can't hurt.

> 2: a related point: by deliberately using a new term, it can be
> specifically and accurately related to other terms in other ontologies
> -- e.g., my taggedOn might be an equivalentProperty to John Smith's
> tagdate, and a subproperty of a generic date property. Under
> inference, all desired knowledge is apparent, without being corralled
> into a not-quite-compatible ontological framework.

See the point about inference above :) It sounds good in theory, but I've
rarely seen it done usefully, although it often features prominently in
academic papers by ontologists. Does anyone have any non-academic cases
where this has actually been done with RDF(S) or OWL?

> The expense of reasoning is a slight discouragement to this approach,
> but I think in general it stands up.
>
> HTH,
> -Richard
>
>
> -- 
> Sent from my iPhone.
>
> On Feb 6, 2009, at 15:24, Jiri Prochazka <ojirio@gmail.com> wrote:
>
>> Hi,
>> I am sure I am not the first one to notice, but I think there is a
>> problem with determining scope when designing an RDF vocabulary: reuse of
>> well-designed, loosely coupled, high-cohesion, more general vocabularies
>> versus domain-specific vocabularies.
>>
>> A typical example is the date of creation. I am writing this largely
>> because of this vocabulary: http://www.holygoat.co.uk/projects/tags/
>> It defines the class Tagging, which uses the properties taggedBy and
>> taggedOn. This is the domain-specific approach. The example is:
>>  <http://example.com/blog/post/1> :tag
>>    [ a :Tagging ;
>>      :associatedTag tag:blog, tag:chimpanzee ;
>>      :taggedBy <http://example.com/People/Jim> ;
>>      :taggedOn "2005-03-29T15:24:10Z"^^xsd:dateTime ] .
>>    tag:blog :tagName "blog" .
>>    tag:chimpanzee :tagName "chimpanzee" .
>>
>> But as another alternative I imagine:
>>  { <http://example.com/blog/post/1> :tag tag:blog, tag:chimpanzee . }
>>    time_vocab:createdOn "2005-03-29T15:24:10Z"^^xsd:dateTime ;
>>    author_vocab:author <http://example.com/People/Jim> .
>>  tag:blog :tagName "blog" .
>>  tag:chimpanzee :tagName "chimpanzee" .
>>
>> Where time_vocab and author_vocab talk about RDF resources (graphs in
>> fact) and could be defined in just one RDF resource description
>> vocabulary instead of two.
>> Or another alternative in which time_vocab:createdOn and
>> author_vocab:author have domain rdfs:Class:
>>  <http://example.com/blog/post/1> :tag tag:blog, tag:chimpanzee ;
>>    time_vocab:createdOn "2005-03-29T15:24:10Z"^^xsd:dateTime ;
>>    author_vocab:author <http://example.com/People/Jim> .
>>  tag:blog :tagName "blog" .
>>  tag:chimpanzee :tagName "chimpanzee" .
>>
>> Which of these approaches is recommended, and why?
>>
>> I tend to agree more with the more general vocabulary approach. When
>> designing RDF properties, you should ask yourself: "Shouldn't the
>> domain/range of it be some parent class? If yes, does the property fit
>> the scope of this vocabulary? Shouldn't it be in some more general
>> one?", focusing on reuse rather than relying on later linking of
>> vocabularies.
>>
>> If there were any past discussions on this topic, what were their
>> results?
>> Is there any vocabulary for rating resources in terms of authenticity
>> (trust) and agreement (truthfulness)? Vocabulary(ies) covering other
>> resource description aspects would be helpful too... (POWDER is so
>> cumbersome)
>>
>> Best regards,
>> Jiri
>>
>
Received on Saturday, 7 February 2009 03:23:23 UTC