Re: Some interesting things that show up when using a reasoner to classify schema.org from martin.hepp@ebusiness-unibw.org on 2015-02-04 (public-vocabs@w3.org from February 2015)

From: <martin.hepp@ebusiness-unibw.org>
Date: Wed, 4 Feb 2015 11:59:44 +0100
To: "Richard H. McCullough" <rhm@PioneerCA.com>
Cc: Simon Spero <sesuncedu@gmail.com>, W3C Web Schemas Task Force <public-vocabs@w3.org>, Dave Beckett <dave@dajobe.org>
Message-Id: <25B55DCF-CA4B-4F06-852C-F58A4075754B@ebusiness-unibw.org>
Hi Richard, all:

I think the most important question with regard to the meta-model of schema.org is whether we want to continue to reflect a single, consensual conceptual model that defines the set of elements, their granularity, and their semantics based on what search engines can realistically process, or whether we weaken that requirement and go towards a more generically useful set of conceptual elements.

In my opinion, Web developers are adopting schema.org because it is two things in one: a rather generic conceptual model that fits typical information found in Web sites, and a guideline to the type of data that Google, Bing, Yahoo, and Yandex care about (or will in the foreseeable future).

Before schema.org, there was a chaos of vocabularies with unclear status and relevance. It was hard to find the best elements to mark up your content in a way any relevant client would understand. While the Semantic Web movement assumes ontology alignment at the point of data consumption, schema.org proposes ontology alignment before data publication.

I know that it is slippery ground to discuss search engines' consumption of schema.org in here, but I think we need to be very clear about the fact that any extension of schema.org must be aligned with what search engines actually use for information extraction. We could spend a decade on discussing ontological details of our world views, but that would be resources wasted for the majority of stakeholders.

If I remember correctly, Guha says in the Ontolog talk [1] that he does not believe one could build meaningful conceptual models completely independent from a notion of the data processing that shall be supported by the data structures.

Of course, this does not rule out maintaining conceptual structures that can be used to improve the generation of comprehensive documentation for human users (e.g. taxonomic relations) or the automated validation of data (e.g. via disjointness axioms and domain/range).

The relationship types you propose might be useful for the latter.
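
To make the validation idea concrete, here is a minimal sketch in Python of how disjointness axioms and domain/range declarations could be used to check data. Everything in it is an assumption for illustration: the toy type hierarchy, the disjointness pair, the "employee" property and the function names are hypothetical and are not taken from the actual schema.org definitions. It also reads domain and range as constraints to be checked, not as RDFS-style inference rules.

```python
# Hypothetical toy vocabulary (illustrative only, not actual schema.org axioms).
SUBCLASS_OF = {
    "LocalBusiness": {"Organization", "Place"},
    "Organization": {"Thing"},
    "Place": {"Thing"},
    "Person": {"Thing"},
}
DISJOINT = [("Person", "Organization")]   # assumed disjointness axiom
DOMAIN = {"employee": "Organization"}     # assumed domain of the property
RANGE = {"employee": "Person"}            # assumed range of the property


def ancestors(type_name):
    """Return type_name together with all of its transitive supertypes."""
    seen = set()
    stack = [type_name]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(SUBCLASS_OF.get(current, ()))
    return seen


def all_types(declared_types):
    """Expand a set of declared types with every supertype."""
    result = set()
    for type_name in declared_types:
        result |= ancestors(type_name)
    return result


def validate(types_by_entity, statements):
    """Return a list of human-readable violations (empty if none found)."""
    errors = []
    # Disjointness: no entity may fall under two disjoint types.
    for entity, declared in sorted(types_by_entity.items()):
        closure = all_types(declared)
        for a, b in DISJOINT:
            if a in closure and b in closure:
                errors.append(f"{entity}: {a} and {b} are disjoint")
    # Domain/range: subject and object must carry the expected types.
    for subject, prop, obj in statements:
        subject_types = all_types(types_by_entity.get(subject, ()))
        object_types = all_types(types_by_entity.get(obj, ()))
        if prop in DOMAIN and DOMAIN[prop] not in subject_types:
            errors.append(f"{subject}: not in domain {DOMAIN[prop]} of {prop}")
        if prop in RANGE and RANGE[prop] not in object_types:
            errors.append(f"{obj}: not in range {RANGE[prop]} of {prop}")
    return errors
```

For example, validate({"acme": {"LocalBusiness"}, "alice": {"Person"}}, [("acme", "employee", "alice")]) returns an empty list, while swapping subject and object in that statement reports one domain and one range violation.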

Martin

[1] http://ontolog.cim3.net/cgi-bin/wiki.pl?ConferenceCall_2011_12_01
An audio recording is here: http://ontolog.cim3.net/file/resource/presentation/Schema.org--RVGuha_20111201/Schema.org_RVGuha_20111201b.mp3




On 03 Feb 2015, at 21:43, Richard H. McCullough <rhm@PioneerCA.com> wrote:

> Hi Martin,
>  
> I just skimmed your paper -- very interesting!
>  
> I think what is necessary is the ability to dynamically
> integrate and differentiate the concept hierarchy,
> i.e., to generalize and specialize the concepts.
>  
> In my work, I focus on the concept hierarchy.
> I have implemented a system with
>  
> two inverse relations
>       iss  --  is a specialization of
>       isg  --  is a generalization of
>  
> a hierarchy outline relation
>       ho  --  list of (level, name) pairs
>              --  U:name denotes universe (top) concept
>              --  u:name denotes unit (bottom) concepts
>  
> differentiation and integration relations which 
> dynamically change the concept hierarchy
>        isd  --  is the differentiation (specialization) of 
>        isi  --  is the integration (generalization) of
>  
> definitions
>        concept  is  genus  with  differentia
>  
> ambiguity measure
>        ambiguity  =  sum( log( # genus of concept) )
>  
> Details are available at http://ContextKnowledgeSystems.org
> 
> Dick McCullough 
> Context Knowledge Systems
> What is your view?
> 
> 
> > Subject: Re: Some interesting things that show up when using a reasoner to classify schema.org
> > From: martin.hepp@ebusiness-unibw.org
> > Date: Tue, 27 Jan 2015 22:04:14 +0100
> > CC: sesuncedu@gmail.com; public-vocabs@w3.org
> > To: rhm@pioneerca.com
> > 
> > Dear Dick:
> > 
> > On 26 Jan 2015, at 15:21, Richard H. McCullough <rhmccullough@att.net> wrote:
> > 
> > > Martin
> > > I enthusiastically agree that users should be able to use these vocabularies without a deep understanding.
> > > As a very interested and naïve user, the size of the vocabulary worries me. I find it difficult to orient myself
> > > and choose the right level and the right terms which are appropriate for my application.
> > > 
> > > Dick McCullough 
> > > Context Knowledge Systems 
> > > What is your view? 
> > 
> > I think we have only two means for keeping schema.org useable for a large audience:
> > 
> > 1. Modularization, i.e. at least make a clear separation between 
> > a) the meta-model and architecture of the vocabulary and
> > b) the domain-specific parts
> > 
> > but maybe even further,
> > 
> > and
> > 
> > 2. Strive for a self-contained, frame-based organisation, i.e. reducing the relevance of the type hierarchy, eventually up to a point where we (publicly) just have a flat bag of types and associated properties.
> > 
> > That does not mean we abandon the hierarchy internally; it will remain useful for managing the vocabulary.
> > 
> > Currently, users and people who want to propose extensions must understand the official and unofficial parts of the meta-model and memorize the type hierarchy.
> > 
> > See Figure 4 in this paper:
> > 
> > Possible Ontologies: How Reality Constrains the Development of Relevant Ontologies, in: IEEE Internet Computing, Vol. 11, No. 1, pp. 90-96, January-February 2007
> > 
> > A PDF is at http://www.heppnetz.de/files/IEEE-IC-PossibleOntologies-published.pdf
> > 
> > 
> > Best wishes 
> > Martin
> >
Received on Wednesday, 4 February 2015 11:00:12 UTC