Re: comments on section 7.4 from Ig Ibert Bittencourt on 2015-01-22 (public-dwbp-wg@w3.org from January 2015)

From: Ig Ibert Bittencourt <ig.ibert@gmail.com>
Date: Thu, 22 Jan 2015 15:37:13 -0200
To: João Paulo Almeida <jpalmeida@ieee.org>
Cc: "public-dwbp-wg@w3.org" <public-dwbp-wg@w3.org>
Message-ID: <CAKNDvRV4jSm5Wevgx3J6s_k+87jwXpU9e82ZaoOtAqUUCjVmuA@mail.gmail.com>
Hi Everyone,

Sorry for not joining the discussion earlier; I was on vacation.

Regarding João Paulo's concerns, please find my comments inline.

Best,
Ig

2015-01-22 14:42 GMT-02:00 João Paulo Almeida <jpalmeida@ieee.org>:

> Dear All,
>
> I understand Carlos' concerns that we do not have time for a full
> discussion of the concepts underlying the BP document, but I would not like
> section 7.4 to be sent “out there” in its present form.
>
> I would like the first paragraph to be simplified; it would come back in a
> later version when we have settled the discussion in that other thread (how
> to get from data representation to vocabularies).
>
> It currently reads:
> “Datasets often resort to a range of vocabularies in the data they
> contain: data is entered or captured in a controlled way, i.e., positions
> in a data graph (or column in a relationship table) are explicitly defined,
> the name of a person, the subject of a book, a relationship “knows” between
> two persons. Additionally, for certain positions, the values used should
> come from a limited set of pre-existing resources: for example object
> types, roles of a person, countries in a geographic area, or possible
> subjects for books. Such vocabularies ensure a level of control,
> standardization and interoperability in the data. They can also provide a
> way to easily create richer data. Say, a dataset contains a reference to a
> concept described in several languages. This reference allows applications
> to localize their display of their search depending on the language of the
> user."
>
> In my opinion there are some imprecisions (what are positions in a graph?
> What is richer data?), so I would prefer the following simplification:
> “Data is often represented in a structured way making reference to a range
> of vocabularies: data is represented in a controlled way, e.g. by defining
> types of nodes and links in a data graph or types of values for columns in
> a table. Additionally, the values used may come from a limited set of
> pre-existing values or resources: for example object types, roles of a
> person, countries in a geographic area, or possible subjects for books.
> Such vocabularies ensure a level of control, standardization and
> interoperability in the data."
>

I think your summary is OK. However, I do not see any problem with keeping
the second part, on "richer data", because an example is given to explain
it:

"[...] They can also provide a way to easily create richer data. Say, a
dataset contains a reference to a concept described in several languages.
This reference allows applications to localize their display of their
search depending on the language of the user."



>
> I would also not like the terms “light-weight” and “heavy-weight”
> ontologies to be used in the way they are being used. The text currently
> says that:
>
> "The first means offered by W3C for creating (“light-weight”) ontologies
> is the RDF Schema <http://www.w3.org/standards/techs/rdf#w3c_all> language.
> It is possible to define more complex (“heavy-weight”) ontologies with
> advanced axioms using languages such as The Web Ontology Language OWL
> <http://www.w3.org/standards/techs/owl#w3c_all>.”
>
> There is a lot of literature on ontologies that calls ontologies in OWL
> "light-weight ontologies", given the low expressiveness of description
> logics when compared to other approaches for ontology specification (e.g.,
> first-order logics). Heavyweight ontologies would be formal ontologies
> written with expressive languages for off-line use (also called “reference
> ontologies”). See Guizzardi’s thesis for a very good discussion on this:
> http://www.inf.ufes.br/~gguizzardi/OFSCM.pdf
>
> My suggestion is to replace this text by:
> "The first means offered by W3C for creating ontologies is the RDF Schema
> <http://www.w3.org/standards/techs/rdf#w3c_all> language. It is possible
> to define more expressive ontologies with additional axioms using languages
> such as those in The Web Ontology Language OWL
> <http://www.w3.org/standards/techs/owl#w3c_all> family.”
>


I understand your point perfectly. As you know, there is no consensus in
the ontology community on the right definition of light-weight and
heavy-weight ontologies. See, for example, Mizoguchi's tutorial [1] on the
types of ontologies.

[1]
http://www.unipamplona.edu.co/unipamplona/portalIG/home_23/recursos/general/06032011/onto_parte1.pdf

For this reason, and given the deadline, I think we can skip this
conceptual discussion. Your proposed wording works well for me, since it
removes the terms light-weight and heavy-weight from the definition.


>
> BP12, possible approach to implementation:
> Add that diagrams may also serve the purpose of documenting vocabularies.
> An example is the use of a subset of UML to represent the W3C Org Ontology.
> (By the way, we had certain conventions established in GLD to define the
> UML diagram which could be part of a detailed BP for this.)
>

+1, but I think you are talking about BP11, right?


>
> *I would seriously hope that Best Practice 16 is removed altogether.* It
> has a number of statements with which I strongly disagree, and is too
> biased against formalization.
>
> It is biased because it says things such as "Unnecessarily complex
> vocabularies cost more efforts to produce and are less likely to be re-used
> in other datasets. “ but there is no reference to the other side of the
> coin, which would be that “overly simplistic vocabularies may fail to
> establish shared meaning to enable semantic interoperability”.  It is
> because of the lack of expressiveness of schema languages like XML Schema
> that we now have RDF(S) and OWL(S)…
>
> It also says that "Resources that are equiped with a strong, formal
> semantics are less clear (harder to understand) for any data re-user.” I
> can’t really understand this. It is too strong a generalization. Why would
> formal semantics be directly opposed to clarity? Formal semantics may help
> one to establish more precise specifications… which would support
> establishing the intended meaning of the vocabulary. So the whole point is
> obviously identifying the right level of formalization for particular tasks
> (and possibly having a number of related formalisms when one size does not
> fit all)! And of course presenting the ontology in a way that users can
> understand it (for example, with diagrams that do not require the user to
> read through all axioms – again see W3C ORG Ontology for an example).
>

+1. I totally agree with João Paulo on this issue. The right level of
formalization depends on several aspects, such as the intended audience,
the domain, and the kind of use, so different scenarios call for different
levels of formalization.

Since we are proposing Best Practices, I think such a recommendation is too
strong. For this reason, I agree with João Paulo that BP16 should be
removed.


>
> Best regards,
> João Paulo
>


-- 

Ig Ibert Bittencourt
Adjunct Professor III - Instituto de Computação/Universidade Federal de
Alagoas (UFAL)
Vice-Coordinator of the Comissão Especial de Informática na Educação
Leader of the Centro de Excelência em Tecnologias Sociais
Co-founder of the startup MeuTutor Soluções Educacionais LTDA.
Received on Thursday, 22 January 2015 17:38:02 UTC