January 2015

Re: comments on section 7.4

From: Carlos Iglesias <contact@carlosiglesias.es>
Date: Fri, 23 Jan 2015 10:27:45 +0100
Message-ID: <CAAa1Xz=0WZdA80fMy2cMVvtwntsD58Fs8xt20uq-wjLAFs24bA@mail.gmail.com>
To: Antoine Isaac <aisaac@few.vu.nl>
Cc: Public DWBP WG <public-dwbp-wg@w3.org>
Good improvements, Antoine, but please still note that a big part of the
section (starting with the section name itself) is clearly technologically
biased. It currently looks more like a section from the BP for linked data
publishing than from the BP for publishing data on the web.

I think that all the current content is mostly very good and valuable, but we
still need to (1) remove all technology-specific references from
everywhere that is not an implementation approach section, and (2) complete
it with other alternative implementation approaches.


On 23 January 2015 at 01:48, Antoine Isaac <aisaac@few.vu.nl> wrote:

> Hi João Paulo, Ig,
> Thanks for the comments!
> I have committed a new version that tries to address some of them. See
> reactions below.
>      I would like the first paragraph to be simplified; it would come back
>> in a later version when we have settled the discussion in that other thread
>> (how to get from data representation to vocabularies).
> The wording of that paragraph can be improved and your suggestions have
> helped a lot.
> That said I'm uncomfortable with 'simplification' if it means 'removing
> the examples'. As I've hinted in the other thread, it's very likely that
> whatever term we choose for 'vocabulary', there will be people for whom it
> won't be intuitive, and examples will help.
>>     It currently reads:
>>     “Datasets often resort to a range of vocabularies in the data they
>> contain: data is entered or captured in a controlled way, i.e., positions
>> in a data graph (or column in a relationship table) are explicitly defined,
>> the name of a person, the subject of a book, a relationship “knows” between
>> two persons. Additionally, for certain positions, the values used should
>> come from a limited set of pre-existing resources: for example object
>> types, roles of a person, countries in a geographic area, or possible
>> subjects for books. Such vocabularies ensure a level of control,
>> standardization and interoperability in the data. They can also provide a
>> way to easily create richer data. Say, a dataset contains a reference to a
>> concept described in several languages. This reference allows applications
>> to localize their display of their search depending on the language of the
>> user."
>>     In my opinion there are some imprecisions (what are positions in a
>> graph? What is richer data?), so I would prefer the following
>> simplification:
>>     “Data is often represented in a structured way making reference to a
>> range of vocabularies: data is represented in a controlled way, e.g. by
>> defining types of nodes and links in a data graph or types of values for
>> columns in a table. Additionally, the values used may come from a limited
>> set of pre-existing values or resources: for example object types, roles of
>> a person, countries in a geographic area, or possible subjects for books.
>> Such vocabularies ensure a level of control, standardization and
>> interoperability in the data."
>> I think the way you summed it up is OK.
> I have taken the new wording for positions in the graphs, columns, etc.
> Much clearer!
>  However, I do not see any problem with keeping the second part related to
>> "Richer Data", because an example was given for explaining that.
> I agree, I've kept it.
>>     I would also not like the terms “light-weight” and “heavy-weight”
>> ontologies to be used in the way they are being used. The text currently
>> says that:
>>     "The first means offered by W3C for creating (“light-weight”)
>> ontologies is the RDF Schema <http://www.w3.org/standards/techs/rdf#w3c_all>
>> language. It is possible to define more complex (“heavy-weight”)
>> ontologies with advanced axioms using languages such as The Web Ontology
>> Language OWL <http://www.w3.org/standards/techs/owl#w3c_all>.”
>>     There is a lot of literature on ontologies that calls ontologies in
>> OWL "light-weight ontologies", given the low expressiveness of description
>> logics when compared to other approaches for ontology specification (e.g.,
>> first-order logics). Heavyweight ontologies would be formal ontologies
>> written with expressive languages for off-line use (also called “reference
>> ontologies”). See Guizzardi’s thesis for a very good discussion on this:
>> http://www.inf.ufes.br/~gguizzardi/OFSCM.pdf
>>     My suggestion is to replace this text by:
>>     "The first means offered by W3C for creating ontologies is the RDF
>> Schema <http://www.w3.org/standards/techs/rdf#w3c_all> language. It is
>> possible to define more expressive ontologies with additional axioms using
>> languages such as those in The Web Ontology Language OWL
>> <http://www.w3.org/standards/techs/owl#w3c_all> family.”
>> I perfectly understand what you are talking about. And, as you know,
>> there isn't a consensus in the ontology community about the right
>> definition for light-weight and heavy-weight ontologies. For example, you
>> can see Mizoguchi's tutorial [1] about the type of ontologies.
>> [1] http://www.unipamplona.edu.co/unipamplona/portalIG/home_23/recursos/general/06032011/onto_parte1.pdf
>> For this reason, and due to the deadline, I think we could skip this
>> conceptual discussion; the wording you proposed is fine with me.
> I am glad to remove lightweight and heavyweight, really, even though I too
> have seen them applied in the cases that were described in the text.
>>     BP12, possible approach to implementation:
>>     Add that diagrams may also serve the purpose of documenting
>> vocabularies. An example is the use of a subset of UML to represent the W3C
>> Org Ontology. (By the way, we had certain conventions established in GLD to
>> define the UML diagram which could be part of a detailed BP for this.)
>> +1, but I think you are talking about BP11, right?
> Diagrams are an excellent suggestion! More details on GLD conventions (or
> just a pointer) could be helpful indeed, but I don't have time to dig them
> up.
>>     *I would seriously hope that Best Practice 16 is removed altogether.*
>> It has a number of statements with which I strongly disagree, and is too
>> biased against formalization.
>>     It is biased because it says things such as "Unnecessarily complex
>> vocabularies cost more efforts to produce and are less likely to be re-used
>> in other datasets." but there is no reference to the other side of the
>> coin, which would be that “overly simplistic vocabularies may fail to
>> establish shared meaning to enable semantic interoperability”.  It is
>> because of the lack of expressiveness of schema languages like XML Schema
>> that we now have RDF(S) and OWL(S)…
>>     It also says that "Resources that are equiped with a strong, formal
>> semantics are less clear (harder to understand) for any data re-user.” I
>> can’t really understand this. It is too strong a generalization. Why would
>> formal semantics be directly opposed to clarity? Formal semantics may help
>> one to establish more precise specifications… which would support
>> establishing the intended meaning of the vocabulary. So the whole point is
>> obviously identifying the right level of formalization for particular tasks
>> (and possibly having a number of related formalisms when one size does not
>> fit all)! And of course presenting the ontology in a way that users can
>> understand it (for example, with diagrams that do not require the user to
>> read through all axioms – again see W3C ORG Ontology for an example).
>> +1. I totally agree with João Paulo about this issue. The level of
>> formalization depends on several aspects, such as the intended audience,
>> domain, kind of use, and so on... We can see different scenarios with
>> different levels of formalization...
>> As we are proposing Best Practices, I think it is too strong to make
>> such a recommendation. For this reason, I agree with João Paulo that
>> BP16 should be removed.
> This BP is 'do not overformalize vocabularies', it is not 'never
> formalize vocabularies'! I agree with you that formalization is useful in
> general. It's just that it shouldn't be overused.
> The point is indeed to find the right level, and I think it matches pretty
> well Ig's point on audience, domain, kind of use etc.
> I have tried to do some re-wording along the lines you suggest, because I
> believe our perspectives are not fully incompatible. Some of the sentences
> were indeed confusing and Carlos's suggestions helped me a lot to clarify.
> I've even changed the title.
> If you still fully disagree with having the BP included then we should
> remove it. If you agree with the general idea but still dislike the
> expression, it would seem fair to keep it while raising a formal issue in the
> document, calling for readers to support or reject the best practice, or
> contribute enhancements.
> Right now I have put an issue on whether the BP should be re-written in a
> more technology neutral way. I really don't have the time to do more today,
> sorry...
> Best,
> Antoine





