Re: comments on section 7.4 from Carlos Iglesias on 2015-01-23 (public-dwbp-wg@w3.org from January 2015)

From: Carlos Iglesias <contact@carlosiglesias.es>
Date: Fri, 23 Jan 2015 14:36:00 +0100
To: Antoine Isaac <aisaac@few.vu.nl>
Cc: Public DWBP WG <public-dwbp-wg@w3.org>
Message-ID: <CAAa1Xz=vog+A-deRSbYhYEooY8c00esHo=EnpShgs24+BemXvQ@mail.gmail.com>
>
> Thanks for the feedback! I have added a new issue about the technology
> bias at the start of the section, I hope it will be alright for now.
>

Thanks Antoine, it is good that the current discussion is also reflected in
the document.


> Two comments though:
>
> 1. I am not sure we can avoid any technology-specific mentions in the parts
> outside the implementation approach. This section is about difficult notions,
> and I believe examples throughout the section will help readers. Of course
> we should try to pick examples from representative technologies, not just
> LD.
>

We should try, and rethink any of those where we ultimately cannot achieve
it, because that won't send a good signal. Technologies change and evolve,
and new ones will also arise. If we do good work, the BPs will remain (see
the WCAG 2 example and how it solved exactly the problem they had with
WCAG 1).



> 2. I don't know a lot of recipes for implementing these BPs outside of the
> LD realm. And I actually doubt it will be easy to find. I mean, LD
> technology has been made to give technical options and best practices to
> solve these issues, hasn't it? If other technologies had been particularly
> good at it, there would have been less need for LD... But of course I am
> eager to learn about any solution elsewhere!


Well, as a first shot of examples:

BP11: Good documentation could just as well be a PDF, a UML model, an XML
schema, etc. That's how everybody was (and still is) documenting everything
before the LD era, no? (Note that, at least as currently written, the BP
does not require documentation to be machine-readable.)

BP12: There are plenty of examples of this outside LD (HL7-set, DICOM,
GELLO, CCOW, UBL, OCDS, HR-XML, SDMX, GML, KML, SLD, WCS, WFS, WMS,
XBRL...). I think we just need more diversity in the examples here.

BP13: Terminology aside, I think this is already quite neutral.

BP14: Only a greater variety of examples may be necessary here, e.g.
https://sites.google.com/site/erwinfolmeronsemanticstandards/list-of-semantic-standards
or http://www.ssi.dk/graphics/standardkatalog/2.0/index.html (and
terminology)

BP15: This is without doubt the most problematic one, as it is highly biased
as currently written (and is also being debated even from the LD
perspective). So I think we may need to rework/transform/drop it.


Best,
 CI.




> Best,
>
> Antoine
>
> On 1/23/15 10:27 AM, Carlos Iglesias wrote:
>
>> Good improvements Antoine, but please still note that a big part of the
>> section (starting with the section name itself) is clearly technologically
>> biased. As it currently stands, it looks more like a section from the BP
>> for linked data publishing than from the BP for publishing data on the web.
>>
>> I think that all the current content is mostly very good and valuable, but
>> we still need to (1) remove all technology-specific references from
>> everywhere that is not an implementation approach section, and (2) complete
>> it with other alternative implementation approaches.
>>
>> Best,
>>   CI.
>>
>> On 23 January 2015 at 01:48, Antoine Isaac <aisaac@few.vu.nl> wrote:
>>
>>     Hi João Paulo, Ig,
>>
>>     Thanks for the comment!
>>     I have committed a new version that tries to address some of them.
>> See reactions below.
>>
>>
>>
>>
>>              I would like the first paragraph to be simplified; it would
>> come back in a later version when we have settled the discussion in that
>> other thread (how to get from data representation to vocabularies).
>>
>>
>>
>>     The wording of that paragraph can be improved and your suggestions
>> have helped a lot.
>>     That said I'm uncomfortable with 'simplification' if it means
>> 'removing the examples'. As I've hinted in the other thread, it's very
>> likely that whatever term we choose for 'vocabulary', there will be people
>> for which it won't be intuitive, and examples will help.
>>
>>
>>
>>
>>              It currently reads:
>>              “Datasets often resort to a range of vocabularies in the
>> data they contain: data is entered or captured in a controlled way, i.e.,
>> positions in a data graph (or column in a relationship table) are
>> explicitly defined, the name of a person, the subject of a book, a
>> relationship “knows” between two persons. Additionally, for certain
>> positions, the values used should come from a limited set of pre-existing
>> resources: for example object types, roles of a person, countries in a
>> geographic area, or possible subjects for books. Such vocabularies ensure a
>> level of control, standardization and interoperability in the data. They
>> can also provide a way to easily create richer data. Say, a dataset
>> contains a reference to a concept described in several languages. This
>> reference allows applications to localize their display of their search
>> depending on the language of the user."
>>
>>              In my opinion there are some imprecisions (what are
>> positions in a graph? What is richer data?), so I would prefer the
>> following simplification:
>>              “Data is often represented in a structured way making
>> reference to a range of vocabularies: data is represented in a controlled
>> way, e.g. by defining types of nodes and links in a data graph or types of
>> values for columns in a table. Additionally, the values used may come from
>> a limited set of pre-existing values or resources: for example object
>> types, roles of a person, countries in a geographic area, or possible
>> subjects for books. Such vocabularies ensure a level of control,
>> standardization and interoperability in the data."
>>
>>
>>         I think the way you summed up is OK.
>>
>>
>>
>>     I have taken the new wording for positions in the graphs, columns,
>> etc. Much clearer!
>>
>>
>>         However, I do not see any problem to keep the second part related
>> to "Richer Data", because an example was given for explaining that.
>>
>>
>>
>>     I agree, I've kept it.
>>
>>
>>              I would also not like the terms “light-weight” and
>> “heavy-weight” ontologies to be used in the way they are being used. The
>> text currently says that:
>>
>>              “The first means offered by W3C for creating (“light-weight”)
>> ontologies is the RDF Schema
>> <http://www.w3.org/standards/techs/rdf#w3c_all> language.
>> It is possible to define more complex (“heavy-weight”) ontologies with
>> advanced axioms using languages such as The Web Ontology Language OWL
>> <http://www.w3.org/standards/techs/owl#w3c_all>.”
>>
>>              There is a lot of literature on ontologies that calls
>> ontologies in OWL "light-weight ontologies", given the low expressiveness
>> of description logics when compared to other approaches for ontology
>> specification (e.g., first-order logics). Heavyweight ontologies would be
>> formal ontologies written with expressive languages for off-line use (also
>> called “reference ontologies”). See Guizzardi’s thesis for a very good
>> discussion on this: http://www.inf.ufes.br/~gguizzardi/OFSCM.pdf
>>
>>              My suggestion is to replace this text by:
>>              “The first means offered by W3C for creating ontologies is
>> the RDF Schema <http://www.w3.org/standards/techs/rdf#w3c_all> language.
>> It is possible to define more expressive ontologies with additional
>> axioms using languages such as those in The Web Ontology Language OWL
>> <http://www.w3.org/standards/techs/owl#w3c_all> family.”
>>
>>         I perfectly understand what you are talking about. And, as you
>> know, there isn't a consensus in the ontology community about the right
>> definition for light-weight and heavy-weight ontologies. For example, you
>> can see Mizoguchi's tutorial [1] about the type of ontologies.
>>         [1] http://www.unipamplona.edu.co/unipamplona/portalIG/home_23/recursos/general/06032011/onto_parte1.pdf
>>         For this reason, and due to the deadline, I think we could skip
>> this conceptual discussion; the wording you proposed is fine with me.
>>
>>
>>
>>     I am glad to remove lightweight and heavyweight, really, even though
>> I too have seen them applied in the cases that were described in the text.
>>
>>
>>
>>
>>              BP12, possible approach to implementation:
>>              Add that diagrams may also serve the purpose of documenting
>> vocabularies. An example is the use of a subset of UML to represent the W3C
>> Org Ontology. (By the way, we had certain conventions established in GLD to
>> define the UML diagram which could be part of a detailed BP for this.)
>>
>>
>>         +1, but I think you are talking about BP11, right?
>>
>>
>>
>>     Diagrams are an excellent suggestion! More details on GLD conventions
>> (or just a pointer) could be helpful indeed, but I don't have time to dig
>> them up.
>>
>>
>>
>>              *I would seriously hope that Best Practice 16 is removed
>> altogether.* It has a number of statements with which I strongly disagree,
>> and is too biased against formalization.
>>
>>              It is biased because it says things such as "Unnecessarily
>> complex vocabularies cost more efforts to produce and are less likely to be
>> re-used in other datasets. “ but there is no reference to the other side of
>> the coin, which would be that “overly simplistic vocabularies may fail to
>> establish shared meaning to enable semantic interoperability”.  It is
>> because of the lack of expressiveness of schema languages like XML Schema
>> that we now have RDF(S) and OWL(S)…
>>
>>              It also says that "Resources that are equiped with a strong,
>> formal semantics are less clear (harder to understand) for any data
>> re-user.” I can’t really understand this. It is too strong a
>> generalization. Why would formal semantics be directly opposed to clarity?
>> Formal semantics may help one to establish more precise specifications…
>> which would support establishing the intended meaning of the vocabulary. So
>> the whole point is obviously identifying the right level of formalization
>> for particular tasks (and possibly having a number of related formalisms
>> when one size does not fit all)! And of course presenting the ontology in a
>> way that users can understand it (for example, with diagrams that do not
>> require the user to read through all axioms – again see W3C ORG Ontology
>> for an example).
>>
>>
>>         +1. I totally agree with João Paulo about this issue. The level
>> of formalization depends on several aspects, such as the intended audience,
>> domain, kind of use, and so on... We can see different scenarios with
>> different levels of formalization...
>>         As we are proposing Best Practices, I think it is very strong to
>> make such a recommendation. For this reason, I agree with João Paulo to
>> remove the BP16.
>>
>>
>>
>>     This BP is 'do not overformalize vocabularies', it is not 'never
>> formalize vocabularies'! I agree with you that formalization is useful in
>> general. It's just that it shouldn't be overused.
>>     The point is indeed to find the right level, and I think it matches
>> pretty well Ig's point on audience, domain, kind of use etc.
>>
>>     I have tried to do some re-wording along the lines you suggest, because
>> I believe our perspectives are not fully incompatible. Some of the
>> sentences were indeed confusing and Carlos's suggestions helped me a lot to
>> clarify. I've even changed the title.
>>
>>     If you still fully disagree with having the BP included then we
>> should remove it. If you agree with the general idea but still dislike the
>> wording, it would seem fair to keep it while raising a formal issue in the
>> document, calling for readers to support or reject the best practice, or
>> contribute enhancements.
>>
>>     Right now I have put an issue on whether the BP should be re-written
>> in a more technology neutral way. I really don't have the time to do more
>> today, sorry...
>>
>>     Best,
>>
>>     Antoine
>>
>>
>>
>>
>> --
>> ---
>>
>> Carlos Iglesias.
>> Internet & Web Consultant.
>> +34 687 917 759
>> contact@carlosiglesias.es <mailto:contact@carlosiglesias.es>
>> @carlosiglesias
>> http://es.linkedin.com/in/carlosiglesiasmoro/en
>>
>
>


-- 
---

Carlos Iglesias.
Internet & Web Consultant.
+34 687 917 759
contact@carlosiglesias.es
@carlosiglesias
http://es.linkedin.com/in/carlosiglesiasmoro/en
Received on Friday, 23 January 2015 13:36:30 UTC