- From: João Paulo Almeida <jpalmeida@ieee.org>
- Date: Thu, 22 Jan 2015 14:42:26 -0200
- To: "public-dwbp-wg@w3.org" <public-dwbp-wg@w3.org>
- Message-ID: <D0E6BCD2.99ED0%jpalmeida@ieee.org>
Dear All,
I understand Carlos's concerns that we do not have time for a full discussion of the concepts underlying the BP document, but I would not like section 7.4 to be sent "out there" in its present form.
I would like the first paragraph to be simplified; it would come back in a later version when we have settled the discussion in that other thread (how to get from data representation to vocabularies). It currently reads:
"Datasets often resort to a range of vocabularies in the data they contain: data is entered or captured in a controlled way, i.e., positions in a data graph (or column in a relationship table) are explicitly defined, the name of a person, the subject of a book, a relationship 'knows' between two persons. Additionally, for certain positions, the values used should come from a limited set of pre-existing resources: for example object types, roles of a person, countries in a geographic area, or possible subjects for books. Such vocabularies ensure a level of control, standardization and interoperability in the data. They can also provide a way to easily create richer data. Say, a dataset contains a reference to a concept described in several languages. This reference allows applications to localize their display of their search depending on the language of the user."
In my opinion there are some imprecisions (what are positions in a graph? What is richer data?), so I would prefer the following simplification:
"Data is often represented in a structured way making reference to a range of vocabularies: data is represented in a controlled way, e.g. by defining types of nodes and links in a data graph or types of values for columns in a table. Additionally, the values used may come from a limited set of pre-existing values or resources: for example object types, roles of a person, countries in a geographic area, or possible subjects for books. Such vocabularies ensure a level of control, standardization and interoperability in the data."
I would also not like the terms "light-weight" and "heavy-weight" ontologies to be used in the way they are currently used. The text currently says:
"The first means offered by W3C for creating ('light-weight') ontologies is the RDF Schema <http://www.w3.org/standards/techs/rdf#w3c_all> language. It is possible to define more complex ('heavy-weight') ontologies with advanced axioms using languages such as The Web Ontology Language OWL <http://www.w3.org/standards/techs/owl#w3c_all> ."
There is a lot of literature on ontologies that calls ontologies in OWL "light-weight ontologies", given the low expressiveness of description logics when compared to other approaches for ontology specification (e.g., first-order logics). Heavyweight ontologies would be formal ontologies written with expressive languages for off-line use (also called "reference ontologies"). See Guizzardi's thesis for a very good discussion on this: http://www.inf.ufes.br/~gguizzardi/OFSCM.pdf
My suggestion is to replace this text by:
"The first means offered by W3C for creating ontologies is the RDF Schema <http://www.w3.org/standards/techs/rdf#w3c_all> language. It is possible to define more expressive ontologies with additional axioms using languages such as those in The Web Ontology Language OWL <http://www.w3.org/standards/techs/owl#w3c_all> family."
BP12, possible approach to implementation: Add that diagrams may also serve the purpose of documenting vocabularies. An example is the use of a subset of UML to represent the W3C Org Ontology. (By the way, we had certain conventions established in GLD to define the UML diagram which could be part of a detailed BP for this.)
I would seriously hope that Best Practice 16 is removed altogether. It has a number of statements with which I strongly disagree, and is too biased against formalization.
It is biased because it says things such as "Unnecessarily complex vocabularies cost more efforts to produce and are less likely to be re-used in other datasets.", but there is no reference to the other side of the coin, which would be that "overly simplistic vocabularies may fail to establish shared meaning to enable semantic interoperability". It is because of the lack of expressiveness of schema languages like XML Schema that we now have RDF(S) and OWL(S)...
It also says that "Resources that are equiped with a strong, formal semantics are less clear (harder to understand) for any data re-user." I can't really understand this. It is too strong a generalization. Why would formal semantics be directly opposed to clarity? Formal semantics may help one to establish more precise specifications... which would support establishing the intended meaning of the vocabulary.
So the whole point is obviously identifying the right level of formalization for particular tasks (and possibly having a number of related formalisms when one size does not fit all)! And of course presenting the ontology in a way that users can understand it (for example, with diagrams that do not require the user to read through all axioms; again, see the W3C ORG Ontology for an example).
Best regards,
João Paulo
Received on Thursday, 22 January 2015 16:42:56 UTC