- From: Simon Spero <sesuncedu@gmail.com>
- Date: Mon, 30 Jan 2017 09:08:10 -0500
- To: Martin Hepp <mfhepp@gmail.com>
- Cc: "Thuermer G." <gefion.thuermer@soton.ac.uk>, semantic-web@w3.org, "Munson J.E." <J.Munson@soton.ac.uk>
- Message-ID: <CADE8KM6W-dCTSZd4FvLsCFcbXMBiBYxYOV0kymwB7+VmjpWDwQ@mail.gmail.com>
Protégé uses the OWLAPI, and for precisely the problems you mention, the major document output formats now generate generally stable output ordering. This makes it much easier to use common version control systems like git to handle revision control.

There is discussion of this in issue #273 on the GitHub repository:
https://github.com/owlcs/owlapi/issues/273

There is also a brief summary in one section of my OWLED paper from 2015. Using the Gene Ontology as a data source, and taking a typical day as an example: for the versions committed on 14 Jun 2014, there were 28 insertions and 5 deletions made to the OBO file. This resulted in 164,305 insertions and 164,256 deletions in the unordered RDF/XML output. With ordered RDF/XML, there were 54 insertions and 5 deletions.

I should note here that some social VCS platforms are overly cautious about displaying diffs of very large files, even when the diffs are tiny. At the time of the study, GitHub refused to try; Bitbucket (git mode) would issue an are-you-sure warning, then surprise itself :) I'm not sure about GitLab.

I should also note that frame-like formats like Manchester Syntax are much easier to merge if you have several people working on the same source. Modularity is your friend :)

Ontology documentation should be about more than just the individual vocabulary terms. You should be aware of what metaphysical choices you are making, and keep track of them as they occur (metaphysics should be avoided as much as possible, but it's important to be able to recognize it so you know what to run away from). This should be part of the meta documentation for the ontology team, and for future maintainers.

You should have guidelines for descriptions, scope notes and labels on your vocabulary terms. The Cyc guidelines for comments may be a helpful starting point.

Documentation should not be a substitute for axioms, but it may be helpful for the reader to restate what the axioms say.
Just like with code and comments, it is critical that any such restatements stay in sync.

Documentation should be designed to meet the needs of the people who will be using it. Documentation aimed at the end user should be written for that audience, and just like any other system component, should be properly tested.

If some documentation is written for users who will be applying a vocabulary, then you should run tests to see if the vocabulary is being correctly applied. If documentation is written for those who will be consuming uses of the vocabulary, you should test to see if the uses of the vocabulary are being properly understood.

If there are problems, you may need to fix the ontology, not the documentation. Don't try to change the way the SME thinks about their area of expertise.

Simon

On Jan 30, 2017 3:40 AM, "Martin Hepp" <mfhepp@gmail.com> wrote:

I would recommend using

1. a syntax for the ontology like N3/Turtle, where changes in the conceptual model are more or less directly equivalent to changes in the serialization. A bad example would be RDF/XML auto-generated from a tool like Protégé. At least in earlier times, the serialization in RDF/XML could vary greatly despite only minor changes in the conceptual model, in particular if you used different versions of the tool to generate the code. The underlying reason is that RDF has no defined ordering of statements, so there are many different ways to represent the same RDF graph.

2. a standard version-control system like Git or Mercurial for hosting the code. This allows very good documentation of the entire evolution of your model, and this is how we do it at schema.org.

There are a few problems with this approach, though:

1. You will have to encode the ontology using a source-code editor, with no neat GUI etc. While this is straightforward for basic RDFS/OWL ontologies, it is a bit complicated for advanced OWL language elements.

2. If you reorganize the code or make minor syntactical changes (like replacing spaces by tabs or vice versa), you will still see changes in a diff that do not reflect changes in the conceptual model, so you need to be very disciplined when coding.

But other than that, I think this is the best way to solve this.

For publishing versions of the ontology, you could use the same mechanism as the W3C for versioning technical documents, i.e.

- one URI for the current version, like http://foo.org/onto or http://foo.org/onto# and
- one URI for each released version, including the date of the release, like http://foo.org/onto/20170130 or http://foo.org/onto/20170130#

There are of course many proposals to handle ontology versioning with additional meta-data and tooling; for an overview, see
https://scholar.google.com/scholar?hl=en&q=ontology+versioning&btnG=&as_sdt=1%2C5&as_sdtp=

From my top-level understanding, however, the current state of the art is limited to maintaining meta-data about the state and evolution of the ontology, while automatic translation between different versions of the same ontology is still very hard. For the pure documentation of the changes, a version-control system does largely the same job.

For an introduction to the problems of ontology versioning and evolution, read e.g.
http://link.springer.com/article/10.1007%2Fs10115-003-0137-2?LI=true

Also keep in mind that ontologies are by their very nature approximate specifications of a domain model, so there can be changes in the intended meaning of ontology elements that are not reflected in the axiomatic specification of the ontology.

Best wishes
Martin

-----------------------------------
martin hepp
http://www.heppnetz.de
mhepp@computer.org
@mfhepp

> On 30 Jan 2017, at 00:19, Munson J.E. <J.Munson@soton.ac.uk> wrote:
>
> Dear team
>
> My name is Jo Munson and I am currently a PhD candidate at the University of Southampton.
> We are currently working with an external organisation looking to put a 'real life' ontology together, and I am writing to ask whether there are any tools / best practices for versioning and documenting from your perspective (for commercial/public use, not just in a research context).
>
> Many thanks for your time
>
> Jo
>
> Web Science PhD Candidate
> University of Southampton
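
Both replies in this thread turn on the same point: RDF has no defined ordering of statements, so an unordered serializer can produce a large diff for a small change. Here is a minimal sketch of that effect, assuming the ontology is exported as N-Triples (one statement per line) and contains no blank nodes. The function names and the example.org IRIs are hypothetical, and this is not how the OWLAPI implements its ordered output; it only illustrates why a stable ordering keeps diffs small.

```python
from difflib import SequenceMatcher

RDFS_LABEL = "<http://www.w3.org/2000/01/rdf-schema#label>"


def triple(name: str) -> str:
    """Build one N-Triples statement for a hypothetical example.org term."""
    return f'<http://example.org/onto#{name}> {RDFS_LABEL} "{name}" .'


def canonical_lines(ntriples: str) -> list[str]:
    """Return the statement lines sorted, with blanks and duplicates removed.

    Assumption: no blank nodes. Blank node labels are not stable across
    exports, so plain sorting would not be enough for them.
    """
    return sorted({line.strip() for line in ntriples.splitlines() if line.strip()})


def changed_lines(old: list[str], new: list[str]) -> tuple[int, int]:
    """Count (insertions, deletions) in a line-based diff of old vs new."""
    insertions = deletions = 0
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("replace", "delete"):
            deletions += i2 - i1
        if tag in ("replace", "insert"):
            insertions += j2 - j1
    return insertions, deletions


# Version 1 of the graph, and version 2 with one statement added (D).
# Version 2 is written out in a different order, as an unordered
# serializer might do.
V1 = "\n".join([triple("A"), triple("B"), triple("C")])
V2 = "\n".join([triple("C"), triple("A"), triple("D"), triple("B")])

# Diff of the files as written: reordering shows up as spurious changes.
print("as written:", changed_lines(V1.splitlines(), V2.splitlines()))

# Diff after sorting: only the real change remains.
print("sorted:    ", changed_lines(canonical_lines(V1), canonical_lines(V2)))
```

Sorting before committing is the crudest way to get this property. A serializer that emits a stable order itself, as described for the OWLAPI above, avoids the extra step and also works for formats that are not line-oriented.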
Received on Monday, 30 January 2017 14:09:14 UTC