Re: canonical ordering in the XML

On Thursday 13 September 2007, Bijan Parsia wrote:

> >> Have I missed the bit that talks about canonical orderings? Is
> >> there an
> >> expected (although not required) order in which we should
> >> serialize elements
> >> to xml?
>
> No, though one could emerge and tools could converge.
>
> > I don't think there is such a document.
>
> There isn't. There are actually two or three places it could go: e.g.,
> in the functional syntax document or in the XML syntax document. Or
> in a canonicalization document.

I would favor it in a canonicalisation document. However, no firm thoughts.

> > Would you be able to document your approach and post it to the
> > list?  (I could then add this to the OWL API, which already has an
> > axiom comparator, but it would be great if there was a more
> > standard approach).

Sure. The long and the short of it is that I just derived Ord for each
structure representing each bit of OWL syntax (similar to having a compiler
fill in a bog-standard instance of Comparable in Java). However, for sanity,
within my API I needed to inject a few extra levels of structure, e.g.
grouping all the object property axioms together, all the data property
axioms together, and so on. This implementation detail spills over into the
ordering: for example, all data property axioms are sorted earlier in the
file than object property axioms. Things with multiple fields are compared
field-by-field in the order those fields are encoded into the XML. Strings
are compared alphanumerically; numbers just use <.
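
To make that concrete, here is a rough sketch of the same idea in Python
rather than with a derived Ord. The class names, the group ranks and the
property names are all made up for illustration and are not taken from my
API. Note also that Python compares strings by code point, which only
approximates "alphanumerical".

```python
from dataclasses import dataclass

# Hypothetical group ranks: the extra level of structure that leaks into
# the ordering (data property axioms sort before object property axioms).
DATA, OBJECT = 0, 1


@dataclass(frozen=True, order=True)
class SubPropertyOf:
    # order=True generates comparisons that go field-by-field in
    # declaration order, much like a derived Ord instance.
    sub: str
    sup: str


@dataclass(frozen=True, order=True)
class Axiom:
    group: int           # compared first, so the grouping dominates
    body: SubPropertyOf  # then the axiom itself, field by field


axioms = [
    Axiom(OBJECT, SubPropertyOf("hasPart", "relatedTo")),
    Axiom(DATA, SubPropertyOf("hasAge", "hasValue")),
    Axiom(OBJECT, SubPropertyOf("hasChild", "relatedTo")),
]

# The data property axiom comes first, then the object property axioms
# ordered by their fields.
ordered = sorted(axioms)
```

The point of the sketch is the drawback: the position of `group` in the
class, and the ranks chosen for it, decide the order in the file.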

This is not really an ideal situation - it exposes too much implementation-
and language-specific stuff in the ordering. I'd propose that we pick
something very simple, e.g. sort by element name first and by content
second, and stick to that. Oh, and pick a definition of ordering for sets,
perhaps the same as the ordering over the set's view as a sorted list, with
earlier elements being more significant than later ones, and if one list runs
out of elements first, it is the lower ordered. However, if you think that
having groupings (such as all set axioms together, all object property axioms
together and so on) is worthwhile, I'm good with that as long as these
groupings of axioms are clearly stated and reflected in the UML diagram
proper.
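
As a minimal sketch of the simple rule proposed above (element name first,
content second, sets compared as sorted lists), something like the following
would do. The Element type and the function names are hypothetical, and
splitting "content" into text followed by children is my own assumption,
not something the spec says.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class Element:
    # Hypothetical stand-in for a serialized XML element.
    name: str
    text: str = ""
    children: FrozenSet["Element"] = frozenset()


def canonical_key(e):
    # A set is viewed as the sorted list of its members' keys. Python
    # compares tuples lexicographically, so earlier members are more
    # significant, and a tuple that runs out first sorts lower.
    child_keys = tuple(sorted(canonical_key(c) for c in e.children))
    # Element name first, content (text, then children) second.
    return (e.name, e.text, child_keys)


def canonical_order(elements):
    return sorted(elements, key=canonical_key)
```

With this key, a UnionOf over {A} sorts before a UnionOf over {A, B}, which
in turn sorts before a UnionOf over {B}, and every Class element sorts
before any UnionOf element because of its name.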

> Indeed. The approach currently taken in the structural spec is a bit
> "abstract" with regard to concrete documents in that it uses sets
> quite a bit (e.g., an ontology is a set of axioms). I think being a
> bit more specific on parsing, comparison of parses, and serialization
> is a good thing for the WG to do. Any first steps toward this are
> more than welcome.

The abstractness of the spec is a good thing - I was usually only bitten by
the spec where it failed to be abstract enough (e.g. specifying String where
some more abstract URI type would have been better), not the other way
around. However, once you get away from your internal data structures and on
to data exchange, things do need nailing down.

>
> Cheers,
> Bijan.


Matthew

Received on Thursday, 13 September 2007 12:06:39 UTC