- From: C. M. Sperberg-McQueen <cmsmcq@blackmesatech.com>
- Date: Tue, 8 Mar 2011 08:42:12 -0700
- To: www-xml-schema-comments@w3.org
- Cc: "C. M. Sperberg-McQueen" <cmsmcq@blackmesatech.com>
On Mar 7, 2011, at 4:55 PM, bugzilla@jessica.w3.org wrote:

> http://www.w3.org/Bugs/Public/show_bug.cgi?id=11076
>
> --- Comment #4 from Michael Kay <mike@saxonica.com> 2011-03-07 23:55:25 UTC ---
> In the proposal, the phrase "and T1.{type definition} and T2.{type definition}
> are valid for the same set of input element information items." seems curious -
> I don't think a type definition can be valid for some information items and
> invalid for others.
>
> But rather than fix this in the obvious way, I don't think it's helpful to
> define equivalence using a condition that isn't actually computable by a
> universal Turing machine.

I may be forgetting something simple and obvious, but I don't think I'm familiar with a result that says the equivalence of two XSD types is non-computable (as opposed to expensive); can you expound?

The difficulty of comparing two types or two type alternatives for equivalence does seem to me to be a good reason for not requiring processors to calculate it, even if it is computable. But is it a sufficient reason for not defining equivalence in the obvious way?

> Better to stick with the mechanistic definition, and
> allow processors to relax it if they are able to.

Under what circumstances should processors be allowed to recognize two types or two type alternatives as equivalent? I think the simplest answer is: when in fact they are equivalent. If we don't allow ourselves to define equivalence to denote the concept we need here, because in the general case it's expensive, how is the text to make clear the motivation for allowing processors to apply more intelligent analysis than the simplistic comparison spelled out as the minimum?

In some specs, one could simply say "types T1 and T2 are the same" and have it be clear and meaningful. We cannot do that, because a sufficiently vocal minority of the WG opposed any effort to define component identity clearly.
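As an aside, the contrast at issue can be made concrete with a small sketch (in Python; the class and its fields are invented for illustration and are not the spec's component model). A mechanistic check compares the components of two type alternatives textually: it is sound (identical components really are equivalent) but strictly narrower than semantic equivalence, since two {test} expressions written differently may still always evaluate to the same result.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TypeAlternative:
    """Illustrative stand-in for an XSD 1.1 Type Alternative component."""
    test: str        # the {test} XPath expression, as written
    type_name: str   # name of the {type definition} it selects

def mechanically_equivalent(t1: TypeAlternative, t2: TypeAlternative) -> bool:
    """Sound but incomplete: textually identical components imply
    equivalence, but semantically equal tests written differently
    are not recognized."""
    return t1.test == t2.test and t1.type_name == t2.type_name

a = TypeAlternative(test="@kind = 'a'", type_name="typeA")
b = TypeAlternative(test="@kind = 'a'", type_name="typeA")
c = TypeAlternative(test="'a' = @kind", type_name="typeA")  # same meaning, different text

print(mechanically_equivalent(a, b))  # True
print(mechanically_equivalent(a, c))  # False, though the tests always agree
```

The gap between the second result and the truth is exactly the room a "processors MAY relax this" clause leaves open; deciding equivalence of arbitrary XPath tests is what makes the full check expensive.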
As any reader of our mailing lists can tell, our failure even to attempt to achieve clarity on this fundamental concept of our spec continues to offend and irritate me (and I apologize for my repetitive complaints about it). More important, our failure continues to exact a price, because it makes it harder to resolve many of the issues we face. If we aren't to throw up our hands, say "This is a hopeless mess," and give up on 1.1, we need to find some way to address issues like this one without appealing to identity.

In cases where one cannot appeal, or does not wish to appeal, to identity, it is usual to appeal instead to isomorphism or equivalence of one form or another. I think we would do well to try to work out suitable notions of equivalence wherever they can help us solve issues, rather than leaving the issues unresolved or resolving them unsatisfactorily by insisting that we deal only in identity. Equivalence seemed to me particularly helpful here, in delineating just how to tell whether a particular rule in an implementation is safe or not. If we can't appeal to equivalence either, because our language is too complex for equivalence to be calculated simply, then we have really painted ourselves into an untenable corner.

> I would suggest wording along
> the lines:
>
> T1 and T2 are equivalent if all the following conditions are true:
>
> ...

If this is intended as a definition of equivalence, I think it's a waste of a perfectly good term: the property defined does guarantee equivalence of type alternatives, but it is much, much narrower, and it borders on misleading to call it by the name 'equivalence'.

> A processor MAY also treat T1 and T2 as equivalent in cases where not all the
> above conditions are true, provided it can determine that that T1.{test} and
> T2.{test} will always evaluate to the same result for any possible element
> information item.
This seems to be substantively the same as the current appeal to equivalence of the {test} property, with the main differences being that the concept of equivalence is inlined instead of being called out as important, and that instead of allowing processors to vary in how aggressively they can recognize two types as equivalent, we allow them to vary in what they regard as "the same type" and what they regard as "different types" -- which is *definitely* not computable in the general case, because in the general case our spec does not provide an answer.

Is there a reason not to simplify this passage by saying that a processor may treat T1 and T2 as equivalent if a random number generator picks an even number between one and ten three times in a row? Which rule is simpler to understand? Easier to calculate? More likely to provide interoperability? Easier to justify?

-- 
****************************************************************
* C. M. Sperberg-McQueen, Black Mesa Technologies LLC
* http://www.blackmesatech.com
* http://cmsmcq.com/mib
* http://balisage.net
****************************************************************
Received on Tuesday, 8 March 2011 15:42:43 UTC