- From: C. M. Sperberg-McQueen <cmsmcq@acm.org>
- Date: Mon, 23 Jul 2001 18:45:38 -0600
- To: "Martin Gudgin" <marting@develop.com>
- Cc: "choi jongwon" <jwchoi@digiweb21.com>, "Henry S. Thompson" <ht@cogsci.ed.ac.uk>, <www-xml-schema-comments@w3.org>
At 2001-07-18 18:38, Martin Gudgin wrote: >> Because <choice/> is unsatisfiable. It's like AND and OR considered >> as n-ary functions: >> >> AND() is true ; including <sequence/> as a required part of a >> content model does not change what it validates >> >> OR() is false ; including <choice/> as a required part of a content >> model ensures that it never validates anything >> >> So for <choice/> to be pointless, it must be optional. > >I don't get this at all... I don't understand what you mean by n-ary >functions in this context. It seems very strange that <choice/> is >unsatisfiable. Can you elaborate please? Since Henry is away, I'll see if I can do the job. The operators AND and OR can be treated as binary operators, which is the most common case, or as n-ary operators, which is a standard feature of Lisp, though I have not encountered it elsewhere. Treated as an n-ary operator, AND takes an arbitrary number of arguments, and returns TRUE if and only if all of the arguments are TRUE -- or rather, it returns FALSE if and only if at least one of the arguments is false, and otherwise it returns TRUE. So, if a, b, and c are all TRUE and x, y, and z are all FALSE, we get (and a b c) ==> TRUE (and a b c x y z) ==> FALSE and so on. Notice that these results are (owing to the associative properties of AND) the same as for AND as a binary operator: (a AND (b AND (c AND (x AND (y AND z))))) Similarly, OR returns TRUE if and only if at least one of the arguments is TRUE. (or a b c) ==> TRUE (or a b c x y z) ==> FALSE In some (all?) Lisp dialects, these are implemented as 'short-circuit' expressions: AND evaluates its arguments either until they are all used up, or until it hits the first false one. When it hits a false expression, it returns immediately. OR does the same thing, except that it returns early if it hits a true argument. Consider this description: "AND evaluates arguments one at a time. If it hits a false argument, it returns FALSE. If it hits the end of the arguments (N.B. this does not happen if we have returned FALSE after hitting a false argument), it returns TRUE." And similarly for OR. Now: what happens if the Lisp AND operator is called without arguments? It hits the end of its arguments without having found any false ones, so it returns TRUE. Similarly, if OR is called without arguments, it hits the end of its arguments without having found any true ones, so it returns FALSE. So Lisp programmers, at least, are used to (and) ==> TRUE (or rather, ==> t) (or) ==> FALSE (==> nil) This is not just a quirk of lazy evaluation. In formal systems, it is handy to have special values which can be used to make any operator into the identity operator, so that for any value v, the expression v op magic-value = v For addition, the magic value is 0, for multiplication it's 1. v + 0 = v v * 1 = v For AND, it's TRUE, and for OR, the magic value is FALSE. v AND TRUE = v v OR FALSE = v When operators / functions take arbitrary numbers of arguments, these values may be, and often are, associated with the operator or function in question: when called without any arguments, the + operator in Lisp returns 0, while the * operator returns 1, and 'and' and 'or' return, as mentioned above, t and nil respectively. Formal definitions of regular languages will make clear that for concatenation, the magic value is the empty string, while for disjunction, the magic value is the empty set. N.B. these are not the same thing: if we write '' for the empty string (often formal treatments use epsilon or lambda), and {} for the empty set, then '' denotes the language with one member (namely, the empty string), while {} denotes the language with no members. In translating from vanilla regular expressions to content models, it is useful to associate an empty sequence with the magic value for concatenation, and to say that it is satisfied by any zero-length sequence of children, and to associate the empty choice with the magic value for alternation / disjunction, and to say that it is not satisfied by any sequence of children at all. So <sequence/>, regarded as the definition of a language, denotes the language consisting of the empty string of (child) elements, while <choice/> denotes the language which has no members (the empty language). Writing <xsd:element name="dink"> <xsd:complexType> <xsd:sequence/> </xsd:complexType> </xsd:element> makes sure that 'dink' elements have no children (warning: culture-specific humor may be present here; as a demographic label for childless couples both of whom have a job, 'dink' is short for "double income, no kids", so it's a constitutive feature of dinks that they have no children). By contrast, writing <xsd:element name="elephant-child"> <xsd:complexType> <xsd:sequence/> </xsd:complexType> </xsd:element> ensures that it is not possible for the content model of the elephant-child element ever to be satisfied. (The element is named for the Elephant's Child in Rudyard Kipling's Just So Stories, which had a curiosity which could not be satisfied.) Such a content model is likely to have limited practical use (but see http://lists.w3.org/Archives/Public/w3c-sgml-wg/1996Sep/0049.html for one application of the idea), but in formal contexts it is very handy to have ways of writing down expressions with these characteristics, which is why the Formal Description document long ago found it desirable to introduce special symbols for 'empty choice' and 'empty sequence'. I hope this helps. -Michael Sperberg-McQueen
Received on Monday, 23 July 2001 20:38:19 UTC