- From: Liam R E Quin <liam@w3.org>
- Date: Tue, 19 Apr 2011 20:35:25 -0400
- To: "Costello, Roger L." <costello@mitre.org>
- Cc: "xmlschema-dev@w3.org" <xmlschema-dev@w3.org>
On Tue, 2011-04-19 at 16:34 -0400, Costello, Roger L. wrote:
> [a-z]{10} and [a-z]{20}

This is a nonsense in logic (the empty set) since no values have both exactly 10 and exactly 20 characters.

I am guessing you really mean "or" - in which case,

    ([a-z]{10})|([a-z]{20})

would be one way, and

    [a-z]{10}([a-z]{10})?

another, harder to generate automatically.

There are many more possible expressions, but the first given above is easiest, and for a Deeply Mystical Reason, the Schema WG did not forbid non-deterministic regular expressions in facets, despite the original claim that the UPA restriction was there in SGML because non-determinism was hard to implement...

> Unfortunately, there is no "and" operator in regex. So, any ideas on how to "and" arbitrary regex expressions?

There's no easy way; one hard way might be to deconstruct the two regular expressions into non-deterministic finite-state automata and then attempt to generate regular expression notation from the merged automata.

The most useful part would be that if you detect an empty intersection, you can short-circuit the process and say that since no value can satisfy the conjunction, there are no valid instances of the type.

I don't know if the XSD regular expression language is closed under a (putative) "and" operation. There may not always be a single expression to represent the result.

Liam

-- 
Liam Quin - XML Activity Lead, W3C, http://www.w3.org/People/Quin/
Pictures from old books: http://fromoldbooks.org/
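The points in the message can be illustrated with a minimal Python sketch. It is not part of the original message and rests on several assumptions: Python's `re` module stands in for the XSD regular expression language (the two agree on these simple patterns, and `re.fullmatch` mimics XSD's implicit anchoring), the helper names `matches_both`, `exact_length_dfa` and `intersection_is_empty` are hypothetical, and the automata are hand-built for patterns of the form `[a-z]{n}` only, with every letter a-z treated as a single input symbol. It shows the two "or" expressions, an "and" performed at validation time rather than inside one expression, and a product-automaton walk that detects the empty intersection.

```python
import re
from collections import deque

# "or": the two expressions from the message, matched against the whole
# string because XSD patterns are implicitly anchored.
OR_ALTERNATION = re.compile(r"([a-z]{10})|([a-z]{20})")
OR_OPTIONAL_TAIL = re.compile(r"[a-z]{10}([a-z]{10})?")


def matches_both(value, patterns):
    """Validation-time "and": the value must fully match every pattern."""
    return all(re.fullmatch(p, value) for p in patterns)


def exact_length_dfa(n):
    """Hand-built DFA for [a-z]{n} (assumption: one symbol class, a-z).

    A state is the number of letters read so far; n + 1 is a dead state.
    Returns (start_state, step_function, accepting_states).
    """
    def step(state):
        return min(state + 1, n + 1)
    return 0, step, {n}


def intersection_is_empty(dfa_a, dfa_b):
    """Product construction: walk reachable state pairs and look for a
    pair that is accepting in both automata."""
    start_a, step_a, accept_a = dfa_a
    start_b, step_b, accept_b = dfa_b
    seen = {(start_a, start_b)}
    queue = deque(seen)
    while queue:
        a, b = queue.popleft()
        if a in accept_a and b in accept_b:
            return False  # some string is accepted by both
        nxt = (step_a(a), step_b(b))
        if nxt not in seen:
            seen.add(nxt)
            queue.append(nxt)
    return True  # no jointly accepting pair is reachable


if __name__ == "__main__":
    for length in (10, 15, 20):
        s = "a" * length
        print(length,
              bool(OR_ALTERNATION.fullmatch(s)),
              bool(OR_OPTIONAL_TAIL.fullmatch(s)),
              matches_both(s, [r"[a-z]{10}", r"[a-z]{20}"]))
    print(intersection_is_empty(exact_length_dfa(10), exact_length_dfa(20)))
```

Turning a non-empty product automaton back into regular expression notation, the hard part the message mentions, is deliberately left out of this sketch.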
Received on Wednesday, 20 April 2011 00:35:28 UTC