- From: Casey Jordan <casey.jordan@jorsek.com>
- Date: Fri, 30 Apr 2010 13:18:31 -0400
- To: xmlschema-dev@w3.org
- Message-ID: <w2sbf585bb61004301018g79a47116h70047ed8dbb48c69@mail.gmail.com>
Hey guys/gals, Micheal Kay suggested that I posted a problem I am having here in the hopes that someone might be able to help me. I am creating an cross browser Open Source implementation of the DOM3 Validation Spec<http://www.w3.org/TR/2004/REC-DOM-Level-3-Val-20040127/DOM3-Val.html>, at the moment its just a javascript implementation of a XSD validator and PSVI interface that conforms to the standard. I am using a method based on derivatives of regular expressions ( Deterministic finite automaton ) and have encountered a really tricky problem which can be shown by the below example: Suppose I have a schema with a type like this: <xs:complexType name="my.type" mixed="false"> <xs:sequence> <xs:element ref="h"/> <xs:choice> <xs:element ref="h-sub" maxOccurs="unbounded" /> <xs:element ref="section" /> </xs:choice> <xs:element ref="section" minOccurs="0" maxOccurs="unbounded" /> </xs:sequence> </xs:complexType> When using finite automata, and the above pattern, while you can determine if a document is valid, it would be impossible to determine if a "section" element belonged to the xs:choice or the xs:sequence making it also impossible to provide a complete PSVI. For instance suppose I wanted to know what could be added to the following xml fragment governed by this pattern: <h> <section/> <section/> If we assumed that the first <section/> element satisfied the xs:choice, then all we can do is add more <section/> elements, however if we assume that both <section/> elements belong to the xs:sequence then its possible to add an <h-sub/> element after the <h/>. This all becomes extremely complex as we start nesting more patterns. So all that being said, I've been racking my brain trying to determine if there is an effective way to compute a correct and complete PSVI in a situation where this occurs. Ideally without having to look ahead and remaining efficient. More Details - For those interested. ------------------------- First I transform the schema into json , essentially patterns that represent the FSA. So the above type would become the following particles: { type:'sequence', minOccurs:0, maxOccurs:1, instance:[ { type: 'element', ref: 'h', minOccurs:0,maxOccurs:1}, { type: 'choice', minOccurs:0,maxOccurs:1 instance:[ { type: 'element', ref: 'h-sub', minOccurs:0,maxOccurs:1}, { type: 'element', ref: 'section', minOccurs:0,maxOccurs:1}, ] } ] } Then to validate a source node I apply a DFSA stepping through the pattern and matching it to the source instance. Elements that are 'missing' or could be added are inserted into a PSVI which can be exposed to find out information like: Element.allowedNextSiblings Element.allowedChildren Element.allowedFirstChildren etc etc. As the spec describes. -- -- Casey Jordan Jorsek Software LLC. "CaseyDJordan" on LinkedIn, Twitter & Facebook Cell (585) 771 0189 Office (585) 239 6060 Jorsek.com This message is intended only for the use of the Addressee(s) and may contain information that is privileged, confidential, and/or exempt from disclosure under applicable law. If you are not the intended recipient, please be advised that any disclosure copying, distribution, or use of the information contained herein is prohibited. If you have received this communication in error, please destroy all copies of the message, whether in electronic or hard copy format, as well as attachments, and immediately contact the sender by replying to this e-mail or by phone. Thank you.
Received on Friday, 30 April 2010 17:19:48 UTC