- From: Casey Jordan <casey.jordan@jorsek.com>
- Date: Fri, 30 Apr 2010 15:39:58 -0400
- To: Kevin Braun <kbraun@obj-sys.com>, xmlschema-dev@w3.org
- Message-ID: <q2zbf585bb61004301239h8f5c4f5aw658ec60c105a57b1@mail.gmail.com>
Kevin,

Thank you for the quick reply. I have read the spec several times now and still have some of the same questions you do. However, since I have not been able to get in contact with the editors of the spec, I have made a logical assumption, as you did. My assumption, like yours, is based on the question "What would a user of an editor want to receive?" Most likely they will want to know what they can add, remove, or move next. In my opinion, exposed interfaces like allowedNextSiblings should supply a quick way to read the PSVI and give the user options as to what they can edit next and how. If this is not the case, I need to implement features like this for my parent project (which is an editor) anyway.

I guess the next big question is: in situations like the ones you and I have outlined, how do we determine the attribution of a particle efficiently? Right now I validate the document using a DFSA method, and as I do that I build the PSVI. However, because of these "fuzzy" attributions, there may be elements that, if added, will change the attribution and make the document invalid. Thus, to be totally sure that my "allowedNextSiblings" are accurate, I would need to actually insert them into the particle being validated and double-check. This would be an efficiency nightmare from the standpoint of a web-based editor.

I feel like there has to be an elegant solution here; I just haven't written anything like this before. I am hoping someone with a little more experience might be able to shed some light on the problem. In the meantime I am going to run some test cases where I use a double pass and see just how inefficient it might be.

Thanks again!

On Fri, Apr 30, 2010 at 2:49 PM, Kevin Braun <kbraun@obj-sys.com> wrote:

> Hi Casey,
>
> If I follow you, your question is how to determine, for example, what should be in the allowedNextSiblings attribute.
> The description in the DOM3 Validation Spec (which I am not familiar with) says:
>
>   allowedNextSibling: A NameList, as described in [DOM Level 3 Core <http://www.w3.org/TR/2004/REC-DOM-Level-3-Val-20040127/DOM3-Val.html#references-DOMCore>], of all element information items or wildcards <http://www.w3.org/TR/2004/REC-DOM-Level-3-Val-20040127/DOM3-Val.html#validation-VAL-Interfaces-ElementEditVAL> that can be inserted as a next sibling of this element, or null if this element has no context or schema. Duplicate pairs of {namespaceURI, name} are eliminated.
>
> My question is: what does it mean to say "Y can be inserted as a next sibling of X"? Does that mean "what can I change the next sibling into without making this document invalid", or "what can I insert after X without changing anything else and still have a valid document", or "according to the grammar, what are all the things that can possibly follow X in any valid sentence"? For example, suppose you had something like:
>
> <xs:sequence>
>   <xs:element name="one"/>
>   <xs:choice>
>     <xs:sequence>
>       <xs:element name="two"/>
>       <xs:element name="alpha"/>
>       <xs:element name="three"/>
>     </xs:sequence>
>     <xs:sequence>
>       <xs:element name="A"/>
>       <xs:element name="alpha"/>
>       <xs:element name="B"/>
>     </xs:sequence>
>   </xs:choice>
> </xs:sequence>
>
> Given <one><two><alpha><three>, the allowedNextSiblings for <one> is {<two>, <A>} if you assume any other necessary changes will be made; it is {} if you assume no other changes will be made; it is {<two>} if you assume you are talking about replacing the current next sibling.
>
> In your example, inserting an <h-sub> is valid; it just happens to change the particle attribution of the section element. It seems it does belong in h's allowedNextSibling, under any interpretation. What if <h-sub> were already there? You can't insert another one, so is <h-sub> still in the allowedNextSiblings?
>
> I hope that helps some.
> Perhaps something somewhere better explains what allowedNextSiblings means, but I didn't see it based on a quick look at the spec. My guess is it is more along the lines of trying to expose the aspects of the grammar so as to let an editor give suggestions to a user, even if making an edit might produce an invalid document (i.e., what could possibly follow, without respect to what actually does follow).
>
> My apologies if this is completely useless due to my unfamiliarity with the DOM3 Validation spec.
>
> Regards,
> Kevin
>
> --
> Objective Systems, Inc.
> REAL WORLD ASN.1 AND XML SOLUTIONS
> Tel: +1 (484) 875-9841
> Fax: +1 (484) 875-9830
> Toll-free: (877) 307-6855 (USA only)
> http://www.obj-sys.com
>
> On 4/30/2010 1:18 PM, Casey Jordan wrote:
>
> Hey guys/gals,
>
> Michael Kay suggested that I post a problem I am having here in the hope that someone might be able to help me.
>
> I am creating a cross-browser open source implementation of the DOM3 Validation Spec <http://www.w3.org/TR/2004/REC-DOM-Level-3-Val-20040127/DOM3-Val.html>. At the moment it's just a JavaScript implementation of an XSD validator and PSVI interface that conforms to the standard.
>
> I am using a method based on derivatives of regular expressions (deterministic finite automaton) and have encountered a really tricky problem, which can be shown by the example below.
>
> Suppose I have a schema with a type like this:
>
> <xs:complexType name="my.type" mixed="false">
>   <xs:sequence>
>     <xs:element ref="h"/>
>     <xs:choice>
>       <xs:element ref="h-sub" maxOccurs="unbounded"/>
>       <xs:element ref="section"/>
>     </xs:choice>
>     <xs:element ref="section" minOccurs="0" maxOccurs="unbounded"/>
>   </xs:sequence>
> </xs:complexType>
>
> When using finite automata and the above pattern, you can determine whether a document is valid, but it would be impossible to determine if a "section" element belonged to the xs:choice or the xs:sequence, making it also impossible to provide a complete PSVI.
>
> For instance, suppose I wanted to know what could be added to the following XML fragment governed by this pattern:
>
> <h/>
> <section/>
> <section/>
>
> If we assumed that the first <section/> element satisfied the xs:choice, then all we can do is add more <section/> elements. However, if we assume that both <section/> elements belong to the xs:sequence, then it's possible to add an <h-sub/> element after the <h/>. This all becomes extremely complex as we start nesting more patterns.
>
> So, all that being said, I've been racking my brain trying to determine if there is an effective way to compute a correct and complete PSVI in a situation where this occurs, ideally without having to look ahead and while remaining efficient.
>
> More Details - For those interested.
> -------------------------
>
> First I transform the schema into JSON, essentially patterns that represent the FSA.
> So the above type would become the following particles:
>
> {
>   type: 'sequence',
>   minOccurs: 0,
>   maxOccurs: 1,
>   instance: [
>     { type: 'element', ref: 'h', minOccurs: 0, maxOccurs: 1 },
>     { type: 'choice', minOccurs: 0, maxOccurs: 1,
>       instance: [
>         { type: 'element', ref: 'h-sub', minOccurs: 0, maxOccurs: 1 },
>         { type: 'element', ref: 'section', minOccurs: 0, maxOccurs: 1 },
>       ]
>     }
>   ]
> }
>
> Then, to validate a source node, I apply a DFSA, stepping through the pattern and matching it to the source instance. Elements that are 'missing' or could be added are inserted into a PSVI, which can be exposed to find out information like:
>
> Element.allowedNextSiblings
> Element.allowedChildren
> Element.allowedFirstChildren
>
> etc., as the spec describes.
>
> --
> Casey Jordan
> Jorsek Software LLC.
> "CaseyDJordan" on LinkedIn, Twitter & Facebook
> Cell (585) 771 0189
> Office (585) 239 6060
> Jorsek.com
>
> This message is intended only for the use of the Addressee(s) and may contain information that is privileged, confidential, and/or exempt from disclosure under applicable law. If you are not the intended recipient, please be advised that any disclosure, copying, distribution, or use of the information contained herein is prohibited. If you have received this communication in error, please destroy all copies of the message, whether in electronic or hard copy format, as well as attachments, and immediately contact the sender by replying to this e-mail or by phone. Thank you.

--
Casey Jordan
Jorsek Software LLC.
"CaseyDJordan" on LinkedIn, Twitter & Facebook
Cell (585) 771 0189
Office (585) 239 6060
Jorsek.com
Received on Friday, 30 April 2010 19:40:43 UTC
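
One way to read the "insert and double check" idea in this thread is sketched below. It is not the implementation described in the message, only a minimal illustration under stated assumptions: content models are reduced to element names, sequence, choice, and repetition (no wildcards, namespaces, or xs:all), and every name in it (`el`, `seq`, `alt`, `star`, `plus`, `deriv`, `nullable`, `first`, `allowedNext`) is hypothetical. It uses regular-expression derivatives, which never commit to a particle attribution, so each candidate is checked against the edited child sequence instead of the attribution recorded for the current one. It reports two of the interpretations Kevin lists: what the grammar allows next (`possible`) and what can be inserted without any other change (`insertable`).

```javascript
// Content-model terms. EMPTY matches nothing; EPS matches the empty sequence.
const EMPTY = { kind: 'empty' };
const EPS = { kind: 'eps' };
const el = (name) => ({ kind: 'el', name });
const star = (item) => ({ kind: 'star', item });

// Smart constructors keep derivatives small by simplifying as they build.
function mkSeq(a, b) {
  if (a.kind === 'empty' || b.kind === 'empty') return EMPTY;
  if (a.kind === 'eps') return b;
  if (b.kind === 'eps') return a;
  return { kind: 'seq', a, b };
}
function mkAlt(a, b) {
  if (a.kind === 'empty') return b;
  if (b.kind === 'empty') return a;
  return { kind: 'alt', a, b };
}
const seq = (...items) => items.reduceRight((rest, item) => mkSeq(item, rest), EPS);
const alt = (...items) => items.reduce((acc, item) => mkAlt(acc, item), EMPTY);
const plus = (item) => mkSeq(item, star(item)); // maxOccurs="unbounded", minOccurs=1

// True if the model accepts the empty sequence (nothing more is required).
function nullable(p) {
  switch (p.kind) {
    case 'eps': case 'star': return true;
    case 'seq': return nullable(p.a) && nullable(p.b);
    case 'alt': return nullable(p.a) || nullable(p.b);
    default: return false; // 'empty' and 'el'
  }
}

// Derivative: the model that remains after consuming one child element.
function deriv(p, name) {
  switch (p.kind) {
    case 'el': return p.name === name ? EPS : EMPTY;
    case 'seq': {
      const left = mkSeq(deriv(p.a, name), p.b);
      return nullable(p.a) ? mkAlt(left, deriv(p.b, name)) : left;
    }
    case 'alt': return mkAlt(deriv(p.a, name), deriv(p.b, name));
    case 'star': return mkSeq(deriv(p.item, name), p);
    default: return EMPTY; // 'empty' and 'eps'
  }
}
const derivAll = (p, names) => names.reduce((q, n) => deriv(q, n), p);

// Element names that can start a match of the model.
function first(p, out = new Set()) {
  switch (p.kind) {
    case 'el': out.add(p.name); break;
    case 'seq': first(p.a, out); if (nullable(p.a)) first(p.b, out); break;
    case 'alt': first(p.a, out); first(p.b, out); break;
    case 'star': first(p.item, out); break;
  }
  return out;
}

// Candidates for a new sibling right after children[index].
function allowedNext(model, children, index) {
  const state = derivAll(model, children.slice(0, index + 1));
  const rest = children.slice(index + 1);
  const possible = [...first(state)].sort();
  // Keep a candidate only if the untouched remainder still validates after it.
  const insertable = possible.filter((n) => nullable(derivAll(deriv(state, n), rest)));
  return { possible, insertable };
}

// The my.type example: h, (h-sub+ | section), section*
const caseyModel = seq(el('h'), alt(plus(el('h-sub')), el('section')), star(el('section')));
console.log(allowedNext(caseyModel, ['h', 'section', 'section'], 0));
// possible: ['h-sub', 'section'], insertable: ['h-sub', 'section']

// Kevin's example: one, ((two, alpha, three) | (A, alpha, B))
const kevinModel = seq(el('one'), alt(seq(el('two'), el('alpha'), el('three')),
                                      seq(el('A'), el('alpha'), el('B'))));
console.log(allowedNext(kevinModel, ['one', 'two', 'alpha', 'three'], 0));
// possible: ['A', 'two'], insertable: []
```

The derivative after the prefix is computed once and shared by all candidates, so the per-candidate cost is one pass over the remaining siblings. Whether that is fast enough for a web-based editor on large content models is not something this sketch measures.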