W3C home > Mailing lists > Public > xmlschema-dev@w3.org > April 2010

Implementing the DOM3 Val Spec in Javascript, problem with UPA and creating PSVI.

From: Casey Jordan <casey.jordan@jorsek.com>
Date: Fri, 30 Apr 2010 13:18:31 -0400
Message-ID: <w2sbf585bb61004301018g79a47116h70047ed8dbb48c69@mail.gmail.com>
To: xmlschema-dev@w3.org
Hey guys/gals,

Micheal Kay suggested that I posted a problem I am having here in the hopes
that someone might be able to help me.

I am creating an cross browser Open Source implementation of the DOM3
Validation Spec<http://www.w3.org/TR/2004/REC-DOM-Level-3-Val-20040127/DOM3-Val.html>,
at the moment its just a javascript implementation of a XSD validator and
PSVI interface that conforms to the standard.

I am using a method based on derivatives of regular expressions (
Deterministic finite automaton ) and have encountered a really tricky
problem which can be shown by the below example:

Suppose I have a schema with a type like this:

<xs:complexType name="my.type" mixed="false">
        <xs:sequence>
            <xs:element ref="h"/>
            <xs:choice>
                <xs:element ref="h-sub" maxOccurs="unbounded" />
                <xs:element ref="section" />
            </xs:choice>
            <xs:element ref="section" minOccurs="0" maxOccurs="unbounded" />
        </xs:sequence>
    </xs:complexType>

When using finite automata, and the above pattern, while you can determine
if a document is valid, it would be impossible to determine if a "section"
element belonged to the xs:choice or the xs:sequence making it also
impossible to provide a complete PSVI.

For instance suppose I wanted to know what could be added to the following
xml fragment governed by this pattern:

<h>
<section/>
<section/>

If we assumed that the first <section/> element satisfied the xs:choice,
then all we can do is add more <section/> elements, however if we assume
that both <section/> elements belong to the xs:sequence then its possible to
add an <h-sub/> element after the <h/>. This all becomes extremely complex
as we start nesting more patterns.

So all that being said, I've been racking my brain trying to determine if
there is an effective way to compute a correct and complete PSVI in a
situation where this occurs. Ideally without having to look ahead and
remaining efficient.


More Details - For those interested.
-------------------------

First I transform the schema into json , essentially patterns that represent
the FSA. So the above type would become the following particles:

{
   type:'sequence',
   minOccurs:0,
   maxOccurs:1,
   instance:[
      { type: 'element', ref: 'h', minOccurs:0,maxOccurs:1},
      { type: 'choice', minOccurs:0,maxOccurs:1
          instance:[
                          { type: 'element', ref: 'h-sub',
minOccurs:0,maxOccurs:1},
                          { type: 'element', ref: 'section',
minOccurs:0,maxOccurs:1},
                       ]

      }

  ]
}


Then to validate a source node I apply a DFSA stepping through the pattern
and matching it to the source instance. Elements that are 'missing' or could
be added are inserted into a PSVI which can be exposed to find out
information like:

Element.allowedNextSiblings
Element.allowedChildren
Element.allowedFirstChildren

etc etc. As the spec describes.

-- 
--
Casey Jordan
Jorsek Software LLC.
"CaseyDJordan" on LinkedIn, Twitter & Facebook
Cell (585) 771 0189
Office (585) 239 6060
Jorsek.com


This message is intended only for the use of the Addressee(s) and may
contain information that is privileged, confidential, and/or exempt from
disclosure under applicable law.  If you are not the intended recipient,
please be advised that any disclosure  copying, distribution, or use of
the information contained herein is prohibited.  If you have received
this communication in error, please destroy all copies of the message,
whether in electronic or hard copy format, as well as attachments, and
immediately contact the sender by replying to this e-mail or by phone.
Thank you.
Received on Friday, 30 April 2010 17:19:48 GMT

This archive was generated by hypermail 2.2.0+W3C-0.50 : Tuesday, 11 January 2011 00:15:19 GMT