Re: Implementing the DOM3 Val Spec in Javascript, problem with UPA and creating PSVI.

From: Casey Jordan <casey.jordan@jorsek.com>
Date: Sun, 2 May 2010 11:38:19 -0400
Message-ID: <q2xbf585bb61005020838mcd7d54a5wf76f228b2ffb6c8d@mail.gmail.com>
To: Michael Kay <mike@saxonica.com>
Cc: xmlschema-dev@w3.org

I agree, this situation does not violate the UPA rule. I should have been
more specific.

My question revolves around creating a PSVI from instances like this. Let's
look at the following example based on this schema, with possible insertions
noted in [element_name].

Example: This is what I would get by stepping through the pattern and
marking elements that can be inserted.

section <---------Satisfies the choice particle; however, I could technically
insert a <section/> above this.

Here's the issue: given this schema I should also be able to insert a
<section/> element after the h-sub, but it's not really possible to know that
unless we look ahead, especially if the <section/> element in the sequence
has a maxOccurs that is defined as something other than unbounded. This
problem gets really complex when structures get deeply nested.

Kevin Braun summed the problem up here using regular expressions:

*Hi Casey,

Just using reg exprs for convenience, suppose you have a grammar:

Sentence ::= 'Z' ( 'a' 'b'+ End | '1' 'b'+ '2' )
End ::= 'c' | '2'

Then consider these sentences:
Zabbbbbbbbbbbc and Zabbbbbbbb2.  In the first case, the 'a' cannot be
replaced because of the 'c' on the end.  In the other case, the 'a' may be
replaced with a '1', since there is a '2' on the end.  You can't determine
this without looking to the end of the potentially infinite string.

You can, however, figure out that a 'Z' may be followed by either 'a' or '1'
(there are sentences in which this occurs).  This is what is called a follow
set (as you probably know).

I would think that as I edit a document, if the editor is going to make
suggestions, it would suggest an 'a' and a '1' after a 'Z', and then mark
what is wrong, if something becomes wrong, after I make the edit.

Good luck!*
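
Kevin's grammar is small enough to check directly. Here is a minimal sketch in JavaScript, assuming the grammar can be transcribed as the regular expression below. The names `sentence` and `canReplaceFirstA` are only illustrative and do not come from Kevin's message.

```javascript
// Kevin's grammar transcribed as a regular expression:
//   Sentence ::= 'Z' ( 'a' 'b'+ End | '1' 'b'+ '2' )
//   End      ::= 'c' | '2'
const sentence = /^Z(?:ab+[c2]|1b+2)$/;

// Hypothetical helper: can the 'a' after 'Z' be replaced with '1'
// while the whole string stays a valid sentence?
function canReplaceFirstA(s) {
  if (!sentence.test(s) || s[1] !== 'a') return false;
  return sentence.test('Z1' + s.slice(2));
}

console.log(canReplaceFirstA('Zabbbbbbbbbbbc')); // false: the trailing 'c' rules out '1'
console.log(canReplaceFirstA('Zabbbbbbbb2'));    // true: '1' b+ '2' is a valid sentence
```

Both inputs share the same prefix, and the answer only changes at the last character, which is why the whole string has to be examined.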

So, based on this, I am left with a situation where, in order to determine
where elements can actually be inserted into the document, I have to do the
following:

1.) Assemble all possible elements that can be inserted before any given
element or appended to any given element.

2.) Insert these elements one by one into the instance and re-validate the
particle. If it fails validation, throw it out.

3.) Return all element names that did not cause the document to become
invalid on insertion.
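
The three steps above can be sketched as follows. This is only an illustration under some assumptions: the content model of my.type is hand-translated into a regular expression over child element names, and `contentModel`, `candidates`, `isValid` and `allowedInsertions` are hypothetical names, not part of any DOM3 Val API.

```javascript
// Content model of my.type as a regular expression over child element
// names, each name followed by a single space:
//   h, then (h-sub+ | section), then section*
const contentModel = /^h (?:(?:h-sub )+|section )(?:section )*$/;

// Step 1: element names that the type can contain at all.
const candidates = ['h', 'h-sub', 'section'];

function isValid(children) {
  return contentModel.test(children.map((name) => name + ' ').join(''));
}

// Steps 2 and 3: for every gap between children (including before the
// first and after the last), try each candidate and keep the names that
// leave the child sequence valid.
function allowedInsertions(children) {
  const result = [];
  for (let pos = 0; pos <= children.length; pos++) {
    const allowed = candidates.filter((name) => {
      const trial = [...children.slice(0, pos), name, ...children.slice(pos)];
      return isValid(trial);
    });
    result.push(allowed);
  }
  return result;
}

console.log(allowedInsertions(['h', 'h-sub', 'section']));
// gap 0: [], gap 1: ['h-sub'], gap 2: ['h-sub', 'section'], gap 3: ['section']
```

For the children h, h-sub, section this reports that a section can go after the h-sub, which is the case that needs look-ahead. The cost is one validation per gap per candidate, so it grows with both the number of children and the number of candidate names.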

This leaves me in a tricky spot since I am doing this all in JavaScript, and
this process could get really inefficient. I have tried to find an algorithm
that would allow me to do this more efficiently but haven't found anything.

Is there a standard way of creating a PSVI when using an FSA method that I
am missing? Or am I on the right track?

Thanks guys.



On Sat, May 1, 2010 at 6:45 AM, Michael Kay <mike@saxonica.com> wrote:

> > Suppose I have a schema with a type like this:
> >
> > <xs:complexType name="my.type" mixed="false">
> >     <xs:sequence>
> >         <xs:element ref="h"/>
> >         <xs:choice>
> >             <xs:element ref="h-sub" maxOccurs="unbounded" />
> >             <xs:element ref="section" />
> >         </xs:choice>
> >         <xs:element ref="section" minOccurs="0" maxOccurs="unbounded" />
> >     </xs:sequence>
> > </xs:complexType>
> >
> > When using finite automata, and the above pattern, while you can
> > determine if a document is valid, it would be impossible to determine if a
> > "section" element belonged to the xs:choice or the xs:sequence making it
> > also impossible to provide a complete PSVI.
>
> I'm having difficulty seeing the problem. A <section> that immediately
> follows the <h> can only satisfy the choice. A <section> that immediately
> follows an <h-sub> or another <section> can only satisfy the final particle.
> If the choice were optional or repeatable, this content model would violate
> UPA. (Though Saxon would actually allow it through, since Saxon only
> attributes element instances to declarations, not to particles, and in this
> case the two particles refer to the same element declaration.)
> Regards,
> Michael Kay
> http://www.saxonica.com/
> http://twitter.com/michaelhkay
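
Michael's point about which particle each <section> satisfies can be made concrete with a small deterministic automaton whose transitions carry a particle label. This is a sketch only: the state names, the particle labels such as 'choice/section', and the `attribute` function are made up for illustration, and the table is built by hand for my.type rather than generated from the schema.

```javascript
// Hand-built deterministic automaton for my.type. Each transition records
// which particle consumed the element (labels are illustrative only).
const transitions = {
  start:  { 'h':       { next: 'afterH', particle: 'sequence/h' } },
  afterH: { 'h-sub':   { next: 'inHSub', particle: 'choice/h-sub' },
            'section': { next: 'tail',   particle: 'choice/section' } },
  inHSub: { 'h-sub':   { next: 'inHSub', particle: 'choice/h-sub' },
            'section': { next: 'tail',   particle: 'sequence/section' } },
  tail:   { 'section': { next: 'tail',   particle: 'sequence/section' } },
};
const accepting = new Set(['inHSub', 'tail']);

// Returns the particle label for each child, or null if the sequence of
// children is not valid against the content model.
function attribute(children) {
  let state = 'start';
  const labels = [];
  for (const name of children) {
    const row = transitions[state];
    if (!Object.prototype.hasOwnProperty.call(row, name)) return null;
    labels.push(row[name].particle);
    state = row[name].next;
  }
  return accepting.has(state) ? labels : null;
}

console.log(attribute(['h', 'section', 'section']));
// the first section is labelled 'choice/section', the second 'sequence/section'
```

Because the content model satisfies UPA, every (state, element name) pair has at most one transition, so the labels fall out of a single left-to-right pass with no look-ahead.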

Casey Jordan
Jorsek Software LLC.
"CaseyDJordan" on LinkedIn, Twitter & Facebook
Cell (585) 771 0189
Office (585) 239 6060

Received on Sunday, 2 May 2010 15:38:52 UTC
