Re: ASN.1 => XML Schema questions

G'day,

Dan Connolly wrote:
> 
> "K.Kawaguchi" wrote:
> >
> > > I know that N! alternatives sounds daunting when performing schema
> > > validation -- is this why <all> cannot have repetitions or be nested
> > > within a <sequence> ?
> >
> > There are algorithms that can validate <all> nested within <all>, or
> > whatever (see http://www.thaiopensource.com/relaxng/jing.html for
> > example), but it's just that W3C XML Schema decided not to allow them
> > for some reason.
> 
> I'm pretty sure the reason is that in W3C XML Schema validation,
> the result includes not just a "yes, this is valid"/"no, not valid"
> but also "and this part of the input matched this part of the
> schema" i.e. "it has this type, is associated with this annotation"
> etc.

I do not see why a repetition of <all> could not be parsed correctly
while still yielding the necessary type and schema information.

For example:

<complexType name="T">
<all minOccurs="0" maxOccurs="unbounded">
    <element name="x" type="integer"/>
    <element name="y" type="integer"/>
    <element name="z" type="integer" minOccurs="0"/>
</all>
</complexType>

So given the following input:

<T><x>1</x><y>2</y><z>3</z><z>1</z><y>2</y><x>3</x><y>1</y><x>2</x><y>1</y><z>2</z></T>

We can parse this (and decorate the syntax tree) as follows:

complexType: T
content: list
   item 1 = <x>1</x><y>2</y><z>3</z>
   item 2 = <z>1</z><y>2</y><x>3</x>
   item 3 = <y>1</y><x>2</x>
   item 4 = <y>1</y><z>2</z> <== error: missing element 'x'
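
To make the grouping above concrete, here is a minimal Python sketch (not
anything an XML Schema processor actually does). It assumes a greedy rule:
a new item starts whenever the next child's name already occurs in the
current item. The names ALL_GROUP and split_repeated_all are hypothetical,
made up for this illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical model of the <all> group in type T: element name -> required?
ALL_GROUP = {"x": True, "y": True, "z": False}

def split_repeated_all(xml_text, group=ALL_GROUP):
    """Greedily split the root's children into repetitions of the group.

    Assumption: a new item begins when the next child's name is already
    present in the current item. Returns a list of
    (names_in_item, missing_required_names) tuples.
    """
    items, current = [], []
    for child in ET.fromstring(xml_text):
        if child.tag not in group:
            raise ValueError(f"unexpected element {child.tag!r}")
        if child.tag in current:
            # Duplicate name: close the current item and start a new one.
            items.append(current)
            current = []
        current.append(child.tag)
    if current:
        items.append(current)
    return [
        (names, sorted(n for n, req in group.items() if req and n not in names))
        for names in items
    ]

doc = ("<T><x>1</x><y>2</y><z>3</z><z>1</z><y>2</y><x>3</x>"
       "<y>1</y><x>2</x><y>1</y><z>2</z></T>")

for number, (names, missing) in enumerate(split_repeated_all(doc), start=1):
    note = f"  <== error: missing {missing}" if missing else ""
    print(f"item {number} = {names}{note}")
```

Each child ends up in exactly one item, so type information could in
principle still be attached per element and per item.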

I'm sure such reasoning applies to other repetition types as well -- how
does XML Schema handle an unbounded sequence containing a single
element, which is itself unbounded? Is it {x*} or {x}* ?
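
The ambiguity I have in mind can be shown with a small Python sketch. It
only enumerates the possible groupings; it says nothing about which one an
XML Schema processor would pick. The function name groupings is
hypothetical.

```python
def groupings(n):
    """Yield every way to split n consecutive <x> elements into non-empty runs.

    Each run stands for one iteration of the outer repetition; the run
    length is how many <x> elements the inner repetition consumed.
    """
    if n == 0:
        yield []
        return
    for first in range(1, n + 1):
        for rest in groupings(n - first):
            yield [first] + rest

# Three <x> elements in a row can be grouped in four different ways.
for split in groupings(3):
    print(split)
```

All of these groupings accept the same input; they differ only in the
parse tree that results.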
 
> [I wish we had revised our requirements document to point
> this out more clearly; it's a requirement that motivates
> a lot of decisions that otherwise don't look nice.]

I'm not an XML Schema expert, so there may be other issues I'm unaware
of, but I'm still not convinced it is a problem. I'd like to see some
examples or theoretical details demonstrating why this is not allowed.
 
> > So your options are either
> >
> > - stick to W3C XML Schema and make a compromise by using (a|b)* rather
> >   than (ab|ba)*.

Will not work in the general case.
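
A quick Python sketch of the difference, using regular expressions over
element names as a stand-in for content models (an assumption made purely
for illustration; the helper name accepts is hypothetical):

```python
import re

# (a|b)* is the proposed compromise; (ab|ba)* is what is actually wanted.
loose = re.compile(r"(?:a|b)*")
strict = re.compile(r"(?:ab|ba)*")

def accepts(pattern, names):
    """Return True if the whole sequence of element names matches the pattern."""
    return pattern.fullmatch("".join(names)) is not None

for seq in (["a", "b", "b", "a"], ["a", "a"], ["a"]):
    print(seq, "loose:", accepts(loose, seq), "strict:", accepts(strict, seq))
```

The looser model accepts sequences such as "a a" or a lone "a" that the
intended model rejects, so the missing-element checks would have to be
done outside the schema.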

> > - or switch to another schema language that allows you to express what
> >   you want.

I've already got a schema language (ASN.1) that I want to translate to
XML Schema -- the main sticking point is collections of SET (ie <all>)
without requiring the definition of additional named complex types. 

I've seen proposals that translate the ASN.1 type "SEQUENCE OF T" to the
following XML Schema:

<complexType>
<sequence>
  <element name="Item" type="T" minOccurs="0" maxOccurs="unbounded"/>
</sequence>
</complexType>

ie "<Item><T>...</T></Item><Item><T>...</T></Item>"

Which can lead to name conflicts, especially if there is more than one
SEQUENCE OF ... definition. 

And it gets worse when the target type of the SEQUENCE OF is anonymous
(such as SEQUENCE OF SEQUENCE OF ...) . I won't go into the gory details
here.

Replacing that ugly set-up with <all minOccurs="0"
maxOccurs="unbounded"> would solve these problems, IMHO. The problem is
that XML Schema will not allow it. If there are good theoretical reasons
why, then I'll try another solution or stick with the ugly,
counter-intuitive method already proposed.

But if anyone is serious about translating ASN.1 schema to XML schema
(and why not -- think of all the protocols specified in ASN.1) then
allowing repetition of <all> would help immensely.
 
> Keep in mind that the other languages won't give you type/annotation
> info as a result of checking.

I'm not sure I understand this -- do you mean that other schema languages
are not amenable to semantic checking?

Cheers,
Geoff 
-- 
Geoffrey Elgey  ph: +61-7-38641487  Distributed Systems Technology Centre
Security Unit   fax:+61-7-38641282  QUT, Brisbane, Australia
                http://www.dstc.edu.au
DSTC is the Australian W3C Office   email: elgey@dstc.edu.au

Received on Tuesday, 26 June 2001 18:18:36 UTC