RE: Is <any> really enough for <any><body>?

From: Aleksi Niemelä <aleksi.niemela@cinnober.com>
Date: Tue, 11 Jan 2000 19:10:31 +0100
Message-ID: <E536C8EE2A1FD31195370008C79FFA1F09B337@world.cinnober.com>
To: "'Arnold, Curt'" <Curt.Arnold@hyprotech.com>
Cc: "'xml-dev@ic.ac.uk'" <xml-dev@ic.ac.uk>, "'www-xml-schema-comments@w3.org'" <www-xml-schema-comments@w3.org>
Thank you for the enlightening discussion. You made me clear up my thoughts.
But first I'd like to note that nothing I suggest here is a real proposal
to change the current WD. Hopefully I will make it clear when I am making a
real proposal :).

The difference between documents

    <BinaryOps>
        <own_arbitrary_BinaryOperator>
            <op1/>
            <op2/>
        </own_arbitrary_BinaryOperator>
    </BinaryOps>

and 

    <BinaryOps>
        <BinaryOp name="own_arbitrary_BinaryOperator">
            <op name="op1"/>
            <op name="op2"/>
        </BinaryOp>
    </BinaryOps>

is in the degree of structure. The first focuses on the message, while the
latter has more structure (with fixed names you can trust).

We can currently write a schema for the latter, but then we fix the
possible names of the elements.

I would like to be able to say something more specific about the first one than
    <element name="BinaryOps">
        <type>
            <any/>
        </type>
    </element>

for several reasons:
1) I'm not saying <any>thing at all (just that there is some XML coming, so
don't be scared even though you don't know anything about the monster
beforehand)
2) I know more about the structure
3) I want to describe the required structure (so that others know it too)
4) I want somebody else to check that the user is getting the structure I
want (and I have a feeling that that somebody could (should?) be the XML
processor)
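
As a rough sketch of the kind of check the last point asks for, the following Python uses the standard library's xml.etree.ElementTree to accept any element name under BinaryOps, provided each such element has exactly two child elements. The function name `check_binary_ops` and the "exactly two operands" rule are assumptions made for illustration; nothing like this is defined in the current WD.

```python
import xml.etree.ElementTree as ET

def check_binary_ops(xml_text):
    """Return a list of problems; an empty list means the structure is accepted."""
    root = ET.fromstring(xml_text)
    if root.tag != "BinaryOps":
        return ["root element is <%s>, expected <BinaryOps>" % root.tag]
    problems = []
    for operator in root:  # any element name is accepted at this level
        operands = list(operator)
        if len(operands) != 2:  # assumed rule: a binary operator has two operands
            problems.append("<%s> has %d child elements, expected 2"
                            % (operator.tag, len(operands)))
    return problems

doc = """
<BinaryOps>
    <own_arbitrary_BinaryOperator>
        <op1/>
        <op2/>
    </own_arbitrary_BinaryOperator>
</BinaryOps>
"""
print(check_binary_ops(doc))  # prints []
```

The check says nothing about the operator's name, which is exactly the information a bare <any/> cannot constrain either; it only pins down the shape underneath it.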

When I explained my schema to my colleague, the first question he asked was

"why don't you say
    <element name="elem" type="elem_type">
        <element name="child">
            <type>
                <attribute name="attr" type="attr_datatype"/>
            </type>
        </element>
    </element>

in the way the structure appears in the documents you are trying to describe:

    <elem xschm:type="elem_type">
        <child attr="xschm:attr_datatype"/>
    </elem>

After all, it's all about the structure (and typing :).
" (quote ends :)

I couldn't answer.

    - Aleksi


-----Original Message-----
From: Arnold, Curt [mailto:Curt.Arnold@hyprotech.com]
Sent: 11 January 2000 18:09
To: 'Aleksi Niemelä'
Cc: 'xml-dev@ic.ac.uk'; 'www-xml-schema-comments@w3.org'
Subject: RE: Is <any> really enough for <any><body>?


Aleksi Niemelä [mailto:aleksi.niemela@cinnober.com] wrote:

> I don't know where to discuss schemas; I'd like to hear
> about a better place for this discussion.

I don't know any better public forum than xml-dev, but it seems like nobody
wants to get into the nitty-gritty (you know the stuff that makes it
usable).

> How can one express constraints on the <any> element? As I see it, the
> specification doesn't provide any way to do this. (Correct me,
> please! :)
> 
> What is the right way to express such a simple thing as "any element
> with child X"?
> 

Your examples seem somewhere in the middle between the use cases for
<import>/<include> and <any>.  

With import/include you are aware of some details of the other namespace and
are explicitly saying that specific elements from that namespace can appear
at specific places in content models in this namespace.

With any, you are not aware of the details of the namespace but you are
willing to accept schema-valid elements at specific places in content models
in this namespace.

Your situation seems to be somewhere in the middle, where you know enough
to constrain a wildcard, but not enough to include a group.  I'm not sure
this middle case occurs often enough to justify enhancements to the any
construct.

In this case, you don't know anything about the namespace other than the
fact that it has elements that apparently contain one of your elements, X.
Something like:

<x:tag1>
	<!-- we don't know about y, but allow its elements via an any content fragment -->
	<y:tag2>
	<!-- y knows about us or used an any construct to allow x:tag2 to appear -->
		<x:tag2/>
	</y:tag2>
</x:tag1>

It would of course be directly supportable if you flipped the content model
so it would be:

<x:tag1>
	<x:tag2>
		<y:tag2/>
	</x:tag2>
</x:tag1>

and it avoids an unnecessary shift in namespaces.

> I'd like to reform rule 1 in 3.5 Wildcards (and the rest accordingly):
> o Any well-formed XML element item with a specified type constraint:
>   any tag, any namespace, any attributes, any content, as long
>   as it's well-formed and satisfies the specified schema's type constraint.
> 
> I can come up with different ways to express the same thing:
> <any type='Type'/>
> <element type='Type'/>
> <element name='' type='Type'/>
> 
> But since regular expressions are now introduced, I'd love to use 
> them:
> 
> <element name='*' type='Type'/>
> <element name='*' ref="VaryingName1"/>
> 

I'd like to avoid this. Though it would be tolerable for a validator, it
would complicate schema-aware editors, documentation, etc.  I think that you
have to come down on the side of effectiveness of validation and simplicity
of parser support.  We can always have higher levels of abstraction and
useful editing shortcuts in our ultimate schema source that we then XSLT to
the schema that the validator uses.  In this case, you could write an XSLT
transform that expands your wildcard element into a choice group.
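
The expansion step could be sketched roughly as follows. This uses Python's xml.etree.ElementTree in place of XSLT, purely for illustration, and it only handles a parenthesised list of literal names such as '(and|or)'; an open-ended pattern such as 'own.+BinaryOperator' cannot be enumerated and is left untouched. The function names and the `<group order="choice">` output spelling are assumptions, not taken from the WD.

```python
import copy
import xml.etree.ElementTree as ET

def _expand(parent):
    for index, child in enumerate(list(parent)):
        _expand(child)  # rewrite nested content first
        name = child.get("name", "")
        if child.tag != "element" or not (name.startswith("(") and name.endswith(")")):
            continue
        alternatives = name[1:-1].split("|")
        if not all(alt.replace("_", "").isalnum() for alt in alternatives):
            continue  # open-ended pattern: cannot be enumerated, leave as is
        group = ET.Element("group", {"order": "choice"})  # assumed spelling
        group.tail = child.tail
        for alt in alternatives:
            # one plain element per alternative, keeping the other attributes
            member = ET.SubElement(group, "element", dict(child.attrib, name=alt))
            member.extend(copy.deepcopy(list(child)))
        parent[index] = group

def expand_name_alternation(schema_text):
    """Replace <element name='(a|b)' .../> with a choice group of plain elements."""
    root = ET.fromstring(schema_text)
    _expand(root)
    return ET.tostring(root, encoding="unicode")

schema = """
<element name='expect_varying_named_binary_operators'>
    <element name='(and|or)' type='BinaryOperator'/>
</element>
"""
print(expand_name_alternation(schema))
```

The validator then only ever sees ordinary named elements, which is the point of keeping the shortcut in the schema source rather than in the schema language.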


> Allowing references in the construct is very important!
> 
> Maybe the star wildcard is enough, but I can make real use of
> real patterns too:
> 
> <element name='expect_varying_named_binary_operators'>
>     <element name='(and|or|own.+BinaryOperator)' 
>              type='BinaryOperator'/>
> </element>
> 
> <type name='BinaryOperator'>
>     <element name='*' type='Operand'/>
>     <element name='*' type='Operand'/>
> </type>
> 
> ========
> document_written_by_some_completely_different_user.xml:
> ========
> <expect_varying_named_binary_operators>
>     <own_ultimately_weird_BinaryOperator>
> 	    <op1/>
> 	    <op2/>
>     </own_ultimately_weird_BinaryOperator>
> </expect_varying_named_binary_operators>
> 

This could be cleanly done with something like:

<BinaryOps>
	<BinaryOp>
		<Object class="http://www.xxx.com/myWeirdBinaryOperation"/>
		<Op/>
		<Op/>
	</BinaryOp>
</BinaryOps>

or:

<BinaryOps>
	<BinaryOp operator="xxxns:weirdOperator">
		<Op/>
		<Op/>
	</BinaryOp>
</BinaryOps>

Either of these enforces the structure in this namespace and then allows
alien behavior through little data islands that fit within that structure,
instead of expecting parts of alien namespaces to adhere to this namespace's
idea of structure.
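
A minimal sketch of what checking the second form involves, assuming the operator attribute is meant to carry a prefix:name value and that a BinaryOp always has exactly two Op children. The function name `check_fixed_binary_ops` and the simplified `QNAME` pattern are hypothetical and only approximate a real qualified-name check.

```python
import re
import xml.etree.ElementTree as ET

# prefix:localname, a simplified stand-in for a namespace-qualified name
QNAME = re.compile(r"[A-Za-z_][\w.-]*:[A-Za-z_][\w.-]*")

def check_fixed_binary_ops(xml_text):
    """Return a list of problems; an empty list means the structure is accepted."""
    root = ET.fromstring(xml_text)
    problems = []
    if root.tag != "BinaryOps":
        problems.append("root element is <%s>, expected <BinaryOps>" % root.tag)
    for op in root:
        if op.tag != "BinaryOp":  # element names are fixed in this design
            problems.append("unexpected element <%s>" % op.tag)
            continue
        operator = op.get("operator", "")
        if not QNAME.fullmatch(operator):  # the variable part lives in the value
            problems.append("operator %r is not of the form prefix:name" % operator)
        if [child.tag for child in op] != ["Op", "Op"]:
            problems.append("<BinaryOp> must contain exactly two <Op> children")
    return problems

doc = """
<BinaryOps>
    <BinaryOp operator="xxxns:weirdOperator">
        <Op/>
        <Op/>
    </BinaryOp>
</BinaryOps>
"""
print(check_fixed_binary_ops(doc))  # prints []
```

Here every element name is known in advance, so the structural rules need no wildcard at all; only the attribute value varies between operators.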

I think that the current <any> and <anyAttribute> are pretty close to being
able to support the need and their "limitations" cause you to come up with a
better solution.

Received on Tuesday, 11 January 2000 13:11:07 GMT
