RE: Newbie: Question about XSD

From: Michael Kay <mike@saxonica.com>
Date: Mon, 14 Jul 2008 09:13:12 +0100
To: "'Mathieu Malaterre'" <mathieu.malaterre@gmail.com>, <xmlschema-dev@w3.org>
Message-ID: <22CCFC06FB7643748851CBA43BE8BD2E@Sealion>

>   Apologies if these are really simple questions. If 
> someone could suggest a beginner tutorial for XSD, I'd appreciate it.

There's always more than one answer when designing XML vocabularies, and
none of them is necessarily the best or only correct answer. But I'll tell
you what I would do. This is mainly based on ease of processing and ease of
validation, not necessarily performance or ease of authoring.
> 
> 
> A.
>   In my document I have a set of entries defined by a pair of 
> unsigned short (unique within a document). How would one 
> represent them in XML ?
> 
> 1. Separate them:
> <entry group="0010" element="0010" />
> 2. Group them:
> <entry group-element="00100010" />
> 3. Group them with a comma:
> <entry group-element="0010,0010" />

(1) will be easier to validate and easier to process. However, if the
composite syntax is widely used in the user community and in other
applications (as with dates written in the form 2008-12-05), then a
composite representation can be justified.
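
As a rough illustration of why (1) is easier to validate, here is a minimal
XSD 1.0 sketch. It assumes a hypothetical wrapper element named "entries" and
takes "unsigned short" literally as xs:unsignedShort; if the four-digit values
are really hexadecimal, a pattern facet on xs:string would be needed instead.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- "entries" is a hypothetical container, not part of the original question -->
  <xs:element name="entries">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="entry" maxOccurs="unbounded">
          <xs:complexType>
            <!-- each half of the pair gets its own typed attribute -->
            <xs:attribute name="group" type="xs:unsignedShort" use="required"/>
            <xs:attribute name="element" type="xs:unsignedShort" use="required"/>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
    <!-- the (group, element) pair must be unique within the document -->
    <xs:unique name="uniqueEntry">
      <xs:selector xpath="entry"/>
      <xs:field xpath="@group"/>
      <xs:field xpath="@element"/>
    </xs:unique>
  </xs:element>
</xs:schema>
```

With separate attributes, both the range check and the uniqueness rule fall
out of standard schema features. Note that the comparison is by value, so
"0010" and "10" would count as the same number under this typing.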
> 
> 
> B.
>   I have to handle entries whose value can be multiple; how 
> would one do that?
> 1. As attribute:
> <entry>1,2,3</entry>
> 
> 2. As element:
> <entry>
>   <value>1</value>
>   <value>2</value>
>   <value>3</value>
> </entry>

(2) will be easier to validate and to process.
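
A minimal XSD 1.0 sketch of option (2), assuming (purely for illustration)
that the values are integers as in your example:

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="entry">
    <xs:complexType>
      <xs:sequence>
        <!-- one or more values; xs:integer is an assumed type -->
        <xs:element name="value" type="xs:integer" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

Each value is then validated individually, and processing code can iterate
over the value elements without parsing a comma-separated string.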
> 
> C.
>   What if an entry contains another entry? Should it be a 
> different attribute?
> 
> 1.
> <entry group-element="1234,5678"> <!-- special group-element value -->
>   <entry group-element="0010,0010">foobar</entry>
> </entry>
> 
> Or:
> 2.
> <sequence group-element="1234,5678">
>   <entry group-element="0010,0010">foobar</entry>
> </sequence>

That's difficult to answer without knowing anything about the semantics.
It's difficult even if you do know the semantics: if you're modelling a file
hierarchy, should you use "file" for the leaf nodes and "folder" for
everything else, or should you use "file" at all levels? Similarly "manager"
and "employee". There's no right answer. 
> 
> 
> D. Is there a way to express that a particular entry (let say
> 0010,0010) must be present, but value is allowed to be empty.

It depends a little on what you mean. You can define such a rule in XML
Schema 1.0 at the level of types, but not for specific instances - that is,
you can't say that some entries can be empty and others can't, based on
their group-element value. XSD 1.1 allows this, though.
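
For what it's worth, here is a sketch of how this might look in XSD 1.1,
using conditional type assignment and an assertion. The syntax follows the
XSD 1.1 specification as eventually published, and the names "entries",
"entryBase" and "nonEmptyEntry" are hypothetical; treat it as an illustration
under those assumptions, not a tested schema.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- base type: string content (may be empty) plus the identifying attribute -->
  <xs:complexType name="entryBase">
    <xs:simpleContent>
      <xs:extension base="xs:string">
        <xs:attribute name="group-element" type="xs:string" use="required"/>
      </xs:extension>
    </xs:simpleContent>
  </xs:complexType>

  <!-- restriction of the base type that forbids empty content -->
  <xs:complexType name="nonEmptyEntry">
    <xs:simpleContent>
      <xs:restriction base="entryBase">
        <xs:minLength value="1"/>
      </xs:restriction>
    </xs:simpleContent>
  </xs:complexType>

  <xs:element name="entries">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="entry" type="entryBase" maxOccurs="unbounded">
          <!-- entry 0010,0010 may be empty; every other entry may not -->
          <xs:alternative test="@group-element = '0010,0010'" type="entryBase"/>
          <xs:alternative type="nonEmptyEntry"/>
        </xs:element>
      </xs:sequence>
      <!-- entry 0010,0010 must be present somewhere in the document -->
      <xs:assert test="entry[@group-element = '0010,0010']"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

The xs:alternative elements select the type from the attribute value, which
is the instance-level rule that XSD 1.0 cannot express; the xs:assert covers
the "must be present" half of the requirement.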
> 
> E. Bonus question (might not be that important):
>   What happens if one of the <entry> contains a jpeg file ? Is 
> there anything that can be done to validate it with some kind 
> of external codec ?

Not within the scope of XSD 1.0 validation. Other validation technologies
can do it. The subset of XSD 1.1 supported in Saxon-SA 9.1 allows
assertions that call out to external Java code, but you're straying from
the standards if you do this.

Michael Kay
http://www.saxonica.com/
Received on Monday, 14 July 2008 08:25:18 GMT
