
RE: Impact of XML on Data Modeling

From: Tsao, Scott <scott.tsao@boeing.com>
Date: Wed, 30 Jan 2008 15:10:53 -0800
Message-ID: <C7A7D8EA54C20744BFF861613617222C06218EA9@XCH-NW-3V1.nw.nos.boeing.com>
To: <abcoates.work@yahoo.co.uk>, <xmlschema-dev@w3.org>
I was actually trying to focus my attention on the choice of XSD based
on this observation:

	XML is a better approach for design and implementation at the
data interchange level, e.g., specifying the interface 'protocol'
between two systems (or applications).  For example, in the SOA world,
XML would be very useful as the common modeling 'language' across the
data and process modeling perspectives.

And it is definitely for "data-oriented" applications.

So, having narrowed the scope of my question, do you have any further


Scott Tsao 
Associate Technical Fellow 
The Boeing Company 

-----Original Message-----
From: Anthony B. Coates (Work) [mailto:abcoates.work@yahoo.co.uk]
Sent: Wednesday, January 30, 2008 1:06 AM
To: xmlschema-dev@w3.org
Subject: Re: Impact of XML on Data Modeling

On Wed, 30 Jan 2008 04:33:28 -0000, Tsao, Scott <scott.tsao@boeing.com>
wrote:

> If these observations are correct, my next question would be: Is the
> W3C XML Schema the best choice on the market today for data modeling
> in the XML world?  (why or why not)

If your only concern is a single technology, then you can get away with
only using a physical model.  Which is to say that if XML is your only
concern, you could do your data modelling in an XML schema language (and
introducing a logical model might not be very beneficial in practice;
there is a cost to using layered models, and you generally only get a
payback on that cost (a) if you need to implement the same data model
across multiple technologies, e.g. databases and Java/C# as well as

As for which is best, my personal rule of thumb is that W3C XML Schema
is the best choice where you are dealing with "data-oriented" XML, i.e.
XML where there isn't much mixed content, and the sequencing of XML
child elements within a parent element is often not important to the
interpretation of the data.  By contrast, for "document-oriented" XML,
i.e. XML where there is a significant amount of mixed content, and the
sequencing of XML elements is usually important, I would suggest RELAX
NG (but I say that as someone who works almost exclusively in the
"data-oriented" world).
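
To make the "data-oriented" versus "document-oriented" distinction concrete, here is a minimal sketch in Python. It only tests for mixed content (text interleaved with child elements), which is one of the two criteria mentioned above; it does not look at element ordering. The function name and the two sample documents are hypothetical illustrations, not part of the original message.

```python
import xml.etree.ElementTree as ET


def has_mixed_content(root):
    """Return True if any element holds both child elements and
    non-whitespace text of its own (i.e. mixed content)."""
    for node in root.iter():
        children = list(node)
        if not children:
            continue  # leaf elements cannot be mixed
        # Text that appears before the first child element.
        if node.text and node.text.strip():
            return True
        # Text that follows a child element is stored as that child's tail.
        if any(child.tail and child.tail.strip() for child in children):
            return True
    return False


# Hypothetical samples: a record-like document and a prose-like one.
data_xml = "<order><id>42</id><qty>3</qty></order>"
doc_xml = "<p>Use <b>RELAX NG</b> for prose.</p>"

data_is_mixed = has_mixed_content(ET.fromstring(data_xml))
doc_is_mixed = has_mixed_content(ET.fromstring(doc_xml))
```

Under these assumptions the first sample has no mixed content and the second does, which is the rough signal that would steer the choice between the two schema languages as described above.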

That said, I've worked with customers who have large numbers of complex
W3C XML Schemas, and if there are lots of "includes" and "imports" that
introduce dependencies between those Schemas (as there often are), they 
can become difficult to understand and maintain using XML Schema
When things get to that scale, I find it works better to introduce a
higher-level model of some sort, so that the set of XML Schemas becomes
more like a repository of re-usable XML types.  Some UML tools now do a
good job of this, and I also had a lot of real-world success using IONA
Artix Data Services to create a repository of types from which I
generated hundreds of Schemas which shared types at the repository
level, but didn't have any Schema "includes", making them easier to
deploy and understand.

Note that this repository isn't a logical model, it's a physical model
that abstracts away one particular physical issue (which type is defined
in which file).
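
The repository idea can be sketched roughly as follows. This is not how IONA Artix Data Services works internally (the message does not say); it is a small Python illustration, with a hypothetical in-memory repository, of generating self-contained schemas that share type definitions at the repository level rather than through xs:include or xs:import.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical repository: type name -> (XSD fragment, types it depends on).
REPOSITORY = {
    "CurrencyCode": (
        '<xs:simpleType name="CurrencyCode">'
        '<xs:restriction base="xs:string"><xs:length value="3"/></xs:restriction>'
        "</xs:simpleType>",
        [],
    ),
    "Money": (
        '<xs:complexType name="Money"><xs:sequence>'
        '<xs:element name="amount" type="xs:decimal"/>'
        '<xs:element name="currency" type="CurrencyCode"/>'
        "</xs:sequence></xs:complexType>",
        ["CurrencyCode"],
    ),
    "Address": (
        '<xs:complexType name="Address"><xs:sequence>'
        '<xs:element name="city" type="xs:string"/>'
        "</xs:sequence></xs:complexType>",
        [],
    ),
}


def required_types(roots):
    """Collect the root types plus everything they depend on, transitively."""
    seen = set()
    stack = list(roots)
    while stack:
        name = stack.pop()
        if name in seen:
            continue
        seen.add(name)
        stack.extend(REPOSITORY[name][1])
    return sorted(seen)


def generate_schema(roots):
    """Emit one standalone schema that inlines every type it needs."""
    body = "\n".join("  " + REPOSITORY[name][0] for name in required_types(roots))
    return f'<xs:schema xmlns:xs="{XS}">\n{body}\n</xs:schema>'


payment_schema = generate_schema(["Money"])
ET.fromstring(payment_schema)  # raises ParseError if the output is not well-formed
```

Each generated file carries its own copy of the types it uses, so which type lives in which file stops being something a schema author has to track; the repository remains the single place where a type is defined.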

Perhaps that's a long way of saying that for larger scale projects, it
isn't just about the modelling language that you choose, it's also about
your methodology for working with large models with complex
interrelationships between types and other definitions.

Cheers, Tony.
Anthony B. Coates
London, UK
UK: +44 (20) 8816 7700, US: +1 (239) 344 7700
Mobile/Cell: +44 (79) 0543 9026
Received on Wednesday, 30 January 2008 23:11:25 UTC
