Re: Help on XML Schema generation using XSLT

From: Michael Kay <mike@saxonica.com>
Date: Fri, 20 Aug 2010 09:22:34 +0100
Message-ID: <4C6E3B4A.3010903@saxonica.com>
To: "Cheney, Edward A SSG RES USAR USARC" <austin.cheney@us.army.mil>
CC: Silent lights <silentlights@yahoo.co.uk>, xmlschema-dev@w3.org
On 20/08/2010 00:58, Cheney, Edward A SSG RES USAR USARC wrote:
> Densil,
>
> I would say converting a basic XML document to a schema document is not probable unless there exists a certain quantity of known information

Actually there are a number of tools that do a quite passable job of 
generating a schema from an instance, including my own DTDGenerator from 
many years ago (still available on the Saxon page at Sourceforge). It 
demands some guesswork (or if we want to be more polite, heuristics) but 
it's possible to do a surprisingly good job. For example, my 
DTDGenerator uses rules like "generate an enumeration type if there are 
fewer than 20 distinct values and the number of actual values is at least 
ten times the number of distinct values". Of course the inferred schema 
will always be imperfect (it will allow some "invalid" documents, and 
disallow some "valid" ones, where "validity" is in the eye of the user) 
so it will need manual adjustment.
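
To make the enumeration rule quoted above concrete, here is a minimal 
sketch in Python. It is not taken from DTDGenerator; the function name 
looks_like_enumeration and the parameters max_distinct and min_ratio are 
hypothetical, with defaults chosen only to mirror the thresholds in the 
quoted rule.

```python
def looks_like_enumeration(values, max_distinct=20, min_ratio=10):
    """Guess whether the observed values of an attribute form an enumeration.

    values       -- every value seen for the attribute, duplicates included
    max_distinct -- hypothetical name: the distinct count must be below this
    min_ratio    -- hypothetical name: total values must be at least this
                    many times the number of distinct values
    """
    distinct = len(set(values))
    if distinct == 0:
        # Nothing was observed, so there is no evidence for an enumeration.
        return False
    return distinct < max_distinct and len(values) >= min_ratio * distinct


if __name__ == "__main__":
    # Two distinct values seen twenty times in total: passes both tests.
    print(looks_like_enumeration(["yes", "no"] * 10))
    # Two distinct values seen only four times: too little evidence.
    print(looks_like_enumeration(["yes", "no"] * 2))
```

A real tool would presumably apply a test like this per attribute name 
after collecting values across the whole instance, and the result would 
still need the manual review described above.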

Although there are quite a few such tools around, I'm not aware of any 
that are implemented in XSLT. But I think it would be perfectly 
reasonable to attempt to write one in XSLT.

I've always thought it would be a good idea for such a tool to allow 
multiple source instances to be supplied as input. In practice I've 
handled this by concatenating them within a wrapper element.
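
As an illustration of the wrapper-element approach just described, the 
following is a rough sketch using Python's standard xml.etree.ElementTree 
module. The function name wrap_instances and the wrapper element name 
"instances" are assumptions made for the example, not anything used by an 
existing tool.

```python
import xml.etree.ElementTree as ET


def wrap_instances(xml_strings, wrapper_name="instances"):
    """Combine several XML documents under one wrapper element.

    xml_strings  -- iterable of strings, each a well-formed XML document
    wrapper_name -- hypothetical name for the enclosing element
    """
    wrapper = ET.Element(wrapper_name)
    for text in xml_strings:
        # Parse each document and attach its root element as a child.
        wrapper.append(ET.fromstring(text))
    return wrapper


if __name__ == "__main__":
    docs = ["<order id='1'/>", "<order id='2'><item/></order>"]
    combined = wrap_instances(docs)
    # Serialize the combined tree so it can be fed to a schema generator.
    print(ET.tostring(combined, encoding="unicode"))
```

The generator then has to ignore the wrapper itself when emitting 
declarations, since it is not part of the real vocabulary.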

Michael Kay
Saxonica
Received on Friday, 20 August 2010 08:23:12 UTC
