- From: Matthieu Ricaud-Dussarget <matthieu.ricaud@igs-cp.fr>
- Date: Fri, 20 Aug 2010 12:50:05 +0200
- To: Michael Kay <mike@saxonica.com>
- CC: "Cheney, Edward A SSG RES USAR USARC" <austin.cheney@us.army.mil>, Silent lights <silentlights@yahoo.co.uk>, xmlschema-dev@w3.org
Hi Densil,

I think it's a pretty good idea, especially using a collection of XML files to better detect the overall XML structure. We can assume that if someone needs a schema, there will be more than one file of that type.

XSLT, as a functional language, may be quite helpful, and XSLT 2.0 might help even more. As far as I know, if you use a Java XSLT processor, you can call Java classes from within your XSLT. Then you are no longer restricted to the functional style and can mix in procedural code (but then your XSLT will no longer be platform independent...).

Do you already know what kind of "schema" you would like to generate? The analysis might be very different depending on whether you want to generate a DTD, a W3C schema or a Relax NG one.

I've done something similar for Relax NG, but it's a dumb XSLT: it only generates elements with attributes in the order they come, no guesswork... er, heuristics ;) But within the XML input you can add a few lines of Relax NG, which will be reproduced (within the generated elements or vice versa). I don't think it's what you want. It's actually just a tool to write schemas more easily and faster, because one only deals with the structural logic, not the obvious things. But you can customize everything (the more you do, the less automatic it is, and in the end you could almost write the schema yourself in the XML, and the tool becomes useless).

Lastly, I think XMLSpy has a W3C schema generator; I don't know how it is implemented. jEdit has a DTD generator (XML plugin), which might be written in Java (and/or XSLT?) and is open source :)

Let us know about this interesting project.

Matthieu Ricaud-Dussarget.

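To illustrate the "no guesswork" approach over a collection of files, here is a minimal sketch in Python (not the XSLT tool mentioned above). It only records, for each element name, the attributes and child elements in the order they are first seen across several instance documents. The function name `collect_structure` and the sample documents are hypothetical, chosen for illustration.

```python
import xml.etree.ElementTree as ET


def collect_structure(documents):
    """Map each element name to the attribute and child names seen for it.

    Names are kept in first-seen order; no content model is guessed.
    """
    structure = {}
    for text in documents:
        root = ET.fromstring(text)
        # iter() walks the root element and all of its descendants.
        for elem in root.iter():
            entry = structure.setdefault(
                elem.tag, {"attributes": [], "children": []}
            )
            for name in elem.attrib:
                if name not in entry["attributes"]:
                    entry["attributes"].append(name)
            for child in elem:
                if child.tag not in entry["children"]:
                    entry["children"].append(child.tag)
    return structure


# Hypothetical sample instances standing in for a collection of XML files.
samples = [
    '<book id="b1"><title>One</title></book>',
    '<book id="b2" lang="fr"><title>Two</title><author>A. N. Other</author></book>',
]
structure = collect_structure(samples)
```

The resulting dictionary could then be serialized as a skeleton DTD, W3C schema or Relax NG grammar; cardinality and data types would still have to be added by hand or by heuristics.
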
On 20/08/2010 10:22, Michael Kay wrote:
> On 20/08/2010 00:58, Cheney, Edward A SSG RES USAR USARC wrote:
>> Densil,
>>
>> I would say converting a basic XML document to a schema document is
>> not probable unless there exists a certain quantity of known information
>
>
> Actually there are a number of tools that do a quite passable job of
> generating a schema from an instance, including my own DTDGenerator
> from many years ago (still available on the Saxon page at
> Sourceforge). It demands some guesswork (or if we want to be more
> polite, heuristics) but it's possible to do a surprisingly good job.
> For example, my DTDGenerator uses rules like "generate an enumeration
> type if there are less than 20 distinct values and the number of
> actual values is at least ten times the number of distinct values". Of
> course the inferred schema will always be imperfect (it will allow
> some "invalid" documents, and disallow some "valid" ones, where
> "validity" is in the eye of the user) so it will need manual adjustment.
>
> Although there are quite a few such tools around, I'm not aware of any
> that are implemented in XSLT. But I think it would be perfectly
> reasonable to attempt to write one in XSLT.
>
> I've always thought it would be a good idea for such a tool to allow
> multiple source instances to be supplied as input. In practice I've
> handled this by concatenating them within a wrapper element.
>
> Michael Kay
> Saxonica
>
>

-- 
Matthieu Ricaud
IGS-CP
Service Livre numérique
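
The enumeration rule quoted above can be sketched as follows. This is only an illustration of the rule as stated in the message, not DTDGenerator's actual code; the function name `looks_like_enumeration` and its parameter names are assumptions.

```python
from collections import Counter


def looks_like_enumeration(values, max_distinct=20, min_ratio=10):
    """Apply the quoted heuristic to the values observed for one attribute.

    Returns True if there are fewer than `max_distinct` distinct values and
    the number of actual values is at least `min_ratio` times the number of
    distinct values.
    """
    distinct = len(Counter(values))
    if distinct == 0:
        return False  # nothing observed, nothing to infer
    return distinct < max_distinct and len(values) >= min_ratio * distinct


# Hypothetical observations collected from instance documents.
status_values = ["draft", "final"] * 15        # 30 values, 2 distinct
id_values = [str(i) for i in range(30)]        # 30 values, 30 distinct
sparse_values = ["a", "b", "a"]                # too few values to decide

status_is_enum = looks_like_enumeration(status_values)
id_is_enum = looks_like_enumeration(id_values)
sparse_is_enum = looks_like_enumeration(sparse_values)
```

With these inputs only `status_values` qualifies as an enumeration; the thresholds are exposed as parameters because, as the message says, the inferred schema will need manual adjustment anyway.
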
Received on Friday, 20 August 2010 10:50:45 UTC