Re: Algorithm for identifying the type of a given element

From: Francis Norton <francis@redrice.com>
Date: Thu, 11 Jul 2002 16:00:28 +0100
Message-ID: <3D2D9D8C.1030702@redrice.com>
To: Martin Bernauer <bernauer@dke.uni-linz.ac.at>
CC: xmlschema-dev@w3.org

See http://www.schemavalid.com/utils/typeTagger.zip for one approach.
It's not complete, but it uses XSLT to process a schema and build a
transform that will identify the type of elements in conforming instance
documents.

Basically, the key to the algorithm is the fact that no two global
elements may have the same qualified name but different types, and
likewise for any two elements within a single content model. Given this
start, you can define a Finite State Machine that tracks the type of all
named elements, starting from the instance root (which must, of course,
be one of the global elements). This is exactly what TypeTagger does,
implementing the states as XSLT modes.
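
To make the state-tracking idea concrete, here is a minimal sketch in Python rather than XSLT. It is not TypeTagger's code: the GLOBAL_ELEMENTS and CONTENT_MODELS tables are a hypothetical, hand-written summary of a schema (a real tool would derive them from the XSD), and the sketch assumes no namespaces, no xsi:type overrides, no substitution groups and no wildcards. The current type plays the role of the state, and each child element name selects the next state.

```python
import xml.etree.ElementTree as ET

# Hypothetical schema summary (an assumption for illustration, not parsed from an XSD).
# Global element name -> type name.
GLOBAL_ELEMENTS = {"order": "OrderType"}

# Type name -> {child element name -> child type name}.
# Within one content model each name maps to exactly one type, which is
# what makes the lookup deterministic. Note that "note" has a different
# type depending on the enclosing type.
CONTENT_MODELS = {
    "OrderType": {"item": "ItemType", "note": "xs:string"},
    "ItemType": {"name": "xs:string", "note": "NoteType"},
    "NoteType": {},
}


def tag_types(root):
    """Return (path, type name) pairs for root and all its descendants.

    The type is None for elements not declared in the enclosing type.
    """
    if root.tag not in GLOBAL_ELEMENTS:
        raise ValueError("root element %r is not a global element" % root.tag)

    result = []

    def walk(elem, type_name, path):
        result.append((path, type_name))
        # The current type is the state; children are looked up in its content model.
        children = CONTENT_MODELS.get(type_name, {})
        for child in elem:
            walk(child, children.get(child.tag), path + "/" + child.tag)

    walk(root, GLOBAL_ELEMENTS[root.tag], "/" + root.tag)
    return result


if __name__ == "__main__":
    doc = ET.fromstring(
        "<order><note>rush</note><item><name>x</name><note/></item></order>"
    )
    for path, type_name in tag_types(doc):
        print(path, type_name)
```

The two "note" elements in the sample come out with different types because the lookup depends on the enclosing type, not on the element name alone.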

This kind of question seems to be occurring often enough that it might 
be worth chucking TypeTagger onto SourceForge - would anyone be interested?

Francis.

Martin Bernauer wrote:

>When given 
>
>a) an XML Schema document S, and 
>b) an XML document plus an element E therein (e.g., by an XPath
>expression),
>
>does anybody know an algorithm to determine the complex type of E as
>defined by S? Or, if E does not have a "directly according" complex
>type, to determine the complex type of the "nearest" ancestor of E that
>has a directly according complex type? Or any algorithm that partly
>deals with this problem?
>
>I guess schema validators might implement such functionality, though I'm
>not sure whether they provide an interface that can be used from
>external programs. Any hints?
>
>Martin

-- 
"Never mind manoeuvre, go straight at 'em." - Admiral Horatio Nelson
Received on Thursday, 11 July 2002 11:03:55 GMT
