- From: Gregor <iamgregor@gmail.com>
- Date: 09 May 2005 11:50:49 -0600
- To: www-xml-schema-comments@w3.org
I do not understand the nature of the XML Schema language.

"...The approach followed here follows the best practices currently used in the programming languages community, although somewhat adapted for XML. The hallmark of this approach is the use of context free grammars to provide syntactic checking and the use of inference rules to provide the semantics associated with each piece of syntax. This means there is, essentially, one inference rule per context free grammar production. This set of inference rules is not intended to be in any way minimal, but it is helpful from both a pedagogical and implementation standpoint - for each syntactic construct it is straightforward to identify its underlying semantics."
http://www.w3.org/TR/2001/WD-xmlschema-formal-20010320/

I think I understood the part about it being a context free grammar. To my understanding, this is shown by the production rules in the recommendation, as in:

<element>...
  Content: (annotation?, ((simpleType | complexType)?, (unique | key | keyref)*))
</element>

Ignoring the attributes, one could write:

element ::= (annotation?, ((simpleType | complexType)?, (unique | key | keyref)*))

>This means there is, essentially, one inference rule per context free grammar production.

This is the part I no longer understand, because in Chapter 3.3 of the recommendation I can find many rules restricting the above production, e.g. 3.3.4, 3.1:

If {nillable} is false, then there must be no attribute information item among the element information item's [attributes] whose [namespace name] is identical to http://www.w3.org/2001/XMLSchema-instance and whose [local name] is nil.

One rule refers to the context of the element, 3.3.3, 2:

If the item's parent is not <schema>, then all of the following must be true:

Rules of this kind refer not to the right-hand side of the production (some configuration of attributes and elements) but to the context in which the left-hand side appears. In this case:

schema ::= ((include | import | redefine | annotation)*, (((simpleType | complexType | group | attributeGroup) | element | attribute | notation), annotation*)*)

So if the element was taken from here, a different rule applies to it than if it comes from this production:

sequence ::= (annotation?, (element | group | choice | sequence | any)*)

So I assume the whole language is no longer context free and therefore no longer of type 2 in the Chomsky hierarchy (context free grammars). Please comment!!

The problem I have with not knowing how to classify the XML Schema language is the following. I am currently testing parsers for their compliance with certain parts of the XML Schema recommendation. I would assume (feel free to correct me) that if XML Schema were truly context free, a parser that can correctly validate against one production rule (say: sequence ::= (annotation?, (element | group | choice | sequence | any)*) ) could do so no matter where in an XML Schema schema that production occurs.

If, on the other hand, there is a finite set of additional rules (dealing with context or content) attached to the production rules, could one classify those rules so as to show, using a finite number of test cases, that a parser validates XML instances correctly against one (or all) production rules in any kind of context?

Another question in the same direction: how does the concept of a context free language deal with the problem that element can be a terminal (e.g. within a sequence) and a non-terminal (e.g. when containing a complexType)?

Thank you for your help,
Gregor
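
To make the distinction in the question concrete, here is a minimal sketch in Python that separates the two kinds of check. The content models are the element and sequence productions quoted above, encoded as regular expressions over child names. The context rule is a simplified, hypothetical stand-in loosely modeled on clause 2 of 3.3.3 (an element whose parent is not <schema> carries exactly one of name or ref); it is not the full rule from the recommendation. The function names content_ok, context_ok and valid are assumptions made for illustration only.

```python
import re

# Content models from the productions quoted in the message, written as
# regexes over a string of child names, each followed by one space.
CONTENT_MODELS = {
    "element": r"(annotation )?((simpleType |complexType )?(unique |key |keyref )*)",
    "sequence": r"(annotation )?(element |group |choice |sequence |any )*",
}


def content_ok(name, children):
    """Check only the right-hand side of the production for `name`."""
    word = "".join(child + " " for child in children)
    return re.fullmatch(CONTENT_MODELS[name], word) is not None


def context_ok(name, attrs, parent):
    """Check a rule that depends on where the item occurs.

    Hypothetical simplification: outside <schema>, an element must have
    exactly one of the attributes 'name' and 'ref'.
    """
    if name == "element" and parent != "schema":
        return ("name" in attrs) != ("ref" in attrs)
    return True


def valid(name, attrs, children, parent):
    """An item passes only if both the content and the context check pass."""
    return content_ok(name, children) and context_ok(name, attrs, parent)
```

With this split, the same element item with both name and ref passes content_ok everywhere, while valid gives a different answer depending on the parent argument, which is the situation the question describes.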
Received on Monday, 9 May 2005 17:51:16 UTC