W3C home > Mailing lists > Public > xmlschema-dev@w3.org > September 2000

Re: Representing LALR(1) grammars in a schema?

From: Henry S. Thompson <ht@cogsci.ed.ac.uk>
Date: 11 Sep 2000 09:21:19 +0100
To: Jeff McCarrell <jwm@emptech.com>
Cc: xmlschema-dev@w3c.org
Message-ID: <f5bvgw349vk.fsf@cogsci.ed.ac.uk>
Jeff McCarrell <jwm@emptech.com> writes:

> Hi.  I need to describe a little language in XML, and I want to write a
> schema to validate the XML stream.  The language is described by
> a grammar, and it is embedded in a bunch of other stuff.
> 
> My (quick) reading of the XML Primer leads me to believe that the
> most complex class of things that can be described by a schema
> are regular expressions, and that it simply isn't possible to translate
> my grammar into a schema.

Schemas have a grammatical expressiveness very similar to that of
XML DTDs.  Depending on how you look at it, this is close to Context
Free, or close to Regular.

1) Context free, because if you ignore the non-terminals, the
languages of the pre-terminals is more than regular:

  <complexType name='bpt'>
   <element name='leftParen'/>
   <element name='bp' type='bpt' minOccurs='0' maxOccurs='1'/>
   <element name='rightParen'/>
  </complexType>

accepts balanced pre-terminal strings of the form

 <leftParen/>n<rightParen/>n

which is not a regular language.

2) Regular, because if you _don't_ ignore the non-terminals, you get a 
regular language at any particular depth in the XML tree (not
surprising, really).

Hope this helps.

ht

[Note this list is really for implementation issues, comments about
the XML Schema language as such should go to www-xml-schema-comments@w3.org]
-- 
  Henry S. Thompson, HCRC Language Technology Group, University of Edinburgh
          W3C Fellow 1999--2001, part-time member of W3C Team
     2 Buccleuch Place, Edinburgh EH8 9LW, SCOTLAND -- (44) 131 650-4440
	    Fax: (44) 131 650-4587, e-mail: ht@cogsci.ed.ac.uk
		     URL: http://www.ltg.ed.ac.uk/~ht/
Received on Monday, 11 September 2000 04:21:23 GMT

This archive was generated by hypermail 2.2.0+W3C-0.50 : Tuesday, 11 January 2011 00:14:18 GMT