Re: Problem with Element Definition and XML CDATA

From: Jeni Tennison <jeni@jenitennison.com>
Date: Thu, 9 Jan 2003 10:04:22 +0000
Message-ID: <182117054.20030109100422@jenitennison.com>
To: Matthew Jaquish <mjaquish@cisco.com>
CC: xmlschema-dev@w3.org

Hi Matthew,

> Can I define an element to contain a string OR a CDATA section, and
> not contain an empty string? In a schema, is there any way to allow
> an element to contain CDATA? I couldn't find it in the
> specification.

Schema validation works over the XML Information Set (Infoset). The
Infoset defines what's significant in an XML document and what isn't.
One of the things it treats as insignificant is whether text was
written inside a CDATA section. So any process that works over the
Infoset (including schema validation) sees no difference between text
in a CDATA section and the same text outside one. In other words, to a
schema validator:

  <description>some text or numbers 1234</description>
  <description><![CDATA[some text or numbers 1234]]></description>

are seen as *exactly* the same document.
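
You can check this equivalence with any parser that reports character
data without CDATA markers. Here is a minimal sketch using Python's
standard xml.etree.ElementTree module (my choice for illustration, not
something your validator necessarily uses); the variable names are
made up for the example:

```python
import xml.etree.ElementTree as ET

# The two serializations from the example above.
plain = "<description>some text or numbers 1234</description>"
cdata = "<description><![CDATA[some text or numbers 1234]]></description>"

# ElementTree exposes only the character data, not how it was written.
plain_text = ET.fromstring(plain).text
cdata_text = ET.fromstring(cdata).text

# Both elements carry the same non-empty string content.
same_content = plain_text == cdata_text
print(same_content, len(cdata_text) >= 1)
```

A minLength of 1 on a string type should therefore be satisfied by
either form.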

> My problem: When I use an element definition and set it to a string
> datatype with a minimum length of 1, I get a validation error when
> it contains a CDATA section. I want to allow a string or a CDATA
> section as valid content for the same element, and I do not want to
> allow it to be blank.

Whatever validator you're using (which one is it? - name and shame!)
is buggy. Probably it's using a DOM (which does preserve CDATA
sections) and not interpreting that DOM correctly. Does it have
problems with entity references too? Send a bug report to the
implementer.
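
To illustrate the kind of mistake I suspect (this is only a guess at
the bug, not a description of your validator), here is a sketch using
Python's xml.dom.minidom, which keeps CDATA sections as separate
nodes. The functions naive_value and infoset_value are hypothetical
names invented for the example:

```python
from xml.dom import Node, minidom

doc = minidom.parseString(
    "<description><![CDATA[some text or numbers 1234]]></description>"
)
element = doc.documentElement


def naive_value(el):
    # Suspected bug: only plain text nodes are counted, so content
    # held in a CDATA section node is silently ignored.
    return "".join(
        child.data for child in el.childNodes
        if child.nodeType == Node.TEXT_NODE
    )


def infoset_value(el):
    # Correct reading: text nodes and CDATA section nodes both
    # contribute character data to the element's content.
    return "".join(
        child.data for child in el.childNodes
        if child.nodeType in (Node.TEXT_NODE, Node.CDATA_SECTION_NODE)
    )


print(repr(naive_value(element)))
print(repr(infoset_value(element)))
```

A validator that computes something like naive_value would see an
empty string here and wrongly report a minLength violation.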

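In the meantime, you can use the following transformation to get rid
of the CDATA section (and any entity references), and so enable your
schema validation to work. XSLT's data model has no CDATA sections or
entity references, so simply copying the document is enough: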

<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:template match="/">
  <xsl:copy-of select="." />
</xsl:template>

</xsl:stylesheet>
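
If you don't have an XSLT processor to hand, parsing and
re-serializing the document with a parser that discards CDATA markers
should have a similar effect. A rough sketch with Python's
xml.etree.ElementTree follows, assuming it's acceptable that this
module drops comments and processing instructions by default; the
sample content is invented for the example:

```python
import xml.etree.ElementTree as ET

# Sample input with markup-significant characters inside CDATA.
source = "<description><![CDATA[text with < and & in it]]></description>"

# Parsing keeps only the character data; serializing writes it back
# out as ordinary escaped text with no CDATA section.
normalized = ET.tostring(ET.fromstring(source), encoding="unicode")
print(normalized)
```

The content is unchanged; only the way it is written differs.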

Cheers,

Jeni

---
Jeni Tennison
http://www.jenitennison.com/
Received on Thursday, 9 January 2003 05:04:33 GMT
