Re: XML validity and namespaces

Rich Salz writes:

> XML validity is important, and perhaps should be separated from DTD's.

I have commented a few times on what I take to be one of the great ironies 
of XML:  The "Extensible" Markup Language is very extensible with respect 
to constructs such as attributes and elements, but very inflexible in 
(not) allowing for replacement or evolution of its core mechanisms such as 
DTDs. 

As we discovered when we did XML Schema, the problem extends somewhat 
beyond the definition of validity. Per the XML Recommendation, only DTDs 
can be used to define the entities that resolve &XXXX; references. To use 
Schema to define such entities, we would have had to introduce a 
completely different non-XML processing model (i.e., to do the 
resolutions before XML could see them), and we would in that respect have 
been nonconformant with XML 1.0. Of course, proper resolution of 
internally defined entities is a prerequisite to validity checking.
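
To make the entity point concrete, here is a minimal Python sketch using 
the standard library's xml.etree.ElementTree. The document strings, the 
entity name "title" and the helper resolve() are made up for 
illustration. It only shows that an entity reference resolves when the 
internal DTD subset declares it, and that the same reference is rejected 
when nothing declares it.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the entity "title" is declared in the internal DTD subset.
WITH_DTD = """<!DOCTYPE song [
  <!ENTITY title "Example Song">
]>
<song>&title;</song>"""

# Same reference, but nothing declares the entity.
WITHOUT_DTD = "<song>&title;</song>"


def resolve(document):
    """Return the root element's text, or None if the parser rejects the document."""
    try:
        return ET.fromstring(document).text
    except ET.ParseError:
        return None


resolved = resolve(WITH_DTD)
unresolved = resolve(WITHOUT_DTD)
print(resolved)
print(unresolved)
```

The declaration has to arrive through DOCTYPE syntax; there is no hook in 
this parser for supplying the same definition from a schema.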

Also, syntactically, only a DTD can be used as an internal subset.    All 
attempts I've seen to inline schemas into the instance wind up with the 
schema as part of the element tree, where it doesn't belong.  Amusingly, 
this means that if the schema is for the whole document, it has to 
validate its own existence!

E.g.

        <song xmlns="musicURI">
                <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
                            targetNamespace="musicURI">
                        ...schema here...
                        ...has to tolerate <xsd:schema> as child of
                        ...<song>
                </xsd:schema>
                <verse>....</verse>
                <verse>...</verse>
                <chorus>...</chorus>
        </song>

The schema appears to be part of the song. To get around this, a few 
systems today support special-purpose constructs along the lines of:

        <containerKludge>
                <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
                            targetNamespace="musicURI">
                        ...schema for song element here...
                </xsd:schema>
                <song xmlns="musicURI">
                        <verse>....</verse>
                        <verse>...</verse>
                        <chorus>...</chorus>
                </song>
        </containerKludge>

which turns the whole document into a container rather than a song.
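
The difference between the two layouts is easy to see in the parsed 
tree. The following is a rough Python sketch with the standard library's 
xml.etree.ElementTree; the empty schema elements, the "la" placeholder 
text and the helper local_names() are assumptions made to keep the 
example short.

```python
import xml.etree.ElementTree as ET

XSD = "http://www.w3.org/2001/XMLSchema"

# The inlined form: the schema sits inside the <song> it describes.
INLINED = f"""<song xmlns="musicURI" xmlns:xsd="{XSD}">
  <xsd:schema targetNamespace="musicURI"/>
  <verse>la</verse>
  <verse>la</verse>
  <chorus>la la</chorus>
</song>"""

# The wrapper form: the schema and the song are siblings.
WRAPPED = f"""<containerKludge xmlns:xsd="{XSD}">
  <xsd:schema targetNamespace="musicURI"/>
  <song xmlns="musicURI">
    <verse>la</verse>
    <verse>la</verse>
    <chorus>la la</chorus>
  </song>
</containerKludge>"""


def local_names(element):
    """Local names of an element's children, namespace part stripped."""
    return [child.tag.rsplit("}", 1)[-1] for child in element]


inlined_root = ET.fromstring(INLINED)
wrapped_root = ET.fromstring(WRAPPED)

print(inlined_root.tag, local_names(inlined_root))
print(wrapped_root.tag, local_names(wrapped_root))
```

In the first tree the schema element shows up as a child of the song, 
next to the verses. In the second the song is clean, but the document 
element is no longer the song.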

So yes, DTDs are baked all too deeply into XML. The definition of 
validity is just one aspect of that, I think.

--------------------------------------
Noah Mendelsohn 
IBM Corporation
One Rogers Street
Cambridge, MA 02142
1-617-693-4036
--------------------------------------


Rich Salz <rsalz@datapower.com>
Sent by: www-tag-request@w3.org
04/03/05 02:25 PM

 
        To:     www-tag@w3.org
        cc:     (bcc: Noah Mendelsohn/Cambridge/IBM)
        Subject:        XML validity and namespaces



DTD's are ignorant of namespaces.  This means that you cannot write a
normative DTD for a namespace (as you might do with XML Schema). The
best you can hope to do is write something for expository purposes,
using particular namespace prefixes as an example.  Each instance of
a document would then have to rewrite the DTD to use the namespace
prefixes that are used in the document.

If the document uses elements from multiple namespaces, however,
then even if you can collect all the DTDs and rewrite them to map
the prefixes used in a particular document instance, you can't do
this for every case, because a document can rebind the same prefix
to a different namespace partway down the tree.  Viz:

    <tns:Foo xmlns:tns="http://example.com/1999">
        <tns:Foo xmlns:tns="http://example.com/2002">
            content
        </tns:Foo>
    </tns:Foo>
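
As an illustration only, a namespace-aware parser such as the Python 
standard library's xml.etree.ElementTree reports two different expanded 
names for the elements above, although both are spelled tns:Foo, which 
is the only name a DTD can see. The variable names here are 
illustrative.

```python
import xml.etree.ElementTree as ET

DOC = """<tns:Foo xmlns:tns="http://example.com/1999">
    <tns:Foo xmlns:tns="http://example.com/2002">
        content
    </tns:Foo>
</tns:Foo>"""

outer = ET.fromstring(DOC)
inner = outer[0]

# ElementTree reports expanded names in "{namespace}local" form.
print(outer.tag)
print(inner.tag)
```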

It seems to me, then, that DTDs are not useful, and maybe not
even possible, for XML standards or documents that use namespaces.
The problem with this is that XML validity requires a DTD (see [1]).
XML Schema, in a roundabout way, enforces ID attribute uniqueness,
but only for that part of the document that is being validated;
if the schema does not start at the document root, there is no
guarantee.  The desire for composability means comparatively few
schemas (at least horizontal ones, such as developed by standards
organizations) will cover the entire document.  XML Schema may
also enforce other aspects of XML validity; I am not familiar enough
with the specs to say.

XML validity is important, and perhaps should be separated from DTD's.

        /r$

[1] http://www.w3.org/TR/2004/REC-xml-20040204/#dt-valid

-- 
Rich Salz                  Chief Security Architect
DataPower Technology       http://www.datapower.com
XS40 XML Security Gateway  http://www.datapower.com/products/xs40.html

Received on Sunday, 3 April 2005 18:55:10 UTC