Re: ID/DTD/c14n/dsig mischief?

> My immediate reaction to any omission of data is 'document closure',
> i.e. if an article of data is necessary to the interpretation of a
> document, then don't omit it.  

Yes.

> It often happens that only the application designer is capable of 
> fully assessing this.
>
> In the case of the DTD, either design the application so that the DTD is
> largely superfluous or, if it is needed, then find a way to include it
> in the signature, even if C14N excludes it.

This is the bit that worries me:  is there a danger that only a handful of 
people have the necessary competence to use this specification correctly?
In particular, my guess is that an application designer, even one with a
security background, will be hard put not to stumble into security traps
caused by XML nuances.  The guidance in the Security Considerations section
is very good, but it may have to cover a lot more territory still if we're 
to have a better than even chance of getting people to produce secure 
applications.

> One could put a Reference to the dtd file and an Object containing a 
> copy of the internal DTD subset in the signature before signing.  After 
> a core signature validation, the application could check to make sure 
> that no change of DTD occurred.

Fair enough.  I had thought of the (non-XML) signature over the external DTD
but hadn't thought of including a copy of the internal DTD as PCDATA or CDATA;
neat.  
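
For what it's worth, here is a rough sketch of the application-level half of that check, in Python purely for illustration.  It assumes the signer stored a digest of the internal DTD subset somewhere covered by the signature; the function names, the sample documents and the choice of SHA-256 are all my own inventions, and it leans on the fact that Python's xml.dom.minidom happens to expose the internal subset as a string.  It does not touch the external DTD or the signature itself.

```python
import hashlib
from xml.dom import minidom


def internal_subset_digest(xml_text):
    """Return a SHA-256 hex digest of the document's internal DTD subset."""
    doc = minidom.parseString(xml_text)
    doctype = doc.doctype
    # No DOCTYPE, or a DOCTYPE without an internal subset, hashes as empty.
    subset = (doctype.internalSubset or "") if doctype is not None else ""
    return hashlib.sha256(subset.encode("utf-8")).hexdigest()


def dtd_unchanged(signed_digest, received_xml):
    """Hypothetical check to run after core signature validation succeeds."""
    return internal_subset_digest(received_xml) == signed_digest


# Illustrative documents: same element content, different ATTLIST.
SIGNED = ('<!DOCTYPE doc [<!ATTLIST item ref ID #IMPLIED>]>'
          '<doc><item ref="a1"/></doc>')
TAMPERED = ('<!DOCTYPE doc [<!ATTLIST item other ID #IMPLIED>]>'
            '<doc><item ref="a1"/></doc>')

if __name__ == "__main__":
    digest = internal_subset_digest(SIGNED)
    print(dtd_unchanged(digest, SIGNED))
    print(dtd_unchanged(digest, TAMPERED))
```

The comparison is on the raw text of the subset, so even a harmless reformatting of the DTD would count as a change; whether that is too strict is an application decision.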

However, doing this requires going well outside what DOM 2 provides, and 
maybe even outside what SAX 2 provides, which seems to go against the efforts
in dsig and particularly c14n to use information that is preserved by SAX and
DOM.

[I've just had the pleasure of writing the character-munging code to copy the
DTD from the original version to the edited version of a document that is
being massaged by SAX or DOM;  all I have to say is bah humbug].

I'm not sure about the feasibility of this:

> either design the application so that the DTD is largely superfluous [...]

because ID attributes are awfully useful if the graph is anything more fancy
than a tree.

Can we state a sufficient set of conditions for a language/DTD to be safe?
(I'm not sure that _I_ can -- the ID-attribute hack was the obvious one 
that occurred to me, but are there more subtle possibilities involving e.g. 
whitespace normalization in different attribute types?).
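
To make the whitespace worry concrete, here is a small sketch (Python, using the expat-backed ElementTree parser; the element and attribute names are made up) in which the same attribute text is reported differently depending on whether an ATTLIST declares it as NMTOKENS or leaves it as the default CDATA.  It illustrates the general hazard only; I am not claiming it is an attack on any particular application.

```python
import xml.etree.ElementTree as ET

# Same element and attribute text in both documents; only the DTD differs.
BODY = '<e a="  x   y "/>'
WITHOUT_DTD = BODY
WITH_DTD = '<!DOCTYPE e [<!ATTLIST e a NMTOKENS #IMPLIED>]>' + BODY


def attr_as_seen(xml_text):
    """Return the value of attribute 'a' as the parser reports it."""
    return ET.fromstring(xml_text).get("a")


if __name__ == "__main__":
    # Undeclared attributes are treated as CDATA: spaces are preserved.
    print(repr(attr_as_seen(WITHOUT_DTD)))
    # Declared as NMTOKENS: leading/trailing spaces are dropped and
    # internal runs of spaces collapse to a single space.
    print(repr(attr_as_seen(WITH_DTD)))
```

So a signer and a verifier that see different attribute declarations can disagree about the attribute value even though the element's bytes are identical.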

I've also been wondering about XML Schema.  In some ways the situation is a
lot better than with DTDs, because 

    (a)	it's easy to sign schema info because it is expressed as XML elements 
	(hence visible in SAX and DOM), and 

    (b)	xsi:schemaLocation attributes are a hint that a schema processor is 
	free (even encouraged?) to ignore.  

On the other hand, it seems as though you could have some fun if you knew that 
the signer's schema processor did honour xsi:schemaLocation and the verifier's 
schema processor did not, or vice versa.
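
One conceivable way for a verifier to at least notice the mismatch is to collect the xsi:schemaLocation hints and compare them with the schema locations it already trusts.  The sketch below (Python; the namespace URIs, URLs and the idea of a "pinned" table are my own assumptions, not anything in the dsig or XML Schema specs) only reports the differences and does no schema processing at all.

```python
import xml.etree.ElementTree as ET

# Attribute name in ElementTree's {namespace}local notation.
XSI_SCHEMA_LOCATION = (
    "{http://www.w3.org/2001/XMLSchema-instance}schemaLocation"
)


def schema_hints(xml_text):
    """Collect (namespace, location) pairs from every xsi:schemaLocation."""
    hints = []
    for elem in ET.fromstring(xml_text).iter():
        # The attribute value is a whitespace-separated list of URI pairs.
        tokens = elem.get(XSI_SCHEMA_LOCATION, "").split()
        hints.extend(zip(tokens[0::2], tokens[1::2]))
    return hints


def unexpected_hints(xml_text, pinned):
    """Return hints that disagree with the verifier's pinned locations."""
    return [(ns, loc) for ns, loc in schema_hints(xml_text)
            if pinned.get(ns) != loc]


# Hypothetical document and pinned table, for illustration only.
DOC = ('<po xmlns="urn:example:po" '
       'xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" '
       'xsi:schemaLocation="urn:example:po http://attacker.example/po.xsd"/>')
PINNED = {"urn:example:po": "http://signer.example/po.xsd"}

if __name__ == "__main__":
    print(unexpected_hints(DOC, PINNED))
```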

Thomas Maslen
DSTC

Received on Friday, 13 July 2001 04:25:41 UTC