re: XMLP WG Response on "SOAP and the Internal Subset" from Rich Salz on 2002-12-11 (www-tag@w3.org from December 2002)

From: Rich Salz <rsalz@datapower.com>
Date: Tue, 10 Dec 2002 21:32:10 -0500 (EST)
To: Larry Masinter <LMM@acm.org>
cc: "www-tag@w3.org" <www-tag@w3.org>, "ietf-xml-use@imc.org" <ietf-xml-use@imc.org>
Message-ID: <Pine.LNX.4.44L0.0212102121030.28307-100000@smtp.datapower.com>

I haven't digested all of your note yet, but this did immediately come to mind:

> It would be useful to define XMLP in terms of the 'canonical InfoSet':
> the Infoset of the RFC 3076 Canonical XML of the document. In
> particular, all entities are expanded and DTDs removed from the
> Canonical XML.
>

Are you really advocating a third definition of XML, putting the
XPath 1.0 model on a par with XML 1.0 and the XML Infoset?

> > Security is another concern.  Although we have not formally
> > demonstrated that XML with internal subset is less secure, several
> > members of the workgroup shared an intuition that entity
> > substitution, attribute defaulting, and other manipulation of the
> > message content was more likely to lead to security exposures,
> > denial of service attacks (e.g. the billion laughs entity attack),
> > etc.
>
> Any message from any unauthenticated source introduces the potential
> for a denial of service attack, merely from the possibilities of
> overly long URI paths, element names, attribute values, content,
> etc. When parsing any message from an unauthenticated source, it's
> necessary to ensure that parsing the message doesn't consume undue
> resources in the receiver.  The parsing and substitution of entity
> definitions is just one of many such considerations. ...

DoS is only one issue; the other -- and in my view, the more important
one -- is that my server doesn't chase down external URLs just because
someone defined an entity in a DTD.  Was that not clear from
the original note?
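
To make the point concrete, here is a minimal sketch of a receiver that
refuses any message carrying a DOCTYPE, so no entity definition (internal
or external) is ever processed. It assumes Python's standard
xml.parsers.expat module; the names DoctypeForbidden and parse_without_dtd
are hypothetical and are not taken from the SOAP spec or any SOAP toolkit.

```python
import xml.parsers.expat


class DoctypeForbidden(ValueError):
    """Raised when a message contains a DOCTYPE declaration (hypothetical name)."""


def parse_without_dtd(data: bytes) -> list[str]:
    """Parse a message, returning element names in document order.

    Any DOCTYPE declaration aborts the parse before its internal subset
    is read, so entity definitions are never expanded or dereferenced.
    """
    names: list[str] = []
    parser = xml.parsers.expat.ParserCreate()

    def on_doctype(name, system_id, public_id, has_internal_subset):
        # Called when the parser reaches the start of a DOCTYPE declaration.
        raise DoctypeForbidden("DOCTYPE declarations are not allowed")

    parser.StartDoctypeDeclHandler = on_doctype
    parser.StartElementHandler = lambda name, attrs: names.append(name)
    parser.Parse(data, True)  # True marks this as the final chunk of input
    return names
```

The exception raised inside the handler propagates out of Parse(), so the
caller sees one failure for the whole message rather than a partly
processed document.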

> things more complicated. What is the complexity cost of receivers
> ignoring processing instructions vs. explicitly checking for them and
> disallowing them?

Well, if the spec says "don't send them", then a message that includes
PIs is out of spec, and the receiver can do whatever it wants, including
acting on them.  Mandating "ignore them" seems like more work to me.
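
For comparison, a rough sketch of the two receiver policies, again
assuming Python's xml.parsers.expat; element_names, reject_pis and
ProcessingInstructionForbidden are hypothetical names used only for
illustration. With no handler installed, expat drops PIs silently; the
strict variant installs a handler that fails the parse.

```python
import xml.parsers.expat


class ProcessingInstructionForbidden(ValueError):
    """Raised when a message contains a processing instruction (hypothetical name)."""


def element_names(data: bytes, reject_pis: bool) -> list[str]:
    """Return element names in document order.

    reject_pis=False: PIs are ignored (no handler is installed).
    reject_pis=True:  the first PI encountered aborts the parse.
    """
    names: list[str] = []
    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = lambda name, attrs: names.append(name)

    if reject_pis:
        def on_pi(target, pi_data):
            # The XML declaration is not a PI and does not reach this handler.
            raise ProcessingInstructionForbidden(f"PI not allowed: {target}")

        parser.ProcessingInstructionHandler = on_pi

    parser.Parse(data, True)  # True marks this as the final chunk of input
    return names
```

In this particular parser the two policies differ by only a few lines;
how much work either one is in another implementation is not something
this sketch can show.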
        /r$

Received on Tuesday, 10 December 2002 21:32:11 UTC