RE: How to use DTDs, or not to (was: RE: ACL and lockdiscovery) from Julian Reschke on 2003-10-11 (w3c-dist-auth@w3.org from October to December 2003)

From: Julian Reschke <julian.reschke@gmx.de>
Date: Sat, 11 Oct 2003 13:04:03 +0200
To: <dennis.hamilton@acm.org>, <w3c-dist-auth@w3.org>
Message-ID: <JIEGINCHMLABHJBIGKBCCEILIMAA.julian.reschke@gmx.de>
Comments inline...

> From: w3c-dist-auth-request@w3.org
> [mailto:w3c-dist-auth-request@w3.org]On Behalf Of Dennis E. Hamilton
> Sent: Friday, October 10, 2003 10:45 PM
> To: Julian Reschke; Lisa Dusseault; 'Geoffrey M Clemm';
> w3c-dist-auth@w3.org
> Subject: RE: How to use DTDs, or not to (was: RE: ACL and lockdiscovery)
>
> ..
>
> 1.1	I find it better to refer to the "DAV namespace" or "DAV
> namespace URI" rather than the "DAV: namespace" especially
> because namespaces are often versioned (rather than revised [;<),
> so the identification in XML will be with different URIs over
> time.  Ditto for the lock-token namespace.
> ..

Well. There are no plans to change the namespace URI for DAV:. Doing that
would be an incompatible change breaking all existing code (see for instance
how XSLT deals with versioning).

> 1.2	In the context of this discussion, the DAV namespace is not
> a barrier to having a DTD.  It is prefixes (that is QNames) that
> must be dealt with carefully when a DTD is used.  However, this
> is completely possible, as the XML Schema folk demonstrated.

Is it worth the effort? Can you supply (a link to) an example?

> 2.	You have your finger on a critical and important point.
>
> 2.1	It is not possible to create a generic DTD that can be used
> for validation of any DAV XML 1.0 "document".  That's
> specifically because of the RDF hack and the ability to ad lib
> property QNames.  (I call anything where there is an application
> technique that maps QNames to URIs as the RDF hack, since RDF
> seems to have been the first to do it. This creates a variety of
> problems, some of which are noted in a recent review of the
> latest RDF working documents.  See
> <http://lists.w3.org/Archives/Public/www-rdf-comments/2003OctDec/0017.html>.
> I don't want to go deeper into this, though it has
> become an interoperability concern because of a problem
> concerning when the special application knowledge has to be
> applied to handle an RDF (or any RDF hack) correctly. [Aside:
> There is now an approach to (lexical) datatypes in the latest RDF
> working documents that could well be adapted to DAV property
> values at some point.]

I'm not sure where the hack is. In WebDAV, property names are identified by
namespaced XML names (that is, as a pair of namespace URI and local name),
not as URIs.
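
A minimal sketch of that point, assuming Python's standard
xml.etree.ElementTree as the parser (my choice for illustration, nothing the
spec prescribes). The two sample documents and the helper name
expanded_names are hypothetical; they only show that the prefix drops out and
the (namespace URI, local name) pair is what identifies a property.

```python
import xml.etree.ElementTree as ET

# Two serializations of the same property name; only the prefix differs.
DOC_A = '<D:prop xmlns:D="DAV:"><D:getetag/></D:prop>'
DOC_B = '<x:prop xmlns:x="DAV:"><x:getetag/></x:prop>'


def expanded_names(xml_text):
    """Return a (namespace URI, local name) pair for every element."""
    names = []
    for elem in ET.fromstring(xml_text).iter():
        tag = elem.tag
        if tag.startswith("{"):
            # ElementTree reports names in Clark notation: "{uri}local".
            uri, _, local = tag[1:].partition("}")
        else:
            # Element is in no namespace.
            uri, local = "", tag
        names.append((uri, local))
    return names


if __name__ == "__main__":
    print(expanded_names(DOC_A))
    print(expanded_names(DOC_B))
```

Both calls yield the same pairs, and no URI is ever built by gluing the
namespace URI and the local name together.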

> ...
>
> 3.2	What I don't find in WebDAV (I haven't looked at 2518bis)
> is any clear specification of the relationship of DAV request
> bodies, DAV response bodies, and, for that matter, DAV
> documents-as-content, as an application of XML 1.0.

Some WebDAV methods use XML documents as request/response bodies. There is
no notion of DAV-documents-as-content. That's it.

> 3.3	In the examples in the DAV specification, an XML
> declaration (<?xml version="1.0" ...?>) is always present.
> Although not required by the XML 1.0 Specification, this is
> tantamount to a declaration that an XML 1.0 document is present
> (or something equivalent to an XML external entity).  This is a
> strong statement with regard to what one can count on:

Actually the content types we use indicate that as well, therefore the XML
declaration is not necessary. However, there are clients/servers that fail
(or did fail) in the absence of the XML declaration.
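
As a small illustration, assuming Python's xml.etree.ElementTree as a stand-in
for a conforming parser: the same PROPFIND body parses identically with and
without the XML declaration. The sample bodies and the helper name
root_and_children are hypothetical, and the sketch says nothing about the
specific implementations that had trouble.

```python
import xml.etree.ElementTree as ET

# The same request body, once with and once without an XML declaration.
WITH_DECL = (
    b'<?xml version="1.0" encoding="utf-8"?>\n'
    b'<D:propfind xmlns:D="DAV:"><D:allprop/></D:propfind>'
)
WITHOUT_DECL = b'<D:propfind xmlns:D="DAV:"><D:allprop/></D:propfind>'


def root_and_children(body):
    """Parse a body and return the root tag plus its child tags."""
    root = ET.fromstring(body)
    return root.tag, [child.tag for child in root]


if __name__ == "__main__":
    print(root_and_children(WITH_DECL))
    print(root_and_children(WITHOUT_DECL))
```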

> ...
>
> 3.4	An example of (3.2) is the fact that it is not clear that
> the XML declaration is required for DAV XML documents and whether

As the spec doesn't say that it is required, it isn't.

> a Document Type Declaration is forbidden (as it is for SOAP, if I
> recall correctly).  I can't find where the DAV specification

As the spec doesn't say it's forbidden, it's allowed. However, it does say
that servers/clients must not attempt to validate the documents.
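
A sketch of what "allowed but not validated" looks like in practice, assuming
Python's xml.etree.ElementTree, which is a non-validating parser. The body
below is a made-up example: its internal DTD subset says D:propfind may only
contain D:allprop, yet the document uses D:propname. A non-validating parser
checks well-formedness only and accepts it.

```python
import xml.etree.ElementTree as ET

# Hypothetical body: well-formed, but invalid against its own DOCTYPE.
BODY = b"""<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE D:propfind [
  <!ELEMENT D:propfind (D:allprop)>
  <!ELEMENT D:allprop EMPTY>
]>
<D:propfind xmlns:D="DAV:"><D:propname/></D:propfind>"""


def child_tags(body):
    """Parse without validating and return the root's child tags."""
    root = ET.fromstring(body)
    return [child.tag for child in root]


if __name__ == "__main__":
    # No error is raised although the content model is violated.
    print(child_tags(BODY))
```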

> ...
>
> 	3.5.1 These are out-of-band agreements that are not part of
> XML 1.0 compliance but that impact how the DAV use of XML is to
> be interpreted.  An important consideration is whether the
> application is presumed to follow the basic XML processing.  That
> is, the application could be viewed as fed by an XML processor
> that is designed to operate in an application-ignorant manner.

Of course. Section 8 states:

"The following new HTTP methods use XML as a request and response format.
All DAV compliant clients and resources MUST use XML parsers that are
compliant with [REC-XML]. All XML used in either requests or responses MUST
be, at minimum, well formed. If a server receives ill-formed XML in a
request it MUST reject the entire request with a 400 (Bad Request). If a
client receives ill-formed XML in a response then it MUST NOT assume
anything about the outcome of the executed method and SHOULD treat the
server as malfunctioning."
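
The server-side rule in that paragraph can be sketched as follows, assuming
Python's xml.etree.ElementTree as the parser. The function name
parse_or_reject and its (status, root) return shape are hypothetical and not
taken from any real server.

```python
import xml.etree.ElementTree as ET


def parse_or_reject(body):
    """Return (None, root) for well-formed XML, or (400, None) otherwise.

    Only well-formedness is checked; no validation is attempted.
    """
    try:
        root = ET.fromstring(body)
    except ET.ParseError:
        # Ill-formed request body: reject the entire request.
        return 400, None
    return None, root


if __name__ == "__main__":
    good = b'<D:propfind xmlns:D="DAV:"><D:allprop/></D:propfind>'
    bad = b'<D:propfind xmlns:D="DAV:"><D:allprop></D:propfind>'
    print(parse_or_reject(good))
    print(parse_or_reject(bad))
```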

> 	3.5.2 There is some application context of a transport
> nature that applies even to the XML 1.0 embedding in HTTP header
> and response bodies.  This is a problem that SOAP has addressed
> (I am not sure "dealt with" holds, though) and the SOAP HTTP
> binding and the WS-I Base Profile 1.0a are worthy of review with
> regard to coherence between HTTP, MIME types, and DAV XML 1.0
> bodies.  (I keep thinking there is a mistake in the WS-I
> conclusion about byte-order marks, but it is more valuable to
> have a standard that can be consistently followed than be right
> about that.  There is a misreading of the XML 1.0 specification
> about this, and it has been perpetuated in the folklore around SOAP.)

I'm not sure to what you're referring here...

> 	3.5.3 Side Note: There is no prohibition on unusual
> encodings for XML documents.  The DAV specification goes too far
> in presuming there is a limitation.  There is a requirement that

Where...?

> a processor always support utf-8 and utf-16.  In the absence of
> an encoding declaration [in the XML declaration or from context]
> it must be one or the other.  Note that the default encoding for
> MIME content type text/xml is ASCII. Note further that
> encoding="ASCII" does not entail encoding="utf-8".  And all of
> the DAV specification examples do match up encodings properly.
> Whatever the encoding methodology, XML 1.0 documents are always
> taken to be expressed in Unicode:  That is, the abstraction of
> the character stream out of the medium is always Unicode.  This
> means that encoding characters that have no counterpart in (the
> XML-specified subset of) Unicode may not be used.  This is

...such as control characters other than CR, LF and TAB...

> important with regard to XML 1.0 processors which should probably
> only see Unicode, on the way in and on the way out.  It would
> seem that there is a minimum subset of Unicode that any encoding
> must have corresponding character codes for in order to carry
> arbitrary XML 1.0 documents at all.

An encoding that can carry arbitrary XML documents must support all Unicode
characters allowed in XML names (as those can't be escaped).
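
To illustrate why names are the limiting case, here is a sketch assuming
Python's xml.etree.ElementTree and made-up sample documents declared as
US-ASCII. In character data, a character missing from the encoding can be
written as a character reference; in an element name there is no such escape,
so the same trick is not well-formed.

```python
import xml.etree.ElementTree as ET

# U+00E9 is not in US-ASCII; in content it can be written as &#233;.
CONTENT_ESCAPED = b'<?xml version="1.0" encoding="us-ascii"?><name>caf&#233;</name>'
# The same reference inside an element name is not allowed by the grammar.
NAME_ESCAPED = b'<?xml version="1.0" encoding="us-ascii"?><caf&#233;/>'


def is_well_formed(xml_bytes):
    """Return True if the parser accepts the document."""
    try:
        ET.fromstring(xml_bytes)
    except ET.ParseError:
        return False
    return True


if __name__ == "__main__":
    print(is_well_formed(CONTENT_ESCAPED))
    print(is_well_formed(NAME_ESCAPED))
```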

> 	3.5.4 Because DAV XML 1.0 is an application of XML, it is
> wise to consider that all XML well-formedness and any DTD
> validation (if invited) or non-validation (even with a Document
> Type Declaration present) rules will have been carried out first,
> before the XML stream is delivered for further application
> employment (e.g, via a validating DOM processor).  And there may

A processor is allowed to deliver data to the application before it has
finished checking well-formedness. Otherwise XML processors could not be
used in streaming contexts. However, if a well-formedness error is detected,
it must be signalled and the application must abort processing.
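
A sketch of that streaming behaviour, assuming Python's
xml.etree.ElementTree.XMLPullParser as the processor. The helper name
stream_events and the sample chunks are hypothetical. Events from the input
seen so far are handed over as they become available, and a later
well-formedness error is reported to the caller, which stops processing.

```python
import xml.etree.ElementTree as ET


def stream_events(chunks):
    """Feed chunks incrementally; return (events seen, error or None)."""
    parser = ET.XMLPullParser(events=("start", "end"))
    seen = []
    try:
        for chunk in chunks:
            parser.feed(chunk)
            # Consume whatever the parser has produced so far.
            for event, elem in parser.read_events():
                seen.append((event, elem.tag))
        parser.close()
        for event, elem in parser.read_events():
            seen.append((event, elem.tag))
    except ET.ParseError as exc:
        # Well-formedness error: signal it and abort processing.
        return seen, exc
    return seen, None


if __name__ == "__main__":
    # The second chunk closes <item> with the wrong end tag.
    seen, err = stream_events(["<root><item>one</item>", "<item>two</wrong></root>"])
    print(seen)
    print(err)
```

The application has already received events for the first item by the time
the error surfaces, so anything it did with them has to be undone or
discarded when the error is reported.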

> be a desire to carry DAV XML 1.0 in some neutral way as pure XML
> 1.0 documents.  So there is need for care here.
>
> ...
>
> 3.6	Coming back to the specific case of DTD,
>
> 	3.6.1	It might be more appropriate to specifically reject
> the use of Document Type Declarations in DAV XML 1.0 documents
> that are used in HTTP bodies and headers and certainly for a MIME
> type that asserts DAV application.

Why does it matter? The spec says that they must be ignored, and in real
life I have never seen a client or server sending a DOCTYPE. I don't think
there's an issue here.

> 	3.6.2 This deals with section 17.7 of the specification
> too, because there are therefore no external entities and a DAV
> processor does not have to accommodate such things (or any
> internal entities other than the standard default ones, &amp;
> &lt;, ..., plus the encoding notation for Unicode characters).
> [You might want to review the XML 1.1 Working Draft in this
> regard too, just to see if there is anything to anticipate here.]

--> <http://lists.w3.org/Archives/Public/w3c-dist-auth/2002OctDec/0148.html>

> ...


Regards, Julian

--
<green/>bytes GmbH -- http://www.greenbytes.de -- tel:+492512807760
Received on Saturday, 11 October 2003 07:04:19 UTC