- From: W. Eliot Kimber <eliot@isogen.com>
- Date: Wed, 19 Jan 2000 10:22:03 -0500 (EST)
- To: www-html@w3.org
Murray Altheim wrote:
>
> "W. Eliot Kimber" wrote:
> > Murray Altheim wrote:
> [...]
> > I'm not objecting to enabling validation. I'm only objecting to the
> > requirement that conforming XHTML documents must use a particular form
> > of DOCTYPE declaration (or even have a doctype declaration at all).
>
> If we eliminate the requirement on DTD validation, we not only allow
> for other types of validation, we loosen the conformance requirements
> to the point where it becomes *less* possible to ascertain that a
> given instance is XHTML, not more.

I think this is the disconnect: it's not a question of *determining*
whether or not a given document conforms to the XHTML spec, it's a
question of a document being able to unambiguously *assert* that it
*is* an XHTML document. Validating the veracity of that assertion is a
separate subject.

From the point of view of a processor, you need to be able to
unambiguously distinguish documents that are XHTML documents from
documents that are not XHTML documents. Having found a document that
asserts it is an XHTML document, you may then *choose* to validate it
however you see fit, including using the XHTML-provided DTD
declarations, some functionally-equivalent schema spec, or
purpose-built code that happens to embody the rules of XHTML. But the
validation is secondary (in the sense that useful processing can be
done without validation).

My point is that a DOCTYPE declaration cannot serve as the
*unambiguous* assertion of XHTMLness (although of course it can enable
validation against the syntactic rules of XHTML). Stress on the word
"unambiguous." We take the use of particular external subsets (or
rather, the use of particular identifiers for external subsets) as the
assertion of typeness, but that assertion has potential ambiguity for
all the reasons I've stated. If something is ambiguous it cannot be
relied on to drive computer processing and should therefore be avoided
if at all possible.
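
As an illustration of one source of that ambiguity, here is a minimal
Python sketch, assuming the XHTML 1.0 Strict public and system
identifiers stand in for "a particular external subset". The function
name `doctype_summary` and the two sample documents are hypothetical.
It only shows that two documents can carry identical external subset
identifiers while one of them adds declarations in an internal subset:

```python
import xml.parsers.expat

# Identifiers used purely as an example of "a particular external subset".
XHTML_PUBLIC_ID = "-//W3C//DTD XHTML 1.0 Strict//EN"
XHTML_SYSTEM_ID = "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"


def doctype_summary(xml_text):
    """Return what the DOCTYPE declaration says, or None if there is none.

    expat is non-validating and does not fetch the external subset here,
    so this reports only what the declaration itself states.
    """
    found = {}

    def on_doctype(name, system_id, public_id, has_internal_subset):
        found.update(
            name=name,
            public_id=public_id,
            system_id=system_id,
            has_internal_subset=bool(has_internal_subset),
        )

    parser = xml.parsers.expat.ParserCreate()
    parser.StartDoctypeDeclHandler = on_doctype
    parser.Parse(xml_text, True)
    return found or None


# Hypothetical sample documents: same external identifiers, but the
# second one declares an extra element in its internal subset.
PLAIN_DOC = (
    f'<!DOCTYPE html PUBLIC "{XHTML_PUBLIC_ID}" "{XHTML_SYSTEM_ID}">'
    "<html><body/></html>"
)
EXTENDED_DOC = (
    f'<!DOCTYPE html PUBLIC "{XHTML_PUBLIC_ID}" "{XHTML_SYSTEM_ID}" '
    "[ <!ELEMENT blink (#PCDATA)> ]>"
    "<html><blink>hi</blink></html>"
)
```

A processor that keys only on the public identifier would treat
`PLAIN_DOC` and `EXTENDED_DOC` as the same type, even though the second
has changed the effective declaration set.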

> Perhaps there's some way to state:
> "if you have some type of validation that works *better* than DTDs at
> validating against the XHTML document type, then go for it with our
> blessing," but I can't think of any way to do that. If we relax the
> requirement for DTD validation, then *anything* goes, and an unbounded
> definition is no definition at all, not in this environment. We have a
> very limited set of tools available to us.

Again, my point isn't about *validation*, it's about assertion of type
membership.

> > > > That's my point: *you have no means* provided by XML 1.0.

The means I'm talking about is the means to assert type membership.
This discussion started because it was proposed that *type membership*
be indicated by disallowing the use of internal subsets and requiring
the use of a particular external subset URI. Those sets of restrictions
*do* provide reliable type assertion...

...but...

...the reason that Arjun and I objected to them in principle is that

A) it's a set of restrictions that XML 1.0 provides no way to state
   (and that therefore normal XML parsers cannot detect or enforce),
B) it unnecessarily restricts authors' choice about how to manage the
   DOCTYPE declarations of XHTML documents, and
C) it propagates the mistaken idea that DOCTYPE declarations assert
   type membership.

It's not about validation--whether or not a document is validated is
always the choice of the document receiver. It is inappropriate for a
general-purpose standard to impose a validation policy on users of the
standard. The standard must *enable* validation, but it cannot require
it. So validation can't be the issue here.

The issue is: do document authors have a clear way to assert type
membership of XHTML documents and do processors have a clear way to
detect type membership and, if so, does that mechanism impose any
unnecessary or inappropriate constraints?
My assertion is that the required use of external declaration subsets
fails the last part of this test: it imposes unnecessary and
inappropriate constraints on document authors.

> SGML and XML are too flexible to not allow loopholes that can be
> deliberately abused. I don't expect to catch those kinds of errors
> in all cases; validation is not a security system. If we assume that
> authors are well-intentioned but ignorant or careless, then DTD
> validation provides a pretty good measure of how the structure of
> a document's markup matches the declared type. Yes, I understand
> the limitations of this. But as a machine process it is the one
> best shot we have.

But part of my argument about architectures is that they provide a
type membership assertion mechanism that *cannot be subverted*. The
owner of the architecture gets to define *a single name* by which the
architecture is referenced and can provide a set of DTD declarations
that cannot be modified by document authors. This means that
processors can first detect an unambiguous type membership assertion
(the architecture use declaration that uses the architecture name) and
then, if desired, do normal XML syntactic validation of the document
against the architectural DTD. There is no possibility of subversion
of this process by document authors in ways that cannot be easily
detected (such as hacking the URI resolver that fetches the
architectural DTD, which should, ideally, exist in exactly one place).

> [regarding AF declarations....]
> > But not to the beginning of the DTD, to the document, whether or not
> > there's a DTD.
>
> But it would be *okay* to be in the DTD, correct?

Yes, it's fine for it to be in the DTD, but that can't be the *only
place* it's allowed by the XHTML spec.
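
The "detect the assertion first, validate later if you want" step can
be sketched roughly as follows. This is only a sketch under stated
assumptions: the PI target `IS10744:arch` and its `name`
pseudo-attribute are my reading of the XML form of the architecture
use declaration and should be checked against ISO/IEC 10744; the
architecture name `xhtml` and the function names are hypothetical.

```python
import re
import xml.parsers.expat

# ASSUMPTION: PI target for an architecture use declaration in XML
# syntax. Check ISO/IEC 10744 for the normative form.
ARCH_PI_TARGET = "IS10744:arch"

# Matches pseudo-attributes such as name="xhtml" inside the PI data.
_PSEUDO_ATTR = re.compile(r'([\w.:-]+)\s*=\s*(["\'])(.*?)\2', re.S)


def declared_architectures(xml_text):
    """Return the architecture names asserted by PIs before the root element."""
    names = set()
    state = {"in_prolog": True}

    def on_start_element(name, attrs):
        # Once the document element starts, later PIs are ignored.
        state["in_prolog"] = False

    def on_pi(target, data):
        if state["in_prolog"] and target == ARCH_PI_TARGET:
            attrs = {key: value for key, _quote, value in _PSEUDO_ATTR.findall(data)}
            if "name" in attrs:
                names.add(attrs["name"])

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = on_start_element
    parser.ProcessingInstructionHandler = on_pi
    parser.Parse(xml_text, True)
    return names


def asserts_type(xml_text, arch_name):
    """True if the document asserts membership in the named architecture."""
    return arch_name in declared_architectures(xml_text)


# Hypothetical document: no DOCTYPE declaration at all, yet it still
# carries a type membership assertion.
NO_DOCTYPE_DOC = '<?IS10744:arch name="xhtml"?><html><body/></html>'
```

Here `asserts_type(NO_DOCTYPE_DOC, "xhtml")` is true without any
DOCTYPE declaration being present, and nothing in the check involves
validation.
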
I also need to be able to have documents with no DOCTYPE declaration
that have architecture use PIs (or the equivalent element-based syntax
à la namespaces, which I suppose we could amend 10744 to provide if the
W3C powers that be simply will not accept PIs). But remember, what's
important is type membership, which can be asserted syntactically in
any number of ways, including through the normative use of a namespace
declaration.

> Is there any problem
> with the declarations occurring twice (ie., if I was able to lobby
> for default inclusion in the XHTML 1.1 DTD but somebody were to also
> include it via some other method?).

No, no problem as far as ISO/IEC 10744 is concerned.

> If you have something to propose
> in this regard, send it into the W3C and perhaps they can standardize
> a method for XML documents. Or send it into OASIS. (!)

It's worth thinking about.

> > > Since the beginning of XHTML m12n there's been an empty XHTML
> > > module named "XHTML 1.1 Base Architecture" whose content looks
> > > like this:
> >
> > This is cool. Keep it in.
>
> Yes, but it's incomplete. Undone. Needs work. Space for rent. Help wanted.

I'll see if I can help here, but I've got a lot of standards work
already on the stack...

Cheers,

E.
Received on Thursday, 20 January 2000 05:28:41 UTC