- From: Murray Altheim <altheim@eng.sun.com>
- Date: Tue, 18 Jan 2000 15:17:58 -0800
- To: "W. Eliot Kimber" <eliot@isogen.com>
- CC: www-html@w3.org
"W. Eliot Kimber" wrote: > Murray Altheim wrote: [...] > I'm not objecting to enabling validation. I'm only objecting to the > requirement that conforming XHTML documents must use a particular form > of DOCTYPE declaration (or even have a doctype declaration at all). If we eliminate the requirement on DTD validation, we not only allow for other types of validation, we loosen the conformance requirements to the point where it becomes *less* possible to ascertain that a given instance is XHTML, not more. Perhaps there's some way to state: "if you have some type of validation that works *better* than DTDs at validating against the XHTML document type, then go for it with our blessing," but I can't think of any way to do that. If we relax the requirement for DTD validation, then *anything* goes, and an unbounded definition is no definition at all, not in this environment. We have a very limited set of tools available to us. > That is, the issue of being able to do validation and the issue of > knowing without ambiguity whether or not a given document claims to be > an XHTML document are two separate issues. Understood. We're simply tackling the part that the toolkit we've been given provides us. It doesn't keep someone from sticking an entire document in a <p>, but then, if you look you'll see very little by way of definitions of document structure in any HTML specification, and you will find almost *none* in DocBook. They only define the structures available in the type, not the type itself. Ie., the definition is a container or series of containers, not a structure in itself. There's nothing in HTML 4 that says it's *wrong* to put a whole document in a <p>, nothing in DocBook that says you can't use 50 <Sect4>'s as list items, with a <Sect3> as a list container. So in the sense that XHTML does not prescribe a strict document structure, validating that the markup structures makes sense is pretty close to the limit of what can be done. 
That a <li> is in a <ul> or a <ol>. Not great, not something to write home to mom about, but it's what we have.

> That is, a document claims to conform to a type. You then have the
> option of validating the document against the rules of that type. Using
> a set of DTD declarations is part of that validation (but not all of
> it).

Yes, I do understand (completely) the limitations of restrictions that only affect a portion of the type definition. And I agree that Sun has much more restricted (and higher quality) authoring by trained writers using very high quality tools and a lot of support. It's one of the reasons I like it here. And yes, you're correct, this is very much due to people like Jon Bosak, Mike Rogers, Bill Smith, Eduardo Gutentag and others, all champions of SGML and XML at Sun.

[...]

> > > You do have an alternative: a namespace use declaration with a
> > > meaning defined by the XHTML spec.
> >
> > And that would make XHTML different from every other XML markup language.
>
> No, I think it would make it consistent with many markup languages. The
> only difference between a namespace declaration and an architecture use
> declaration is the explicit ability to bind an architecture to a set of
> architectural DTD declarations.
>
> I also point out that architectural processing *is currently
> implemented* through SAXARCH, so it's not like there's no tools support
> for architectures, just that it's not yet in IE5 (as far as I know).

Yes, but as you know, the acceptance (or understanding) of such technology is highly limited within the W3C. You guys speak a foreign language understood by few people. SAX and SAXARCH are not standardized APIs even if the former is de facto part of most parsers. If a highly simplified AF API were proposed to the W3C and we had it in our toolkit, this would be an entirely different situation. But we don't, and to me this is all just speculation until we do.
If you want to propose such a feature (e.g., a standardized PI to attach to documents as a link to an AF declaration, so that every author doesn't have to create their own), then I'd be happy to work with you to champion its acceptance within the W3C. But you probably know as well as I this is unlikely to be accepted.

> > > Just because XML provides the optional feature of DOCTYPE declarations
> > > doesn't mean that XHTML is obligated to require their use or impart any
> > > special meaning to their use when there are other ways to get what you
> > > want which are reliable.
> >
> > Such as? Supported by what tools? Something that given a cold day in
> > hell would be accepted by the W3C? If we don't use validation via DTD
> > we have no acceptable means to establish a document type at all.
>
> That's my point: *you have no means* provided by XML 1.0.

No, we *do* have DTDs. That's why we're using them. They don't provide the type validation you're talking about, but they do guarantee that the markup in the document conforms to the DTD.

> Therefore you
> (and all other standardized XML applications) must do something else.
> The only question is what? XHTML is, by its nature, a groundbreaking
> application (just as HTML was).
>
> > Perhaps you can clarify this for me: I have thousands of valid SGML
> > documents that conform to document type definitions, using DOCTYPE
> > declarations.
>
> No, you have thousands of documents that conform to document type
> *declarations* that may or may not conform to document type
> *definitions*. The conformance to the latter *is not in any way*
> indicated by conformance to the former. Likewise, failure to conform to
> the former does not necessarily indicate failure to conform to the
> latter.

I beg to disagree. You are correct that the possibility to abuse markup exists. When I go into a hospital, there are all sorts of sharp objects and drugs that could kill me.
But the medical staff uses those tools with care and educated intention. If I thought they were deliberately going to kill me, I'd never enter a hospital. The question revolves around intention and negligence, not malfeasance.

SGML and XML are too flexible to not allow loopholes that can be deliberately abused. I don't expect to catch those kinds of errors in all cases; validation is not a security system. If we assume that authors are well-intentioned but ignorant or careless, then DTD validation provides a pretty good measure of how the structure of a document's markup matches the declared type. Yes, I understand the limitations of this. But as a machine process it is the one best shot we have. I can't guarantee that a paragraph contains one idea, but at least I can be sure a paragraph element isn't inside the head of a document, or that other paragraphs don't occur in a paragraph, or that I didn't misspell a tag name. This is the kind of validation that is valuable to many people. I wouldn't chop off my arm simply because it isn't strong enough to lift something heavy.

If in my environment I'm concerned about authors abusing the DTDs, I'll make sure the tools don't allow modification of them, either directly (chmod) or via tool validation that prohibits any modification of the prolog, or only general entities, etc. This becomes more a management issue than a technical one.

[regarding AF declarations....]

> But not to the beginning of the DTD, to the document, whether or not
> there's a DTD.

But it would be *okay* to be in the DTD, correct? The module could also then be used as an entity in its own right, and *if* some mechanism for declaring it at the beginning of a document were available, we could provide information on how to do it. Is there any problem with the declarations occurring twice (i.e., if I were able to lobby for default inclusion in the XHTML 1.1 DTD but somebody were to also include it via some other method)?
If you have something to propose in this regard, send it into the W3C and perhaps they can standardize a method for XML documents. Or send it into OASIS. (!)

> > Since the beginning of XHTML m12n there's been an empty XHTML
> > module named "XHTML 1.1 Base Architecture" whose content looks
> > like this:
>
> This is cool. Keep it in.

Yes, but it's incomplete. Undone. Needs work. Space for rent. Help wanted.

Murray

...........................................................................
Murray Altheim, SGML Grease Monkey <mailto:altheim@eng.sun.com>
Member of Technical Staff, Tools Development & Support
Sun Microsystems, 901 San Antonio Rd., UMPK17-102, Palo Alto, CA 94303-4900

   the honey bee is sad and cross and wicked as a weasel
   and when she perches on you boss she leaves a little measle -- archy
Received on Tuesday, 18 January 2000 18:17:01 UTC
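
The kind of structural checking discussed in the message above (a paragraph is not inside the head, paragraphs do not nest, tag names are not misspelled) can be illustrated with a small sketch. This is not a DTD validator and is not taken from the thread: the `ALLOWED_CHILDREN` table and the `check_structure` function are hypothetical names, and the table is a hand-written toy subset that only loosely resembles the XHTML content models. It uses Python's standard `xml.etree.ElementTree`, which checks well-formedness only, so the nesting rules are applied by hand.

```python
import xml.etree.ElementTree as ET

# Hypothetical toy rules: parent element -> child elements it may contain.
# A real DTD also expresses order, occurrence counts, text content, and attributes.
ALLOWED_CHILDREN = {
    "html": {"head", "body"},
    "head": {"title"},
    "title": set(),
    "body": {"p", "ul", "ol"},
    "p": set(),
    "ul": {"li"},
    "ol": {"li"},
    "li": {"p", "ul", "ol"},
}


def check_structure(xml_text):
    """Return a list of structural error messages (empty if none are found).

    Raises xml.etree.ElementTree.ParseError if the input is not well-formed.
    """
    errors = []
    root = ET.fromstring(xml_text)
    if root.tag not in ALLOWED_CHILDREN:
        errors.append(f"unknown element <{root.tag}>")
    stack = [root]
    while stack:
        parent = stack.pop()
        # None means the parent itself is unknown, so placement is not judged.
        allowed = ALLOWED_CHILDREN.get(parent.tag)
        for child in parent:
            if child.tag not in ALLOWED_CHILDREN:
                errors.append(f"unknown element <{child.tag}>")
            elif allowed is not None and child.tag not in allowed:
                errors.append(f"<{child.tag}> not allowed inside <{parent.tag}>")
            stack.append(child)
    return errors


if __name__ == "__main__":
    samples = [
        "<html><head><title>t</title></head><body><p>x</p></body></html>",
        "<html><head><p>x</p></head><body/></html>",
        "<html><body><p><p>x</p></p></body></html>",
        "<html><body><pp>x</pp></body></html>",
    ]
    for sample in samples:
        print(check_structure(sample))
```

Note that a document whose entire text sits in a single <p> passes this check without complaint, which is the limitation the message concedes: checks of this kind confirm that the markup structures fit together, not that the document is sensibly organized.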