One Lousy Attribute (was: XHTML Invalidity...)

On Tue, 12 Dec 2000, Sean B. Palmer wrote:

>> If/when folks get comfortable with schemas and
>> namespaces, we can drop the DTD gobbledygook at the top:

Yeah, scary isn't it?  Watch out for the Boogieman.

>>            <html xmlns="http://www.w3.org/1999/xhtml">
>>              [...]

This accomplishes precisely nothing, as was explained ad nauseam on
the www-xml-sig list.  (Isn't it *finally* time to make that archive
public?)

Instead of the handwaving with a URI of indeterminate (semantic)
content, we can do better *with* the, um, gobbledygook.  It has the
virtue of having been an international standard since 1986, and the
incidental benefit that we know what's going on with each URI.

1. In the internal subset:

  <!-- Declare formal definition of a machine-processable format -->
  <!NOTATION XSchema 
             SYSTEM "http://www.w3.org/2000/08/XMLSchema"
             >
  <!-- Declare an entity formulated in this *particular* format -->
  <!ENTITY   xhtml-schema 
             SYSTEM "http://www.w3.org/1999/xhtml/schema"
             CDATA XSchema 
             >
  <!-- Enable a reference to this entity -->
  <!ATTLIST  xhtml
             foo    ENTITY    #IMPLIED
             >

2. In the instance:

   <!-- Reference the entity, at the same time pinpointing which
        particular formalism was used -->
   <xhtml foo="xhtml-schema">
   ...

[I named the attribute 'foo' deliberately to make the point that the
notation on the entity tells you all that you *need* to know: that a
schema of some kind exists (as the contents of the entity) and also,
much more importantly, which particular formal system it's encoded
in.  Given that, the mere name of the instance attribute need not be
relevant.  One *could* formalize it using a reserved name, e.g. one
with an 'xml' prefix, though.]
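
[For XML rather than SGML processors, a rough counterpart of the above
replaces CDATA with NDATA in the entity declaration.  The sketch below
is illustrative only: it uses Python's expat binding, and the names DOC
and resolve_schema are made up for the example.  It collects the
notation, the unparsed entity, and the ENTITY-typed attribute from the
internal subset, then reports which schema, in which formalism, each
element points at.]

```python
import xml.parsers.expat

# XML rendering of the declarations above (assumption: NDATA in place of CDATA).
DOC = """<?xml version="1.0"?>
<!DOCTYPE xhtml [
  <!NOTATION XSchema SYSTEM "http://www.w3.org/2000/08/XMLSchema">
  <!ENTITY xhtml-schema SYSTEM "http://www.w3.org/1999/xhtml/schema"
           NDATA XSchema>
  <!ATTLIST xhtml foo ENTITY #IMPLIED>
]>
<xhtml foo="xhtml-schema"/>
"""


def resolve_schema(doc):
    """Return (element, entity system id, notation system id) triples."""
    notations = {}        # notation name -> system identifier
    entities = {}         # unparsed entity name -> (system id, notation name)
    entity_attrs = set()  # (element, attribute) pairs declared as type ENTITY
    refs = []

    def on_notation(name, base, system_id, public_id):
        notations[name] = system_id

    def on_entity(name, is_param, value, base, system_id, public_id, notation):
        # Only unparsed entities carry a notation name.
        if notation is not None:
            entities[name] = (system_id, notation)

    def on_attlist(elname, attname, att_type, default, required):
        if att_type == "ENTITY":
            entity_attrs.add((elname, attname))

    def on_start(tag, attrs):
        # The attribute's name is irrelevant; its declared type is what counts.
        for attname, value in attrs.items():
            if (tag, attname) in entity_attrs and value in entities:
                system_id, notation = entities[value]
                refs.append((tag, system_id, notations.get(notation)))

    parser = xml.parsers.expat.ParserCreate()
    parser.NotationDeclHandler = on_notation
    parser.EntityDeclHandler = on_entity
    parser.AttlistDeclHandler = on_attlist
    parser.StartElementHandler = on_start
    parser.Parse(doc, True)
    return refs


if __name__ == "__main__":
    for element, schema_uri, formalism_uri in resolve_schema(DOC):
        print(element, schema_uri, formalism_uri)
```

[Nothing here fetches or applies the schema; the point is only that an
ordinary parser already hands the application both URIs, each with a
declared role.]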

As a matter of fact, with the new provisions in the WebSGML TC, the
entity declaration can be skipped in favor of a direct reference *as*
the external subset:

 <!DOCTYPE xhtml SYSTEM "http://www.w3.org/1999/xhtml/schema" 
           CDATA XSchema [
     <!NOTATION XSchema
             SYSTEM "http://www.w3.org/2000/08/XMLSchema"
             >
 ]>

But not quite yet in XML, admittedly.  See K.4.10.3 in

  http://www.ornl.gov/sgml/sc34/document/0029.htm
 
> Hopefully we will be able to say "when", but we all know that DTDs
> aren't going to just "vanish".

Denigration and disparagement have been known to work.  Meanwhile, the
idea is to poison as many things as possible with colonified names,
just to make recovery more costly (and hence deferred/avoided) -
nothing like making a virtue out of, um, "necessity".

> I can't help remembering who *suggested* we put that "DTD
> gobbledygook" at the top in the first place(!):

Water under the bridge.  We've all come a long way since then.

> > I'm asking for one lousy attribute,
> > xmlns="http://www.w3.org/1999/xhtml"
> 
> I don't believe it will *be* just one lousy attribute, because
> validating by namespace is only a very small part of the list of
> options in the XML Schema specification.

Yep.  And that's only the beginning.

> I agree that it may be the best way, but it's not the only way.
> You will have to allow for xsi:SchemaLocation, which will bump up
> the prologue aspect of validation again. You can't use "the file
> gets smaller and simpler" as an excuse to use XML Schemas for
> XHTML: instead use the fact that they have certain benefits over
> XML DTDs.

Not to mention that DTD syntax can be used to invoke the processing
of schemas in reliable ways.  See above.

>> the HTML modularization spec shows you how to add your own module
>> and mix it in with the standard  modules. I don't care for that
>> approach, 

With you there, but for different reasons:)

>> because it's limited in all the ways that linking two C modules
>> are limited:

It's about time this overplayed (and bogus) analogy were put to sleep.

>> one big unmanaged centralized namespace, no "first class" modules 
>> recognized by the compiler (the validator). 

A validator is not a compiler, because a validator is a recognizer,
not a parser.

>> But it has the virtue that existing validating XML processors
>> can be used for validation.

XML "validation" is useless.  It's crippled in relation to SGML
facilities, and even SGML has moved beyond the implicit assumption of
an encompassing architecture.

> I don't get your qualms there, "one big unmanaged centralized
> namespace"?

The "name clash" non-problem that SGML-bashers are still getting
mileage out of.  For an excellently written, neutral presentation of
the basic "problem", see, e.g., Q8.5 in the Namespaces FAQ:

  http://www.rpbourret.com/xml/NamespacesFAQ.htm

For a demonstration that the problem is illusory, using tools that
have existed since before XML was invented, see

  http://www.nyct.net/~aray/sgml/demo/dept/  

> It's a good approach. That way, instead of just declaring the
> namespaces in a DTD and not using them, you don't have a DTD at
> all and simply use the namespace to point to your Schema.

Except it has the same problem as the STYLE attribute (and to a large
extent, the STYLE and SCRIPT elements in HTML): an implicit assumption
that there is only *one* "relevant" formalism.  My prediction is that,
just like SCRIPT and STYLE, XSchema will sooner or later run into the
problem of "versions".  For which, of course, yet another reserved
attribute will have to be invented...

And now, to really give the W3C folks heebiejeebies, may I suggest a
perusal of this tutorial:

  http://www.isogen.com/papers/archintro.html

I also discovered, by chance, in an old directory on an old machine,
this piece by Eliot Kimber, a copy of which I can't locate elsewhere.

  http://www.nyct.net/~aray/notes/rdfarch.html

(It was written in Aug 97, and references the 970801 draft of the RDF
spec.  It's mainly a source of ideas now, since the details have
changed.)

Enjoy!


Arjun

-- 
"The bottomline is that it is really difficult to solve a problem
 when the problem does not exist." - Masataka Ohta.

Received on Saturday, 16 December 2000 01:01:38 UTC