Re: Tag Soup (was: FW: XHTML)

From: Murray Altheim <altheim@eng.sun.com>
Date: Wed, 08 Dec 1999 12:14:07 -0800
Message-ID: <384EBC0F.15535E8C@eng.sun.com>
To: Arjun Ray <aray@q2.net>
CC: www-html@w3.org

Arjun Ray wrote:
> 
> On Tue, 7 Dec 1999, Murray Altheim wrote:
> 
> > What in the hell are you trying to accomplish?
> 
> Not to put too fine a point on it (I do *not* mean to impugn anyone), a
> call for honest specs for HTML *as it is practiced*.

Well, not to get personal, as I often agree with your points of view, but
I'm simply trying to be pragmatic within the bounds of who and where we're
working. There are a lot of barriers to *intelligent* specification of
real-world markup, simply because there is no global logic to the markup
produced by a mass of unconnected individuals, all reading a variety of 
poorly-to-not-so-poorly written books about HTML. (Perhaps a good book
about HTML is not possible from an authoring perspective.)

> > What good does attempting to devalue attempts at cleaning up the mess
> > out there really do for the community?
> 
> IMHO, "cleaning up the mess" is the wrong way to look at either what's
> really going on out there or what's needed.  The Mess is *not* causeless;
> it *can* be rationalized; rationalizing it is the way to put closure on it,
> in the sense that people now know what it is, rather than feel that it's
> some protean monster given that the existing specs patently *fail* to
> acknowledge or account for it.

It can only be rationalized if it can be regularized. Think "canonical HTML"
in the same mold as the Canonical XML draft. No can do. 

> The Mess is here to stay.  The point is to develop *alternatives*.  The
> first thing an alternative needs is to be distinguishable.  From what?  A
> protean monster or an accurate description - those are the choices.

I would argue that this is impossible. There may be a way to develop a
spec (DTD or schema) that is lenient enough to handle "HTML as it is
practiced" but the only way to do that in XML is to have everything
declared as content model="ANY" and even that wouldn't do because most 
"HTML" markup isn't even well-formed. There is no model for what that 
crap is.
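
To make the well-formedness point concrete, here is a minimal sketch,
assuming Python's standard xml.etree.ElementTree parser stands in for
"an XML processor". The helper name and both sample strings are
hypothetical. No DTD is consulted at all, so even an all-"ANY" DTD
could not rescue the second sample: it fails before validation starts.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup: str) -> bool:
    """Return True if the string parses as a well-formed XML document."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# Hypothetical samples, not taken from any real page.
clean = "<p>Some <b>bold</b> text.</p>"
soup = "<p>Some <b>bold <i>text</b></i><br>"  # overlapping and unclosed tags

print(is_well_formed(clean))  # True
print(is_well_formed(soup))   # False
```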
 
> > Yes, the specs may be irrelevant
> 
> To current practice, desperately in need of *some* formal description,
> they are.

You can only have formal specification for crap if crap has commonalities.
And so far, the only thing I can tell you is that it smells, not what
shape, color or size it is.
 
> > and we're all misguided fools without a clue on a useless mission to
> > create twisted specifications that nobody reads
> 
> I appreciate why you (and others on the various W3C working groups) are
> put out, and I apologize for stating my case more strongly than was
> politic.  But the fact is that you are *developing* specs, you are not
> describing or formalizing current practice.  At the very least, the *fact
> that there is a difference* needs to be brought out.

Over the last few years I've come to the conclusion that the *only* 
possible and valuable thing one can do is to specify a document type.
Any forays into the muck of current practice are sure to fail, but
you're welcome to try. My feeling is that if you're going to declare
conformance requirements, you must do so in accordance with specs that
allow that to be determined by tools. A completely open conformance
spec is by definition useless. Anything conforms.
 
> If I wrote up a spec for Tag Soup, could it be accepted as a Note?

W3C Notes are not the purview of a specific WG; they are accepted or
rejected by the Big Man hisself. So go ahead and submit. No skin off
anyone's nose. Rick Jelliffe has been suggesting to me something he
calls "Open HTML" or "Open XHTML" (can't remember right now), with 
content models for all XHTML elements declared "ANY". It would allow
one to "validate" that all element types are those declared in a
DTD and that the attributes are allowed on specific elements. I can
see the value in that, but I'm not sure if allowing such a proposal
to move forward would in the end be counterproductive or not. 
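
As a rough sketch of what that kind of "validation" amounts to (my own
illustration, not Rick's actual proposal): check each element type and
attribute name against a declared vocabulary and ignore nesting
entirely. The VOCABULARY table below is a hypothetical, heavily
abbreviated stand-in for a real DTD, and the input is assumed to be
well-formed already.

```python
import xml.etree.ElementTree as ET
from typing import List

# Hypothetical, abbreviated vocabulary: element type -> allowed attributes.
VOCABULARY = {
    "html": set(),
    "body": set(),
    "p": {"class", "id"},
    "b": set(),
    "a": {"href", "class", "id"},
}


def open_validate(markup: str) -> List[str]:
    """Report undeclared element types and attributes; nesting is never checked."""
    problems = []
    for elem in ET.fromstring(markup).iter():
        allowed = VOCABULARY.get(elem.tag)
        if allowed is None:
            problems.append(f"undeclared element: {elem.tag}")
            continue  # attributes of an undeclared element are not examined
        for name in elem.attrib:
            if name not in allowed:
                problems.append(f"undeclared attribute: {elem.tag}/@{name}")
    return problems


# Nonsensical nesting passes, because every content model is in effect "ANY".
print(open_validate('<p><body><a href="x">link</a></body></p>'))
# An unknown element type is still caught.
print(open_validate("<p><blink>hi</blink></p>"))
```

The first call reports nothing even though a body inside a p makes no
sense, which is exactly the trade-off in question.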

I don't think it would be fair to call such a beast "XHTML" or relate
it to the formal spec of XHTML in any way. The W3C has *never* written
a formal definition of the term "tag set" or "vocabulary" that would
be useful in this context, and I think without that your Note would
be without formal grounding. Not to say it would be completely useless.
I've had a DTD for many years that I "auto-generated" (i.e., a bunch of
changes in vi) into an "ANY" DTD, but I wouldn't think of distributing
it because I think it would *damage* the idea of what a markup language
actually is. (So in effect, I've had a "tag soup" DTD for 3-4 years.)
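
For the curious, the same sort of mechanical rewrite can be sketched
in a few lines of Python rather than vi. This is an approximation under
stated assumptions, not the edits I actually made: it assumes XML-style
element declarations (no SGML tag-omission indicators or name groups),
and the function name and sample DTD are hypothetical.

```python
import re

# Matches one XML element type declaration and captures the element name.
ELEMENT_DECL = re.compile(r"<!ELEMENT\s+(\S+)\s+[^>]*>")


def to_any_dtd(dtd_text: str) -> str:
    """Replace every declared content model with ANY; ATTLISTs are left alone."""
    return ELEMENT_DECL.sub(r"<!ELEMENT \1 ANY>", dtd_text)


# Hypothetical three-line DTD fragment.
sample = """<!ELEMENT p (#PCDATA | b | a)*>
<!ATTLIST p class CDATA #IMPLIED>
<!ELEMENT br EMPTY>"""

print(to_any_dtd(sample))
```

Note that this also turns EMPTY elements into ANY; a gentler variant
might leave those declarations as they are.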

Murray

...........................................................................
Murray Altheim                                   <mailto:altheim@sonic.net>
Member of Technical Staff, Tools Development & Support
Sun Microsystems, Inc. MS MPK17-102
1601 Willow Rd., Menlo Park, California 94025  <mailto:altheim@eng.sun.com>

   the honey bee is sad and cross and wicked as a weasel
   and when she perches on you boss she leaves a little measle -- archy
Received on Wednesday, 8 December 1999 23:20:20 GMT
