Re: Draft description of new TAG issue TagSoupIntegration-54 from Bjoern Hoehrmann on 2006-10-25 (www-tag@w3.org from October 2006)

From: Bjoern Hoehrmann <derhoermi@gmx.net>
Date: Thu, 26 Oct 2006 01:54:18 +0200
To: ht@inf.ed.ac.uk (Henry S. Thompson)
Cc: www-tag@w3.org
Message-ID: <cglvj2tvf1cls1qp7knk7rq44i2f4jumff@hive.bjoern.hoehrmann.de>
* Henry S. Thompson wrote:
>On its telephone conference earlier today, the TAG agreed to open a
>new issue, TagSoupIntegration-54.  This message contains a first draft
>of the description of this issue for the issues list [1]. Comments and
>suggested changes are invited, as experience to date suggests that
>getting a satisfactory definition of exactly what's at issue here is
>tricky.

I would expect such text to present issues in clear technical terms, in
the context of the group's charter, and to be phrased so that resolutions
would have tangible practical impact. Instead of doing that, your text
deliberately spreads fear, uncertainty, and doubt. A good example is the
remark you included at the end of the issue:

>Estimates of the percentage of HTML-family web-pages currently being
>served which are neither well-formed XML nor SGML-valid HTML vary widely:
>a quick sample of reports gives 1.5%, 80%, 82%, 91%, 97.8%, 99% and 99.3%
>for different sample spaces and different times!

You are trying to abuse random numbers to create the impression we do
not have any remotely accurate figures on the subject matter. In doing
so you rely on undefined terms like "HTML-family web-pages" and, as you
do in most parts of your proposal, pleonasms like "well-formed XML" and
"SGML-valid HTML", and you are trying to present a trivial fact (if you
evaluate insufficient samples, your conclusions about the statistical
population will vary widely, for they are fallacious) to support your
appeal to fear.

Another good example is the following:

>Heretofore W3C official policy has been not only to encourage the
>'withering away' of non-XML content on the Web, but to insist on it.

Here you invent a policy to support your cause. It is trivially evident
that no such policy can exist if you simply look at W3C's own Technical
Report Publication Policy: it neither insists on nor encourages the use
of any form of XML in W3C Technical Reports. Besides, compared to non-XML
content, there is virtually no XML content on the Web, so even if there
were a policy as you claim, it would be completely ridiculous.

Let's look at another example:

> * Should "as if" number (2) above be extended (contra recent TAG
>   finding Authoritative Metadata [2]) to include some form of
>   'sniffing'?

This is an appeal to ridicule; if you assume that TAG findings represent
common sense, acting contrary to them would appear to be foolish. As you
present "'sniffing'" as contrary to the finding, you pretty much
predetermined the answer to your question: there must be no 'sniffing'.
Of course, whether such "'sniffing'" would be contrary to the finding
depends on the definition of "'sniffing'", which you carefully avoided
providing. It appears that you are trying to maintain a high level of
cognitive distortion throughout your draft.

I understand from your draft that you've never been involved in any of
the discussions around the subject matter; otherwise you would know that
these discussions are incredibly confused because the participants rely
on terms whose meaning they have not agreed upon, and as a consequence
you would have established crystal clear terminology to support the
discussion--assuming you are actually interested in discussion.

>*By 'tag soup' HTML is meant documents which are not well-formed
>XHTML, or even SGML-valid HTML, but which none-the-less are
>more-or-less successfully and consistently rendered by some HTML
>browsers.

As an example, to quote the former HTML Working Group's Chair, "if I
serve a document as text/html, I am asking for it to be processed as
HTML"; then, given a document like http://www.w3.org/, in order to
determine whether the document is "SGML-valid", you use the HTML 4.01
SGML declaration, and find that http://www.w3.org/ is 'tag soup', per
your definition. It is of course incorrect to process HTML documents
with a conforming SGML parser--as anyone who has read the HTML 4.01
Recommendation would know--so the relevance of "SGML-valid" pretty
much escapes me.

So in conclusion I'd rather call this TAGSoup.
-- 
Björn Höhrmann · mailto:bjoern@hoehrmann.de · http://bjoern.hoehrmann.de
Weinh. Str. 22 · Telefon: +49(0)621/4309674 · http://www.bjoernsworld.de
68309 Mannheim · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/
Received on Wednesday, 25 October 2006 23:54:33 UTC