Re: Is XHTML 1.0 2nd ed. Section 4.3 really informative? from ITO Tsuyoshi on 2001-12-24 (www-html@w3.org from December 2001)

From: ITO Tsuyoshi <tsuyoshi@is.s.u-tokyo.ac.jp>
Date: Mon, 24 Dec 2001 14:06:02 +0900 (JST)
To: www-html@w3.org
Message-Id: <20011224.140602.68556228.tsuyoshi@is.s.u-tokyo.ac.jp>
Dear list,

William F. Hammond <hammond@csc.albany.edu> wrote:

> > In addition, if the intent of Section 4 is to explain by examples what
> > are Conforming XHTML Documents and what are not, I think the sentence
> 
> No, that is not the intent of section 4.
> The title of section 4 (which is informative) is:
> "Differences with HTML 4" .

I agree.

> > If that restriction is not a necessary condition for Conforming XHTML
> > Documents but merely a suggestion for them to be compatible with
> > existing HTML parsers, the word ``must'' in the sentence of Section
> > 4.3 which I quoted before might be confusing:
> > > All elements other than those declared in the DTD as EMPTY must have
> > > an end tag.
> 
> It is a requirement; "must" is correct.

Which normative part of the Working Draft states that this restriction
is a requirement?

Let me summarize the issue.  Section 4.3, which is informative, says:
> All elements other than those declared in the DTD as EMPTY must have
> an end tag.
The problem I see in the current Working Draft is that it is unclear
whether this restriction is a mandatory requirement that a document
must meet to be a Strictly Conforming XHTML Document.  If it is, the
restriction should be stated somewhere in a normative part of the
Working Draft, but I cannot find it there.  If not, the word ``must''
should not be used.  I do not know which of the two the XHTML 1.0
specification intends.

I agree that it is good practice to always write end tags for such
elements even if their contents are empty, because many existing
``loose'' (or ``tag soup'') HTML parsers are likely to be confused by
shorthand representations for empty elements such as ``<span />''.
However, as you know, good practice is one thing and a mandatory
requirement is another.
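
To make the restriction concrete, here is a minimal Python sketch of a
check that flags the shorthand form for elements not declared EMPTY.
The function name, the regular expression and the constant names are
my own illustration and do not come from the Working Draft.  The set
lists the elements declared EMPTY in the XHTML 1.0 Strict DTD.  The
pattern is only a rough approximation: it assumes that no attribute
value contains ``>'' and it ignores comments and CDATA sections.

```python
import re

# Elements declared EMPTY in the XHTML 1.0 Strict DTD.
EMPTY_ELEMENTS = frozenset(
    ["base", "meta", "link", "hr", "br", "param", "img", "area", "input", "col"]
)

# Rough pattern for a minimized tag such as <span /> or <br/>.
# Assumption: no ">" inside attribute values, no comments or CDATA sections.
MINIMIZED_TAG = re.compile(r"<([A-Za-z][A-Za-z0-9:._-]*)(?:\s[^<>]*?)?\s*/>")


def find_minimized_nonempty(markup):
    """Return the names of minimized tags whose element is not declared EMPTY."""
    return [
        name
        for name in MINIMIZED_TAG.findall(markup)
        if name not in EMPTY_ELEMENTS
    ]
```

Applied to a fragment containing ``<br />'', ``<span />'' and
``<span></span>'', the function should report only the ``span'' that
is written in the shorthand form.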

> > > ITO Tsuyoshi <tsuyoshi@is.s.u-tokyo.ac.jp> wrote:
> > > > To me, it is natural to forbid shorthand representation such as
> > > > ``<span />'', because HTML 4.01 parsers might regard it as the
> > > > beginning of an element, look for the corresponding end tag and get
> 
> A user agent that does _correct_ parsing would never confuse HTML 4.01
> with any form of XHTML.  Most old widely distributed "browser class"
> user agents, however, do what is casually called tag soup parsing

What I meant by the paragraph you have just quoted is that if the
XHTML specification prohibits representations like ``<span />'', then
I understand the intent of the prohibition and approve of it.  I did
not mean to claim that prohibiting them (by using the word ``must'')
is better than leaving the choice to authors of XHTML documents; I
have no opinion on which is better, at least for now.  Nor did I mean
to discuss the implementation of ``correct'' and ``loose'' HTML
parsers.  In fact, I know little about HTML parser implementation.
Excuse me if I confused you.

> Please note further that writing
> 
>      "De facto empty bold tags written this way (<b />) are
>       legal but unwise."
> 
> in XHTML is very likely to confuse a "tag soup" agent.  The use of
> such markup is very unwise in XHTML.

I cannot see your point: where does this phrase come from, and why
would stating it in the XHTML specification confuse a loosely parsing
agent?  I think I must have misunderstood what you mean....

> [ IMHO it was a bad choice in the design of XML not to provide
>   syntactic delineation of defined-empty elements.  That is,
>   the forms "<foo></foo>" and "<foo/>", which are equivalent,
>   should not be equivalent.  The latter should have been necessary
>   and sufficient for "foo" to be a defined-empty element.          ]

Though I cannot say whether the current design of XML is good or bad,
for lack of experience with XML, I see your point: if
``<span></span>'' and ``<span/>'' were two different XML document
fragments, at least one of the compatibility issues between XHTML
documents and HTML parsers would be solved simply by allowing only
``<span></span>''.
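
For what it is worth, the equivalence of the two forms in XML as it
stands can be observed with an ordinary XML parser.  The following is
a minimal Python sketch using the standard library module
xml.etree.ElementTree; the helper name shape() and the sample
fragments are my own choices for illustration.

```python
import xml.etree.ElementTree as ET


def shape(elem):
    """Reduce an element to a comparable (tag, attributes, text, children) tuple."""
    return (
        elem.tag,
        tuple(sorted(elem.attrib.items())),
        elem.text or "",  # treat "no text" and "empty text" alike
        tuple(shape(child) for child in elem),
    )


# The same fragment, once with the shorthand form and once with an end tag.
minimized = ET.fromstring("<p><span/></p>")
explicit = ET.fromstring("<p><span></span></p>")

# An XML parser reports the same structure for both spellings.
same = shape(minimized) == shape(explicit)
```

Here ``same'' is expected to be true, which is why an XML processor
alone cannot enforce a rule that allows only ``<span></span>''.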

I hope this helps.

Best regards,
-- ITO Tsuyoshi  <tsuyoshi@is.s.u-tokyo.ac.jp> --
-- Department of Information Science           --
--                  in the University of Tokyo --
Received on Monday, 24 December 2001 00:06:05 UTC