Re: [XHTML2] CITELANG, TITLELANG attributes from Ian Hickson on 2004-07-28 (www-html@w3.org from July 2004)

From: Ian Hickson <ian@hixie.ch>
Date: Wed, 28 Jul 2004 12:17:32 +0000 (UTC)
To: "Jukka K. Korpela" <jkorpela@cs.tut.fi>
Cc: www-html@w3.org
Message-ID: <Pine.LNX.4.58.0407281146150.2401@dhalsim.dreamhost.com>

On Wed, 28 Jul 2004, Jukka K. Korpela wrote:
>>
>> If the title is the textual content of a child element, you have to:
> [ do something more complex ]
>
> I'm not sure I understand the complexity. Can't a parser simply recognize
> each <title> element as it sees it and associate it with the internal data
> structure corresponding to the parent element?

HTML pages are dynamic. Script can dynamically modify elements,
attributes, and so forth, all on the fly, even while the "title" is being
displayed (and then authors expect the UI to be updated on the fly too).

So no. The parser can't do any of this; the parser can only create the
initial DOM tree.
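
To make that concrete, here is a minimal sketch in Python using the
standard xml.dom.minidom module as a stand-in for a browser DOM. The
child <title> element and the child_title() helper are hypothetical,
made up for illustration only. A value associated at parse time goes
stale as soon as a script touches the tree:

```python
from xml.dom.minidom import parseString

def child_title(element):
    """Return the text of the first <title> child, or None (hypothetical helper)."""
    for child in element.childNodes:
        if child.nodeType == child.ELEMENT_NODE and child.tagName == "title":
            return "".join(
                n.data for n in child.childNodes if n.nodeType == n.TEXT_NODE
            )
    return None

doc = parseString('<a href="x"><title>Old</title>link text</a>')
link = doc.documentElement

# What a parse-time association would have stored.
cached = child_title(link)

# A script later rewrites the title element's text.
title_el = link.getElementsByTagName("title")[0]
title_el.firstChild.data = "New"

# Only a fresh walk of the live tree sees the change.
live = child_title(link)
```

Here "cached" still holds the old text while "live" holds the new one,
so the UA has to watch for mutations instead of relying on the parser.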


>> To summarise, elements are _hard_.
>
> I still don't see the problem, I'm afraid, but if elements are _hard_,
> then the problem is in the very idea of markup, which revolves around
> elements. Attributes are just properties of elements. If you change
> something that is in essence a container for textual data (which might
> need some inline markup), hence something that should be an element in
> markup, into an attribute containing plain text, for efficiency of
> implementation, then I think it's time to consider where this all would
> end.

It's pretty simple, really. As a general rule, things you want to have
render inline should be in elements, and things you want to have render in
UI should be in attributes. In other words: When the order of the content
doesn't really matter, or is directly mapped by CSS, when the presence or
absence of something is merely input to the rendering model, and when you
are expecting to render things with sub-pixel accuracy, then element
content is fine. When you are expecting to get the data into the form of a
simple string, for passing to API functions or submitting a string or
similar, then an attribute is more appropriate.

IMHO, anyway.
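
A rough sketch of the difference, again in Python with
xml.dom.minidom. The markup and the text_content() helper are
assumptions for illustration, not anything from a draft. Getting a
plain string out of an attribute is one call; getting one out of
element content means walking the subtree and flattening any inline
markup:

```python
from xml.dom.minidom import parseString

def text_content(node):
    """Flatten all descendant text nodes into one string (hypothetical helper)."""
    if node.nodeType == node.TEXT_NODE:
        return node.data
    return "".join(text_content(child) for child in node.childNodes)

doc = parseString(
    '<a href="x" title="Der Prozess">'
    "<title>Der <em>Prozess</em></title>"
    "</a>"
)
link = doc.documentElement

# Attribute: already a simple string, ready to hand to an API.
from_attribute = link.getAttribute("title")

# Element: the string has to be derived from a subtree.
title_el = link.getElementsByTagName("title")[0]
from_element = text_content(title_el)
```

Both end up as the same string here, but the second one had to throw
away the inline markup to get there.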


>> Note that simply saying "it must be the first element" or "you must not
>> nest these elements" and so forth doesn't get you out of any of this,
>> since it is trivial to mutate the DOM to get it into these states. The
>> behaviour has to be well-defined in all these cases.
>
> Sorry, I fail to see the point here. Surely XHTML specifications need to
> define the semantics of valid constructs only.

That's the mentality that got us into the Tag Soup mess -- by not defining
what should happen when the author makes a mistake, you end up forcing
every UA to copy the market leader's error handling.

Specifications should define what UAs should do in _any_ scenario. The
CSS, XML, and SVG specs are quite well defined in that regard. The HTML
specs have traditionally been quite vague in that area.
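
As a sketch of what "well-defined in any scenario" could look like,
here is one possible rule written out in Python with xml.dom.minidom.
The rule itself (first <title> child wins, no such child means the
empty string) and the effective_title() helper are hypothetical; the
point is only that the function has an answer for every tree state a
script can produce:

```python
from xml.dom.minidom import parseString

def effective_title(element):
    """Hypothetical total rule: first <title> child wins; otherwise ''."""
    for child in element.childNodes:
        if child.nodeType == child.ELEMENT_NODE and child.tagName == "title":
            return "".join(
                n.data for n in child.childNodes if n.nodeType == n.TEXT_NODE
            )
    return ""

doc = parseString("<a><title>One</title>text</a>")
link = doc.documentElement

# Mutations a script could perform, leaving the tree in "invalid" states.
second = doc.createElement("title")
second.appendChild(doc.createTextNode("Two"))
link.appendChild(second)              # now two <title> children
both = effective_title(link)

link.removeChild(link.firstChild)     # the original first <title> is gone
after_removal = effective_title(link)

link.removeChild(second)              # no <title> children left at all
none_left = effective_title(link)
```

Whether this is the right rule is a separate argument. What matters is
that two UAs following it would agree in all three states.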


>>> To take an analogous case, we currently have the CAPTION element which
>>> may be used (only) inside a TABLE element and the SUMMARY attribute
>>> that may be used for a TABLE element.
>>
>> Great example. Implementing "summary" in a meaningful way is
>> significantly easier than implementing "caption". By orders of
>> magnitude.
>
> But as I learned in this thread (thanks Anne!), the current draft has
> made <summary> an element, which sounds logical. Are you saying that
> this should be taken back?

No, because in the case of <summary> I would expect the content to be
shown inline, instead of the rest of the table. Much like "alt" should
never have been an attribute.


> (And all browsers implement the "caption" element, though poorly,
> whereas "summary" is virtually unimplemented; there's a mismatch between
> actual browser behavior and the difference in the difficulty of
> implementation that you refer to.)

"summary" is implemented in Mozilla beyond the requirements in HTML4.
Given that HTML4 says the attribute is "for user agents rendering to
non-visual media", it's unclear what you expect desktop UAs to actually
_do_ with it.


>>> I don't see the possibility as extremely rare. Consider a link - a
>>> typical element to which we might wish to assign a TITLE. If the
>>> document where the link appears is in French and the linked document
>>> is in German, for example, it would be very natural to make the
>>> "advisory title" contain the name of the linked document in both
>>> French and in German, in many cases.
>>
>> That is a very rare case.
>
> Is it?

On the global scale? Yes.


>> Like I said. Highly theoretical. :-)
>
> If we regard such issues as negligible, then I think many parts of the
> WAI recommendations should be rewritten. I'm especially thinking about
> the _priority 1_ requirement that all changes in language in a document
> be indicated in markup.

That does seem like a very specious requirement. I would be curious to
know what the practical, real-world reasoning behind that requirement was.


> Or should we read the current (and planned) situation so that authors
> are required to use Unicode language tags inside attribute values if
> there is a single foreign word in any such attribute?

Unicode language tags -- like Unicode BFCs -- will probably be quite
unpopular with experts, especially in a markup context.

How would you mark up mixed languages in text/plain documents?

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
Received on Wednesday, 28 July 2004 08:18:00 UTC