Re: alt element (was: naming custom/extended tags)

From: Arjun Ray <aray@q2.net>
Date: Mon, 14 Feb 2000 14:49:30 -0500 (EST)
To: www-html@w3.org
Message-ID: <Pine.LNX.4.10.10002141223340.32075-100000@mail.q2.net>

On Mon, 14 Feb 2000 JOrendorff@ixl.com wrote:

> 'htign' does one job; 

It's part of a set, actually.  I was trying to cover only the simpler
cases for now.  I suppose I'll have to explain the whole scheme...

> 'alt' would do that job and a little more, and I think it would
> solve a few other minor problems.  The choice between 'htign' and
> 'alt' is, imho, one of complexity:  is 'alt' too complex for UAs
> to implement?

That would depend on whether the spec for ALT were detailed enough to
give an implementor a clear idea of what it calls for - which most of
the time means covering case scenarios comprehensively.

> An <alt> element wouldn't be any more "factitious" than, say,
> <noframes> or the use of <object>'s children as alternate content.

<noframes> is problematic.  In the "Netscape sense", it *is*
factitious and redundant.  In the "Explorer sense", it serves a
definite and arguably useful purpose.  <object> has problems besides
those associated with the alternate content semantic.  (I don't
consider either of these elements particularly useful as presently
"defined", but I'm not sure I would call them factitious per se.
I'll accept a good argument, though:))

> 'alt' would provide a way for the author to tell the UA what to
> use and what to ignore when it finds an unsupported tag.

And therein lie the problems.  At the point where the starttag of an
unsupported element is encountered, how much lookahead should be
planned for in search of its corresponding <alt>?  When there's more
than one such unsupported thingy on the open element stack, how should
the implementor figure out which <alt> "belongs" to which unsupported
element?  Even if he does manage to figure that out without indefinite
lookahead, what if the various alts call for incompatible treatments?
Note that for N open unsupported elements, the N alts can occur in
1.3.5...(2N-1) = (2N)!/(N!*2^N) different configurations.  Are you
sure the rules are robust with respect to this combinatorial maze?
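
As a quick sanity check on that count, here is a small Python sketch. It
is not part of any spec, and the function names are made up for
illustration. It compares the product of the first N odd numbers against
the closed form:

```python
from math import factorial


def odd_product(n):
    """Return 1*3*5*...*(2n-1), the product of the first n odd numbers."""
    result = 1
    for k in range(1, 2 * n, 2):
        result *= k
    return result


def closed_form(n):
    """Return (2n)! / (n! * 2^n), using exact integer arithmetic."""
    return factorial(2 * n) // (factorial(n) * 2 ** n)


for n in range(1, 6):
    # The two expressions should agree for every n.
    assert odd_product(n) == closed_form(n)
    print(n, odd_product(n))
```

For N = 1..5 this gives 1, 3, 15, 105, 945, so even four nested
unsupported elements already leave over a hundred arrangements to
disambiguate.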

Of course, all of this is happening because the <alt> is *separate*,
i.e. away from the critical decision point.  Attributes *on* the
starttag of an unsupported element don't have this problem.

There's the general case to consider: when multiple taxonomies are in
play.  Each taxonomy will need its own alt (because in general it
can't be assumed that one alt will cover all the taxonomies in which
the corresponding element is "unsupported".)  What's better, peppering
the hierarchy with <alts>, or keeping the hierarchy intact with
respect to this problem as a whole and using just attributes?

In fact, the possibility of multiple simultaneous taxonomies brings
the essential role of the element hierarchy into stark relief.  Each
start-tag can be viewed as a decision point (from the pov of the
application farming out work to the various modules it's trying to
coordinate): it makes sense for the decision criteria to be present
right there.  Attributes:)

But there's more.  In the 'namespaces' thread, I provided an example
of markup with multiple taxonomies:

 <myfoos html="table" db="dbView" dbQuery="select * from foo">
   <foo html="tr" db="dbRecord" dbKey="wefe142343">
     <bar html="td" db="dbField" dbCol="bar">bar-blah</bar>
     <baz html="td" db="dbField" dbCol="baz">baz-blah</baz>
     <blort html="td" db="dbField" dbCol="blort">blort-blah</blort>
   <!-- more <foo>s here -->

The pure 'decision point' perspective leads to this symmetric and
completely generic formulation

 <X myfoo="myfoos" html="table" db="dbView" 
        dbQuery="select * from foo">
   <Y myfoo="foo" html="tr" db="dbRecord" dbKey="wefe142343">
     <Z myfoo="bar" html="td" db="dbField" dbCol="bar">bar-blah</Z>
     <Z myfoo="baz" html="td" db="dbField" dbCol="baz">baz-blah</Z>
     <Z myfoo="blort" html="td" db="dbField" dbCol="blort">blort-blah</Z>
     <!-- more <Y>s here -->

The lesson is that, with multiple taxonomies, the generic identifier
(i.e. the "tagname") is of very little consequence.  That particular
name is no more than the value of a "reserved" attribute.  Turning
this around, since a generic identifier is *morphologically* the value
of a (reserved and minimized) attribute, the symmetric and general
treatment of multiple simultaneous taxonomies involves schema-specific
generic identifiers always coming from *some* attribute.  Moving any
one such value to the GI position (after the '<') is a strictly
tactical decision, governed in the main by how much explicit markup we
might save through judicious application of defaults.  *Processing*
the markup, however, should be independent of this (because that's
what a parser is for: to fill in those defaults), triggering on
attributes only.
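
To make the attribute-driven view concrete, here is a minimal Python
sketch. It assumes the generic X/Y/Z markup above has been closed off
into a well-formed document. The helper project() is hypothetical: it is
not taken from SP, XAF, or any architectural forms engine. It simply
names each element after the value of one chosen attribute and never
looks at the generic identifier. As a simplification it drops all other
attributes and discards the text of elements it skips, which a real
processor would handle more carefully.

```python
import xml.etree.ElementTree as ET

# The generic formulation from the message, with end tags added so that
# it parses as a complete document (an assumption for this sketch).
SOURCE = """
<X myfoo="myfoos" html="table" db="dbView" dbQuery="select * from foo">
  <Y myfoo="foo" html="tr" db="dbRecord" dbKey="wefe142343">
    <Z myfoo="bar" html="td" db="dbField" dbCol="bar">bar-blah</Z>
    <Z myfoo="baz" html="td" db="dbField" dbCol="baz">baz-blah</Z>
    <Z myfoo="blort" html="td" db="dbField" dbCol="blort">blort-blah</Z>
  </Y>
</X>
""".strip()


def project(element, taxonomy):
    """Return a list of new elements named by the `taxonomy` attribute.

    An element without that attribute plays no role in the taxonomy:
    it is skipped and its projected children are hoisted to the caller.
    """
    children = []
    for child in element:
        children.extend(project(child, taxonomy))

    name = element.get(taxonomy)
    if name is None:
        return children

    node = ET.Element(name)      # the attribute value becomes the tag name
    node.text = element.text
    node.tail = element.tail
    node.extend(children)
    return [node]


root = ET.fromstring(SOURCE)
html_view = project(root, "html")[0]   # table / tr / td
db_view = project(root, "db")[0]       # dbView / dbRecord / dbField
print(ET.tostring(html_view, encoding="unicode"))
print(ET.tostring(db_view, encoding="unicode"))
```

The same function serves every taxonomy; only the attribute name passed
in changes, which is the point of treating each start-tag as a decision
point.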

Attribute-based processing is the "great secret" of generalized
markup.  One such paradigm is architectural forms.  It works with a
set of "control attributes" (of which my 'html' and 'htign' were only
a suggestive subset):


The important ones, from the perspective of directing the extraction
of the content relevant to any particular taxonomy, are those that
typically get used in starttags (or defaulted in ATTLIST declarations
in the DTD):

            ArcFormA  NAME    "html"
            ArcNamrA  NAME    "htnames"
            ArcSuprA  NAME    "htsupp"
            ArcIgnDA  NAME    "htign"
            ArcAuto   (ArcAuto|nArcAuto)  ArcAuto

(That is, there are five "axes" of control in the general case by
which the "pruning" of a tree can be specified precisely.)

This is a standardized technique, no more difficult in the general
case to use than any particular case, implemented in James Clark's SP
parser since 1996, and supported by the XAF package (in Java) for
SAX-based XML parsers.

It's there.  It's general.  It works.

But you won't hear about it from the W3C, because they have some
dear-to-the-heart half-baked nostrums they'd rather peddle instead.

> If the server has a profile of the client's capabilities, <alt>
> can be processed server-side (whether this is even remotely
> desirable is questionable of course.)

Something like this rates to be the norm for the foreseeable future.
Netploder get a clue?  Don't hold your breath:)

Received on Monday, 14 February 2000 14:27:49 GMT
