Re: "Outline" algorithm from James Graham on 2009-05-26 (public-html@w3.org from May 2009)

From: James Graham <jgraham@opera.com>
Date: Tue, 26 May 2009 11:08:58 +0200
To: Larry Masinter <masinter@adobe.com>
CC: HTML WG <public-html@w3.org>
Message-ID: <4A1BB1AA.50404@opera.com>

Larry Masinter wrote:
> I picked another page of the HTML spec at random to investigate the question
> 
> Next pick was across Section 4.1.11.1 "Creating an outline"
> 
> which contains a lengthy description of an algorithm for computing
> an "outline".  Now, is there any precedent, common practice, or
> "cowpath" for outlines? Is there any reason why authors would care
> exactly about the computation of the "outline"?

The outline is just a description of how the document is structured. I 
would find it much more surprising if the kind of conscientious author 
who cares about specs in general wouldn't care about the logical 
structure of their document.

For specific examples of tools that create outlines, consider the anolis
spec generator [1], MediaWiki [2] (both add tables of contents based on
outline structure), the Firefox Headings Map plugin [3] (creates a
document outline in a sidebar for navigation of long pages) or many
screen readers (to provide navigation; [4] suggests this is very popular
with users). It would be very surprising if these tools all gave
different results on the same input, and it could be harmful, e.g. in
the case where a sighted user makes incorrect inferences about how their
document will be perceived by a screen reader user.
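
To make concrete what such tools compute, here is a minimal sketch in
Python of a table-of-contents builder. It is deliberately simplified
and is not the algorithm from the spec: it nests headings purely by
their h1-h6 rank and ignores sectioning elements such as section and
article, which the spec's algorithm also takes into account. The names
HeadingCollector and build_outline are made up for illustration.

```python
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Collects (rank, text) pairs for h1-h6 elements in document order."""

    def __init__(self):
        super().__init__()
        self.headings = []
        self._rank = None   # rank of the heading currently open, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self._rank = int(tag[1])
            self._text = []

    def handle_data(self, data):
        if self._rank is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if self._rank is not None and tag == f"h{self._rank}":
            self.headings.append((self._rank, "".join(self._text).strip()))
            self._rank = None


def build_outline(html):
    """Nest headings by rank only (an assumption, not the spec's rules)."""
    collector = HeadingCollector()
    collector.feed(html)
    collector.close()

    root = []
    stack = [(0, root)]  # (rank, list that receives child sections)
    for rank, text in collector.headings:
        # Close every open section of the same or a deeper rank.
        while stack[-1][0] >= rank:
            stack.pop()
        node = {"title": text, "children": []}
        stack[-1][1].append(node)
        stack.append((rank, node["children"]))
    return root
```

Two tools that disagree on even this much (say, on whether an h3
directly after an h1 opens a nested section) would show users
different tables of contents for the same page.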

> As far as I can tell, there are 41 occurrences of the word "outline"
> in the spec, and the only reference to this algorithm is in section 4.8.1
> on determining a caption for an image, which says:
> 
> # 3. Run the algorithm to create the outline (page 192) for the document.
> # 4. If the img element did not end up associated with a heading in the outline, or if
> #  there are any other images that are lacking an alt attribute and that are
> #  associated with the same heading in the outline as the img element in question,
> #  then there is no caption information; abort these steps.
> 
> which I can't tell if there is any reason why HTML should contain
> such a thing, that it would correspond to common practice, or any
> requirement listed in the "design principles".

It is not clear to me why you are assuming that HTML should only specify 
things that are useful to other parts of the spec rather than also 
specifying things that are useful to consumers of HTML.

However, if you want a concrete spec-wise reason for having this
definition, consider the possibility of adding a heading-level
pseudo-class to CSS (so e.g. :heading-level(1) would select all 
top-level headings). With a well defined outlining algorithm in HTML 
this is a relatively simple addition. Otherwise it is essentially 
impossible. For a demonstration of the value of this, consider a "river
of news" style feed aggregator that pulls content from different sources,
each using a different heading-numbering convention. At present, any such
aggregator has to rewrite all the heading numbers in the aggregated
content to fit in with the style of the aggregator, which makes the
internal filtering logic depend on the output template. With
such an addition to CSS it would be trivial to style the content 
correctly on the client side.
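
As a rough illustration of the rewriting step an aggregator needs
today, here is a minimal sketch in Python. The function name
shift_headings and the base_level parameter are hypothetical, and the
regex-based approach is an assumption made for brevity; a real
aggregator would more likely work on a parsed DOM.

```python
import re

# Matches the start of an opening or closing h1-h6 tag, e.g. "<h2" or "</H2".
_HEADING_TAG = re.compile(r"<(/?)h([1-6])\b", re.IGNORECASE)


def shift_headings(fragment: str, base_level: int) -> str:
    """Renumber headings so the fragment's top rank becomes base_level."""
    levels = [int(m.group(2)) for m in _HEADING_TAG.finditer(fragment)]
    if not levels:
        return fragment
    offset = base_level - min(levels)

    def renumber(match):
        # Clamp to the 1-6 range that HTML heading elements allow.
        new_level = min(6, max(1, int(match.group(2)) + offset))
        return f"<{match.group(1)}h{new_level}"

    return _HEADING_TAG.sub(renumber, fragment)
```

Note that the caller has to know base_level, i.e. where the output
template will place the fragment; that is exactly the coupling between
filtering logic and template described above.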

> So the whole section doesn't seem to make a lot of sense. A lot
> of normative requirements for a complex implementation for computing
> an outline with no applications?

It seems rather impolite to proclaim that there are no applications of 
something just because you cannot think of any.

[1] http://anolis.gsnedders.com/
[2] http://www.mediawiki.org/wiki/MediaWiki
[3] https://addons.mozilla.org/en-US/firefox/addon/7203
[4] http://www.webaim.org/projects/screenreadersurvey/#headings
Received on Tuesday, 26 May 2009 09:09:52 UTC