Miscellaneous buglets

A few bugs I've noticed over the last couple of days.

1) Some input documents cause Amaya to put two TITLE elements in the tree
(only one is allowed). This happens when the <head> tag is omitted (it's
optional according to the HTML4 DTDs). Example:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
                      "http://www.w3.org/TR/REC-html401/strict.dtd">
<TITLE>Test 4</TITLE>

(Yes, this is part of an embryonic HTML test suite.)

2) The now-famous http://www.indigo.ie/egt/ still suffers from body-within-body
even though Amaya now marks the inner <body> tag in my earlier example as
invalid.

Bugs (1) and (2) seem to beg for html2thot.c to be restructured and made more
regular. I've been thinking about possible ways of doing this. I'm leaning
towards a table-driven way of checking which elements can be children of which
other elements and of deciding how to correct errors in the input. Sometimes
the child tag should be marked as invalid; in other cases the parent element
should be closed (and perhaps a similar one scheduled for insertion immediately
within the child); in yet other cases an element needs to be interposed (e.g.,
<TR> between a <TBODY> and a <TD>, or <HEAD> between <HTML> and <TITLE>). It
would be nice for some of this behaviour to be user-configurable. And in any
case, error messages should be generated for illegal input constructs even if
html2thot can turn them into something sensible.
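
To make the table-driven idea a bit more concrete, here is a minimal sketch in
C. Everything in it is an assumption made for illustration: the names
FixAction, ContentRule, rules and find_rule are hypothetical and do not exist
in html2thot.c, the table lists only a few parent/child pairs that need a
repair (a real table would be derived from the DTD), and tag names are assumed
to have been normalised to upper case by the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Possible repairs when a child cannot appear directly within a parent. */
typedef enum {
    FIX_MARK_INVALID,  /* keep the child tag but flag it as invalid */
    FIX_CLOSE_PARENT,  /* close the parent, then retry the child */
    FIX_INTERPOSE      /* insert an intermediate element first */
} FixAction;

/* One row of the (hypothetical) repair table. */
typedef struct {
    const char *parent;
    const char *child;
    FixAction   action;
    const char *interpose;  /* element to insert; FIX_INTERPOSE only */
    const char *message;    /* error text, or NULL if the input is legal */
} ContentRule;

static const ContentRule rules[] = {
    /* <head> may be omitted, so this repair is silent. */
    { "HTML",  "TITLE", FIX_INTERPOSE,    "HEAD", NULL },
    { "TBODY", "TD",    FIX_INTERPOSE,    "TR",   "TD not within TR" },
    { "BODY",  "BODY",  FIX_MARK_INVALID, NULL,   "BODY within BODY" },
    /* </p> is optional, so closing the parent is silent too. */
    { "P",     "P",     FIX_CLOSE_PARENT, NULL,   NULL }
};

/* Return the repair rule for this parent/child pair, or NULL if the
   table has no entry (in this sketch: accept the child as it stands). */
static const ContentRule *find_rule(const char *parent, const char *child)
{
    size_t i;

    assert(parent != NULL && child != NULL);
    for (i = 0; i < sizeof rules / sizeof rules[0]; i++) {
        if (strcmp(rules[i].parent, parent) == 0 &&
            strcmp(rules[i].child, child) == 0)
            return &rules[i];
    }
    return NULL;
}
```

With something like this, the parser would call find_rule() for each start
tag, apply the action, and emit the message when it is not NULL. Making the
behaviour user-configurable would then amount to loading or overriding rows
of the table.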

3) Amaya allows <HR> immediately within <OL> and friends (where only <LI>
elements are allowed), but does not allow <FORM> immediately within <TR>
(also forbidden by the DTD but common in the real world where tables are
used for layout). I'm not sure which of the two is more annoying: it depends
on what I'm trying to do. I see this as one more argument for making the
strictness of syntax enforcement user-configurable.

4) Amaya seems to want to put <TFOOT> after <TBODY>, whereas the HTML spec
says it should come before the first <TBODY> (for the sake of incremental
rendering).

5) Speaking of which, incremental rendering causes far too many refreshes,
apparently even when the only parts of the document that are being rearranged
lie entirely outside the current viewport.

6) When editing a document, the horizontal scrollbar in the Structure view
doesn't pick up increases in the total width of the view. Such increases
occur easily when typing in a longish paragraph, since the paragraph has no
line breaks in that view.

7) Amaya doesn't accept some perfectly legal values of the LANG attribute.
Try <ABBR LANG="la" TITLE="exempli gratia">e.g.</ABBR> for an example...
I think this one is easy to fix, but haven't done so yet.
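
For reference, HTML 4 defines LANG values as RFC 1766 language tags: a primary
tag of 1 to 8 letters, optionally followed by hyphen-separated subtags of 1 to
8 letters each. A purely syntactic check could look like the sketch below. The
function name is_rfc1766_tag is hypothetical, and I am not claiming this is
how Amaya validates the attribute at present.

```c
#include <ctype.h>
#include <stddef.h>

/* Return 1 if s is syntactically an RFC 1766 language tag, else 0.
   Only the syntax is checked; the tag is not looked up in any registry. */
static int is_rfc1766_tag(const char *s)
{
    int len = 0;  /* letters seen in the current (sub)tag */

    if (s == NULL || *s == '\0')
        return 0;
    for (; *s != '\0'; s++) {
        if (*s == '-') {
            if (len == 0)       /* empty primary tag or subtag */
                return 0;
            len = 0;
        } else if (isalpha((unsigned char)*s)) {
            if (++len > 8)      /* each (sub)tag is at most 8 letters */
                return 0;
        } else {
            return 0;           /* only letters and hyphens are allowed */
        }
    }
    return len > 0;             /* reject a trailing hyphen */
}
```

Under this check "la", "en-US" and "x-klingon" are all accepted, while an
empty string or a tag with a trailing hyphen is rejected.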

8) The ancillary program "print" is prone to segmentation faults if the .PIV
file given as the main argument on the command line is missing, at least on
X11-based platforms; the MS Windows version is probably immune. I think it's
because "print" tries to post a message back to the Amaya window but TtDisplay
hasn't been initialised yet.
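
If that diagnosis is right, one way around it would be to check the input file
before anything touches the display, and to report a failure on stderr rather
than through the Amaya window. The sketch below only illustrates the idea:
check_input_file is a hypothetical helper, not a function in the "print"
sources, and I have not verified where such a check would fit in the actual
start-up code.

```c
#include <stdio.h>

/* Return 0 if path can be opened for reading, -1 otherwise.
   Failures are reported on stderr, so no display connection is needed. */
static int check_input_file(const char *path)
{
    FILE *f;

    if (path == NULL) {
        fprintf(stderr, "print: no input file given\n");
        return -1;
    }
    f = fopen(path, "rb");
    if (f == NULL) {
        perror(path);  /* prints the path and the system error text */
        return -1;
    }
    fclose(f);
    return 0;
}
```

The caller would exit with a non-zero status when this returns -1, before
attempting to post anything back to Amaya.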

9) I've found it difficult to start a %phrase markup immediately within a
new TD element. If I type in a space, then delete it, things seem to work
better.

Received on Thursday, 6 January 2000 05:18:00 UTC