silent recovery from errors considered harmful

I'm using Amaya more and more, so I find it increasingly
difficult to accept claims such as:

[[[
II.7. What is the reason for a very liberal structure enforcement?

Amaya has to cope with existing Web pages [...]

]]]

-- Amaya FAQ
http://www.w3.org/Amaya/User/FAQ.html#What
Wed, 12 Jul 2000 10:02:45 GMT


It's clear that Amaya doesn't cope with all
existing Web pages. It crashes more than occasionally.
That's completely understandable; I don't encourage
spending development effort trying to make sense
of erroneous Web pages.

But what I'm really concerned about is:

"[...] We [...] decided that
Amaya should try to fix bugs, but without losing
information."

Clearly that's impossible. You're losing information
about the former (possibly illegal/unsupported) syntax
of the document.

It's perhaps useful to attempt certain huristic fixes,
but not without the informed consent of the user.

I would suggest:

(a) When you load a document, use an XML processor.
If you detect a well-formedness error or if you
detect an element/attribute structure problem:

	(a1) display some "bad markup" icon in the UI

	(a2) call tidy to clean up the document.

	(a3) start over with the results from tidy,
	keeping track of the fact that you've tidied the input

	(be careful not to loop; the tidied output
	might still not be acceptable to Amaya)

(b) if the user starts to edit (i.e. when you set
the "dirty" flag and, e.g. change the save icon
from grey to active), if you're working with tidied
input, prompt the user with a modal dialog ala:

	This document had markup errors. Amaya has attempted
	to correct the errors, but there is no guarantee
	that the corrections are as you intended. You may
	want to review the source of the document.

		<Cancel edit>    <OK>

This approach has two benefits:
(1) users are informed when Amaya changes their documents
(2) Amaya developers can stop worrying about buggy
documents, and leave all error recovery issues to tidy.

If you want to deal with valid HTML 4.0 documents
(i.e. valid SGML documents) in (a), very well, but
there are so few of them them that frankly, I wouldn't
bother if I were you. But since Amaya itself can
produce valid HTML 4.0, perhaps it's worth supporting
for a time.


Also, regarding integrity,
please re-consider my earlier request to limit
(to a few minutes or so) the amount of my work that
Amaya can lose by crashing

auto-save periodically, not just at crash time
Dan Connolly (Fri, Mar 10 2000) 
http://lists.w3.org/Archives/Public/www-amaya/2000JanMar/0282.html

-- 
Dan Connolly, W3C http://www.w3.org/People/Connolly/

Received on Monday, 28 August 2000 15:44:47 UTC