Re: What problem is this task force trying to solve and why? from John Cowan on 2010-12-23 (public-html-xml@w3.org from December 2010)

From: John Cowan <cowan@mercury.ccil.org>
Date: Thu, 23 Dec 2010 01:02:23 -0500
To: Kurt Cagle <kurt.cagle@gmail.com>
Cc: David Carlisle <davidc@nag.co.uk>, public-html-xml@w3.org
Message-ID: <20101223060223.GC9191@mercury.ccil.org>

Kurt Cagle scripsit:

> Consider, for instance, the characteristics of a hypothetical lax XML parser

Yeeks.  What you are doing here, AFAICS, is trying to design a kludge.
By comparison, HTML parsing is an *evolved* kludge: it got to be the
way it is as a result of natural selection (more or less).  The trouble
with designing a kludge is, why this particular kludge and not one of any
number of possible closely related kludges?  For the normal application
of kludges as one-offs, this doesn't matter, but redesigning XML parsing
is anything but a one-off.

> As the parser works through these cases, it assigns a weight that
> indicates the likelihood that a given heuristic rule determines the
> correct configuration.

Based on what?  To do this in a sound way, you'd have to have a lot of
information about broken XML and what the creator *meant* to express
by it.  I don't know of any source of that information.  Otherwise you are
not truly doing heuristics, but just guessing a priori about which kinds of
error-generating processes are more important and which are less so.

-- 
In my last lifetime,                            John Cowan
I believed in reincarnation;                    http://www.ccil.org/~cowan
in this lifetime,                               cowan@ccil.org
I don't.  --Thiagi

Received on Thursday, 23 December 2010 06:02:54 UTC