Re: HTML and XML from Paul Prescod on 2009-02-17 (www-tag@w3.org from February 2009)

From: Paul Prescod <paul@prescod.net>
Date: Tue, 17 Feb 2009 08:52:00 -0800
To: Bijan Parsia <bparsia@cs.man.ac.uk>
Cc: elharo@metalab.unc.edu, www-tag@w3.org
Message-ID: <1cb725390902170852m16dbb360ve0b48326244d683f@mail.gmail.com>

On Tue, Feb 17, 2009 at 2:07 AM, Bijan Parsia <bparsia@cs.man.ac.uk> wrote:

> This is evidence that you are still confused about the matter under
> discussion. Just as an application that depends on its input conforming to a
> certain schema cannot meekly accept arbitrary well-formed input, so too an
> application which depends on the data being well-formed (but, perhaps,
> described in prose) cannot blindly accept arbitrary byte streams as input
> (or at least as input on par with well-formed input). But many classes of
> application (and of user) need to have more elaborate handling of various
> classes of error. One way to improve the situation of such applications when
> using XML is to define the data model that results from parsing, as XML,
> arbitrary input streams.

It seems to me that this is a difficult task. It does not require AI, but it
is a lot of work for both the specifier and the programmer. Recall that a lot
of the complexity stripped from SGML was of this sort: "If this tag is missing
then interpret it as this, if that one is missing then interpret it as that."
And SGML never got close to parsing arbitrary input streams (although it did
have various other features which XML does not attempt).

I think it is a valuable thing to have such a specification, if someone is
ambitious enough to actually write it. Whether or not it becomes the default
parsing mode for web browsers, it could be used by various XML cleanup
tools.

So why not write the specification and THEN propose it as a standard? At
that point we would be able to talk less in terms of abstract philosophy and
more in terms of "In this case, such and such would happen". Given the
details of the specification it would be obvious that the AppleScript
analogy is off-base, and perhaps the SGML analogy would be proven invalid as
well.

Paul Prescod

Received on Tuesday, 17 February 2009 16:53:47 UTC