Re: Parsing methods

Lee Daniel Crocker (lcrocker@calweb.com)
Wed, 10 Jul 1996 15:10:10 -0700 (PDT)


Message-Id: <199607102210.PAA21069@web1.calweb.com>
Subject: Re: Parsing methods
To: www-html@w3.org
Date: Wed, 10 Jul 1996 15:10:10 -0700 (PDT)
From: "Lee Daniel Crocker" <lcrocker@calweb.com>
In-Reply-To: <m0ue75o-0002URC@beach.w3.org> from "Daniel W. Connolly" at Jul 10, 96 05:46:44 pm

> >IE: should the parser see
> >	<hello%^ myname=foo>
> >as a TAG that was messed up........
> >	OR
> >as plain text?
> >
> >i say as a messed up tag.....
> 
> And you'd be right.
> 
> If you want to be sure, check with a validating SGML parser.

As much as I would like to see producers use validation, and
as useful as general-purpose SGML is to unambiguous communication
of structured information, I must express a fundamental
disagreement with Dan and some other SGML-heads on how to handle
"invalid" SGML.

I think that a human-written text-based format like SGML _should
not have errors_, period.  I.e., the language should be a way
to interpret whatever the hell the writer throws at you.  If it
is clear and unambiguous HTML, great--interpret it that way and
go on.  If not, I believe a reader should try to be flexible,
and in most cases, just print questionable markup as is.  It is
far more useful for a reader to see something like &emdas; on
his screen when the author meant &emdash; than to see some
meaningless error.  Likewise, a parser should have some rules for
dealing with markup like <.. width=50%> rather than throw up its
hands and refuse to parse it.
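
To make the idea concrete, here is a minimal sketch in Python of a
reader that never reports an error.  It is only an illustration under
stated assumptions, not how any real browser or SGML parser works: the
names render_lenient, KNOWN_ENTITIES, ENTITY_RE and TAG_RE are made up
for this example, the entity table is deliberately tiny, and "looks
like a tag" is reduced to "has a plain alphanumeric name".

```python
import re

# Hypothetical, deliberately tiny entity table; a real reader would know many more.
KNOWN_ENTITIES = {"amp": "&", "lt": "<", "gt": ">", "quot": '"'}

# Anything shaped like &name; is a candidate entity reference.
ENTITY_RE = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")

# Assumption for this sketch: a tag is "clean" if its name is plain
# alphanumeric. Attributes are not checked at all.
TAG_RE = re.compile(r"</?[A-Za-z][A-Za-z0-9]*(\s[^<>]*)?>")


def render_lenient(source):
    """Return the text a reader would see, without ever raising an error."""
    # Clean-looking tags are interpreted (here: simply dropped).
    # Questionable markup does not match TAG_RE and so stays visible as is.
    without_tags = TAG_RE.sub("", source)

    # Known entities are expanded; unknown ones are printed exactly as written.
    def expand(match):
        return KNOWN_ENTITIES.get(match.group(1), match.group(0))

    return ENTITY_RE.sub(expand, without_tags)


if __name__ == "__main__":
    print(render_lenient("<p>a dash &emdas; and 1 &lt; 2</p>"))
    print(render_lenient("<.. width=50%>some text"))
```

With these rules the misspelled &emdas; and the odd <.. width=50%>
both reach the screen unchanged, so the reader can at least guess
what the author meant.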

If that means two separate sets of rules for readers and writers,
then so be it.