Re: Parsing methods (fwd)

Daniel W. Connolly
Wed, 10 Jul 1996 17:43:06 -0400

To: Lee Daniel Crocker <>
Subject: Re: Parsing methods (fwd) 
In-reply-to: Your message of "Wed, 10 Jul 1996 12:22:34 MST."
Date: Wed, 10 Jul 1996 17:43:06 -0400
From: "Daniel W. Connolly" <>

In message <>, "Lee Daniel Crocker" writes
>> Well, a simple algorithm to do this: Once you have found a "<"
>> character, the name of the element is everything up to the first whitespace
>> character or the ">" character. If you hit whitespace, you've got
>> attributes coming.
>Simple, but not quite correct.  Don't forget that you have to
>check for <!, and attribute names have a very limited character
>set-- <tag-name2> is a tag, but <fake,tag*> is not, and should
>just be printed as plain text.

<fake,tag*> is an error. Printing it as plain text is not the
best service you can provide the user, and certainly not the author.

This is documented at:


The following examples are errors: 

<xyz!> <abc/>
</xxx/> <xyz&def> <abc_def>
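
To make the distinction concrete, here is a rough sketch (not taken from
the draft) of the "read up to whitespace or >" approach with a name check
added, so that malformed tags are reported as errors rather than passed
through as text. The function name classify and the NAME pattern are
hypothetical; the pattern assumes a name is a letter followed by letters,
digits, "." or "-", which should be checked against the draft's SGML
declaration before relying on it.

```python
import re

# Assumption: a name is a letter followed by letters, digits, "." or "-".
# Verify this against the SGML declaration actually in use.
NAME = re.compile(r"[A-Za-z][A-Za-z0-9.-]*\Z")


def classify(markup):
    """Classify a string that begins with '<'.

    Returns ("declaration", None) for "<!...",
    ("tag", name) when the name is well formed,
    and ("error", raw_name) otherwise.
    """
    if not markup.startswith("<"):
        raise ValueError("expected markup starting with '<'")
    if markup.startswith("<!"):
        # Markup declarations and comments need their own handling.
        return ("declaration", None)
    body = markup[1:]
    if body.startswith("/"):
        # End tag: skip the slash before reading the name.
        body = body[1:]
    # The candidate name is everything up to the first whitespace or '>'.
    raw = re.match(r"[^\s>]*", body).group(0)
    if NAME.match(raw):
        return ("tag", raw)
    return ("error", raw)


if __name__ == "__main__":
    for sample in ["<tag-name2>", "<fake,tag*>", "<xyz!>", "<abc/>",
                   "</xxx/>", "<xyz&def>", "<abc_def>"]:
        print(sample, classify(sample))
```

Under that assumed name pattern, <tag-name2> comes back as a tag and the
five examples above, along with <fake,tag*>, come back as errors. What to
do with an error (warn the author, recover, skip) is left to the caller.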

>Check out a good SGML reference.

Good advice. The above draft isn't an SGML reference per se,
but SGML references aren't freely available. I worked hard
to be sure that the above draft matches the SGML specs.
Evidence to the contrary (i.e. a bug report) is always welcome!

>  Never rely on browsers--
>especially ones as broken as Netscape--to tell you what valid
>HTML looks like.

And never rely on folks that don't cite sources ;-)