Re: Parsing methods (fwd)

In message <199607101922.MAA27470@web1.calweb.com>, "Lee Daniel Crocker" writes:
>> Well, a simple algorithm to do this: Once you have found a "<"
>> character, the name of the element is everything up to the first whitespace
>> character or the ">" character. If you hit whitespace, you've got
>> attributes coming.
>
>Simple, but not quite correct.  Don't forget that you have to
>check for <!, and attribute names have a very limited character
>set-- <tag-name2> is a tag, but <fake,tag*> is not, and should
>just be printed as plain text.

<fake,tag*> is an error. Printing it as plain text is not the
best service you can provide the user, and certainly not the author.

This is documented at:

================
http://www.w3.org/pub/WWW/TR/WD-sgml-lex/

The following examples are errors: 

<xyz!> <abc/>
</xxx/> <xyz&def> <abc_def>
================
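
To make the distinction concrete, here is a rough sketch in Python of a
check that flags such cases as errors rather than passing them through as
text. It is not taken from the draft. It assumes names follow the SGML
reference concrete syntax as HTML uses it (a letter, then letters, digits,
"." or "-"), and the function name classify and its three return values
are made up for illustration. Attributes, comments and "<!" declarations
are not examined.

```python
import re

# Assumption: a name is a letter followed by letters, digits, "." or "-".
NAME = re.compile(r"[A-Za-z][A-Za-z0-9.-]*")


def classify(markup):
    """Classify a string of the form "<...>" (hypothetical helper).

    Returns "tag" for a start or end tag whose name is well formed,
    "error" for something that begins like a tag but has a bad name,
    and "other" for anything this sketch does not handle.
    """
    if not (markup.startswith("<") and markup.endswith(">")):
        return "other"
    body = markup[1:-1]
    if body.startswith("/"):
        body = body[1:]  # end tag: the name follows the slash
    match = NAME.match(body)
    if match is None:
        return "other"  # e.g. "<!" declarations, left to other code
    rest = body[match.end():]
    # After the name, only whitespace (attributes follow) or the end is OK.
    if rest == "" or rest[0].isspace():
        return "tag"
    return "error"


for sample in ["<tag-name2>", "<fake,tag*>", "<xyz!>", "<abc/>",
               "</xxx/>", "<xyz&def>", "<abc_def>"]:
    print(sample, classify(sample))
```

With this check, <tag-name2> comes back as "tag", while <fake,tag*> and
all five examples quoted above come back as "error".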


>Check out a good SGML reference.

Good advice. The above draft isn't an SGML reference per se,
but SGML references aren't freely available. I worked hard
to be sure that the above draft matches the SGML specs.
Evidence to the contrary (i.e. a bug report) is always welcome!

>  Never rely on browsers--
>especially ones as broken as Netscape--to tell you what valid
>HTML looks like.

And never rely on folks that don't cite sources ;-)

Dan

Received on Wednesday, 10 July 1996 17:43:00 UTC