RE: Parsing methods -Reply from Erik Aronesty on 1996-07-10 (www-html@w3.org from July 1996)

From: Erik Aronesty <earonesty@montgomery.com>
Date: Wed, 10 Jul 1996 13:20:49 -0700
To: "'Jim Taylor'" <JHTaylor@videodiscovery.com>
Cc: "'www-html@w3.org'" <www-html@w3.org>
Message-ID: <c=US%a=_%p=Montgomery%l=EXCHANGE_SERVER-960710202049Z-26@sf-exch-2.montgomery.c>

the character entities should be handled at the "read next character"
level, so the "tag parser" shouldn't have to worry about them.
i forgot about the rule that a name must start with a letter, but reading
up to the next whitespace is still better than letting a % screw up the parse.
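
A minimal sketch of what handling references at the "read next character" level could look like, in Python. The names read_chars and NAMED are made up for this illustration, the entity table is deliberately tiny, and treating the trailing ";" as optional is an assumption. Each character comes back with a flag saying whether it was decoded from a reference, so a tag parser built on top can still tell a literal "<" from a decoded "&lt;".

```python
import re

# Tiny illustrative table; real HTML defines many more named entities.
NAMED = {"lt": "<", "gt": ">", "amp": "&", "quot": '"', "iacute": "\u00ed"}

# "&#34;" style numeric references or "&iacute;" style named ones.
# At most six digits keeps chr() inside the valid code point range.
_REF = re.compile(r"&(?:#([0-9]{1,6})|([A-Za-z][A-Za-z0-9]*));?")

def read_chars(text):
    """Yield (char, from_reference) pairs, decoding references on the fly."""
    i = 0
    while i < len(text):
        m = _REF.match(text, i) if text[i] == "&" else None
        if m:
            if m.group(1) is not None:
                decoded = chr(int(m.group(1)))
            else:
                decoded = NAMED.get(m.group(2))  # None if the name is unknown
            if decoded is not None:
                yield decoded, True
                i = m.end()
                continue
        # Ordinary character, or an "&" that does not start a known reference.
        yield text[i], False
        i += 1
```

With this reader, "".join(c for c, _ in read_chars("attr1=&#97;&#98;&#99;")) gives "attr1=abc", which matches the equivalence Jim describes in the quoted message.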

e.g., should the parser see
	<hello%^ myname=foo>
as a TAG that was messed up, or as plain text?

i say as a messed-up tag.
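
One way to combine the two positions, sketched in Python under some assumptions: a "<" only opens a tag when a letter follows it (Jim's rule, so "3<4" stays text), and after that the name is read up to whitespace or ">" and then validated, so a stray % yields a messed-up tag, not text. The function name classify_open and the three labels are invented for this sketch, and it only looks at start tags (no "</", comments, or declarations).

```python
import re

# Letter first, then letters, digits, periods, or hyphens.
NAME_OK = re.compile(r"[A-Za-z][A-Za-z0-9.-]*\Z")

def classify_open(text, lt):
    """Classify the "<" at index lt as 'text', 'tag', or 'malformed-tag'."""
    nxt = text[lt + 1:lt + 2]
    if not re.match(r"[A-Za-z]", nxt):
        return "text"  # e.g. "3<4": not a tag at all, pass it through
    # Read the candidate name up to whitespace or ">".
    end = lt + 1
    while end < len(text) and not text[end].isspace() and text[end] != ">":
        end += 1
    name = text[lt + 1:end]
    return "tag" if NAME_OK.match(name) else "malformed-tag"
```

Here classify_open("<hello%^ myname=foo>", 0) returns "malformed-tag", while classify_open("3<4 but 4>2", 1) returns "text".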

>----------
>From: 	Jim Taylor[SMTP:JHTaylor@videodiscovery.com]
>Sent: 	Wednesday, July 10, 1996 4:44 PM
>To: 	www-html@w3.org
>Subject: 	Re: Parsing methods -Reply
>
>>>> Arnoud "Galactus" Engelfriet <galactus@stack.urc.tue.nl> 07/10/96 10:41am >>>
>>In article <v0300780eae0923bca181@[205.149.180.135]>,
>>Walter Ian Kaye <boo@best.com> wrote:
>> straightforward -- what I'm looking for is how to parse the contents of a
>> tag: <ELEMENT attr1=abc attr2="def ghi" attr3="jkl" attr4=mno>.
>
>>Well, a simple algorithm to do this: Once you have found a "<"
>>character, the name of the element is everything up to the first
>>whitespace character or the ">" character. If you hit whitespace,
>>you've got attributes coming.
>
>Close but no cigar. Element names must begin with a letter and be
>followed by letters, digits, periods, or hyphens. Just looking for
>whitespace is a bad thing. In other words, if I have text that reads
>"3<4 but 4>2" the parser should pass it through unmodified, because "4"
>is not a valid element name.
>
>Also, information inside the <> is "parsed character data," meaning all
>character references ("&#34;", "&iacute;", etc.) should be decoded. For
>example, a tag such as <ELEMENT attr1=&#97;&#98;&#99;> is equivalent
>to <ELEMENT attr1=abc>.
>
>There are other things to watch out for. It's not as "straightforward"
>as you might hope, and probably no browser other than Arena/Amaya does
>it all right.
>
>______________________________________________
>Jim "The Frog" Taylor, Director of Information Technology
><mailto:jhtaylor@videodiscovery.com>
>Videodiscovery, Inc. - Multimedia Education for Science and Math
>Seattle, WA, 206-285-5400 <http://www.videodiscovery.com/vdyweb>
>
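
For the tag in Walter's original question, here is a rough Python sketch of the whole thing: validate the element name, then collect name=value attributes (quoted or unquoted) and decode character references in the values. parse_start_tag and ATTR are names invented for this example. It leans on Python's html.unescape for decoding, which follows HTML5 rules and is only a stand-in here, and it ignores minimized attributes and many other corner cases.

```python
import html
import re

# name = "double quoted" | 'single quoted' | unquoted (up to whitespace or ">")
ATTR = re.compile(
    r"""\s+([A-Za-z][A-Za-z0-9.-]*)\s*=\s*(?:"([^"]*)"|'([^']*)'|([^\s>]+))"""
)

def parse_start_tag(tag):
    """Return (name, attrs) for a well-formed start tag, else None."""
    m = re.match(r"<([A-Za-z][A-Za-z0-9.-]*)", tag)
    if not m:
        return None  # "<" not followed by a valid name
    name, attrs, pos = m.group(1), {}, m.end()
    while True:
        a = ATTR.match(tag, pos)
        if not a:
            break
        # Exactly one of the three value groups matched.
        raw = next(g for g in a.group(2, 3, 4) if g is not None)
        attrs[a.group(1)] = html.unescape(raw)  # decode &#97; and friends
        pos = a.end()
    # Anything left other than optional whitespace and ">" means a messed-up tag.
    if not re.match(r"\s*>", tag[pos:]):
        return None
    return name, attrs
```

Called on '<ELEMENT attr1=abc attr2="def ghi" attr3="jkl" attr4=mno>' it returns the name "ELEMENT" and the four attributes, and '<ELEMENT attr1=&#97;&#98;&#99;>' comes out with attr1 equal to "abc". For '<hello%^ myname=foo>' it returns None, leaving it to the caller to decide whether that means a messed-up tag or plain text.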

Received on Wednesday, 10 July 1996 16:27:18 UTC