
Alternate SGML.c module in wwwlib 5.0a...?



Here is yet another fix to my version of the "simplified HTML tokenizer".
While making the parser recognize SCRIPT/STYLE tags, I noticed that the
parsing of CDATA content (S_LITERAL) was not working: "CDATA mode" was
terminated by any end tag, not only by the one matching the current
element. The following diff should fix this (basically, the termination
kludge that sets context->contents was in the wrong place).




*** SGML.c~	Mon Jan 20 16:57:07 1997
--- SGML.c	Mon Jan 20 16:59:12 1997
***************
*** 428,435 ****
--- 428,443 ----
  				/* If complete match, end literal */
  				if ((c == '>') &&
  				    (!context->current_tag->name[string->size-2]))
+ 				    {
  					end_element
  						(context,context->current_tag);
+ 					/*
+ 					  ...setting SGML_MIXED below is a
+ 					  bit of kludge, but a good guess that
+ 					  currently works, anything other than
+ 					  SGML_LITERAL would work... -- msa */
+ 					context->contents = SGML_MIXED;
+ 				    }
  				else
  				    {
  					/* If Mismatch: recover string. */
***************
*** 437,448 ****
  					PUTB(string->data, string->size);
  				    }
  				context->state = S_text;
- 				/*
- 				  ...setting SGML_MIXED below is a bit of
- 				  kludge, but a good guess that currently
- 				  works, anything other than SGML_LITERAL
- 				  would work... -- msa */
- 				context->contents = SGML_MIXED;
  				text = b;
  				count = 0;
  			    }
--- 445,450 ----
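
For readers who do not have SGML.c at hand, the behaviour the patch aims for can be sketched in isolation. This is only an illustration under stated assumptions, not the wwwlib code: `literal_ctx`, `content_mode`, `match_end_tag` and `scan_literal` are hypothetical names invented here, and the input is assumed to be a NUL-terminated string rather than the library's stream interface. The point it shows is that the content mode is reset only on a complete match of the current element's end tag, while any other end tag stays part of the literal text.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical content modes, loosely modelled on the SGML_* names in the patch. */
typedef enum { MODE_MIXED, MODE_LITERAL } content_mode;

/* Hypothetical, much reduced stand-in for the parser context. */
typedef struct {
    const char  *current_tag;   /* name of the open literal element, e.g. "SCRIPT" */
    content_mode contents;      /* MODE_LITERAL while inside CDATA content */
} literal_ctx;

/* Case-insensitive match of "</name>" at p.
   Returns the length of the end tag, or 0 if there is no complete match. */
static size_t match_end_tag(const char *p, const char *name)
{
    size_t n = strlen(name);
    size_t i;

    if (p[0] != '<' || p[1] != '/')
        return 0;
    for (i = 0; i < n; i++) {
        /* A NUL in p never equals a name character, so we stop in bounds. */
        if (toupper((unsigned char)p[2 + i]) != toupper((unsigned char)name[i]))
            return 0;
    }
    if (p[2 + n] != '>')
        return 0;
    return n + 3;               /* "</" + name + ">" */
}

/* Scan literal content and return the number of bytes that belong to it.
   The mode is changed only when the matching end tag is found; a
   mismatching end tag such as "</b>" is treated as ordinary text. */
size_t scan_literal(literal_ctx *ctx, const char *in)
{
    size_t i = 0;

    while (in[i] != '\0') {
        if (match_end_tag(in + i, ctx->current_tag) > 0) {
            ctx->contents = MODE_MIXED;   /* leave literal mode: complete match only */
            return i;
        }
        i++;
    }
    return i;                   /* no matching end tag: still in literal mode */
}
```

With `current_tag` set to "SCRIPT", an embedded `</b>` is counted as text and the scan stops only at `</script>`. Before the patch, the equivalent of the `ctx->contents = MODE_MIXED` assignment ran on the mismatch path as well, which is why any end tag ended the literal mode.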



--
Markku Savela (msa@hemuli.tte.vtt.fi),     Technical Research Centre of Finland
Multimedia Systems, P.O.Box 1203,FIN-02044 VTT,http://www.vtt.fi/tte/staff/msa/

