- From: Raffaele Sena <raff@nuvomedia.com>
- Date: Thu, 17 Jun 1999 14:27:03 -0700
- To: "Allen Comer" <allen.comer@entropic.com>, <www-lib@w3.org>
> As a follow-up to yesterday's message: I seem to be running into more
> than my fair share of problems and I'm wondering what I might be doing
> wrong. The latest problem I've found is that ampersands in plain text
> areas of an HTML document seem to confuse the HTML parser. There are no
> unclosed <form> tags anywhere nor is there anything else that looks
> potentially troublesome.
>
> Any suggestions or ideas would be appreciated.
>
Yep! If you check SGML.c you will see that an ampersand is always
considered to start an entity, valid or invalid.
One way to put it back in the text is to register a callback for unparsed
entities.
This is a quick hack in the 'showtext' example
(libwww/Library/Examples/showtext.c).
It's still eating a white space after the ampersand, but you can get the
idea (and I think the check for an isolated ampersand, i.e. x & y, should
go in SGML.c):
diff -r1.2 showtext.c
49a50,55
> PRIVATE void unparsedEntity (HText * text, const char * buf, int len)
> {
>     fputc('&', stdout);
>     if (buf) fwrite(buf, 1, len, stdout);
> }
>
74a81
> HText_registerUnparsedEntityCallback(unparsedEntity);
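
The isolated-ampersand check mentioned above could look roughly like the sketch below. This is not code from SGML.c; the helper names is_isolated_ampersand and count_isolated_ampersands are hypothetical, and the rule it assumes (an '&' only starts an entity when followed by a letter or '#') is just one plausible way to make the decision.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper, not a libwww function: returns 1 when the '&' at
 * p[0] cannot begin an entity or character reference, i.e. when the next
 * character is neither a letter nor '#'. */
static int is_isolated_ampersand(const char *p)
{
    if (p == NULL || p[0] != '&')
        return 0;
    /* Cast to unsigned char: isalpha() is undefined for negative values. */
    return !(p[1] == '#' || isalpha((unsigned char)p[1]));
}

/* Counts the isolated ampersands in a NUL-terminated string. */
static size_t count_isolated_ampersands(const char *s)
{
    size_t n = 0;
    for (; s != NULL && *s != '\0'; s++)
        if (is_isolated_ampersand(s))
            n++;
    return n;
}
```

Under this rule the '&' in "x & y" would be passed through as plain text, while "&amp;" and "&#38;" would still go to the entity code. Text such as "AT&T" would still be treated as the start of an entity, so the unparsed-entity callback remains useful for that case.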
-- Raffaele
Received on Thursday, 17 June 1999 17:27:02 UTC