- From: Marc Rubin, Jay's Island Software Development Solutions & Consulting <amayalist@mail.com>
- Date: Tue, 4 Dec 2001 08:06:55 -0500 (EST)
- To: www-amaya@w3.org, "Irene Vatton" <Irene.Vatton@inrialpes.fr>
In-Reply-To: <20011030113041.EE974C@maiana.inrialpes.fr> I've come up with two related fixes for the expat parser, following your suggestion, Irene, to trace the function EndOfAttributeValue () in the module Amaya/amaya/Xml2thot.c. One issue appears to be a bug in expat, so I'll present it first. The symptom is that multiple spaces are incorrectly preserved in attribute values. This line of code is clearly intended to suppress multiple spaces, in libwww\modules\expat\xmlparse\xmlparse.c line 2815, in function appendAttributeValue(). The logic gets short-circuited as currently written: if (!isCdata && (poolLength(pool) == 0 || poolLastChar(pool) == 0x20)) The second test above is skipped due to the expansion of the poolLength() and poolLastChar() preprocessor #defines. So two additional sets of parenteses are needed, as follows: if ((!isCdata && (poolLength(pool) == 0) || (poolLastChar(pool) == 0x20))) By adding the parenteses, multiple spaces are suppressed, as intended, when parsing the following sample input: <meta name="description" content="Software design & consulting for workstations, servers, & embedded firmware. Systems programming, quality-crafted applications and enhancements for Internet, Linux, Windows." /> If you would please test and verify this fix, then I'll post the related fix for preserving linefeeds. Thank you. Regards, Marc Previous thread: >From: Irene Vatton <Irene.Vatton@inrialpes.fr> >Sent: Tue, 30 Oct 2001 12:30:41 +0100 > >> Amaya 5.2 still discards line breaks in META tag content, so this update >> was apparently missed in the October release. > >We tried to change that behavior, but line breaks are discarded by the expat >parser. >... >You can check in the function EndOfAttributeValue () in the module >Amaya/amaya/Xml2thot.c
Received on Tuesday, 4 December 2001 08:07:46 UTC