Re: bug in XML::LibXML::Error 1.69_1: "int1, int2, int3" becomes "num1, num2, num3"

On Thursday 05 February 2009 22:51:26 olivier Thereaux wrote:
> Hi Petr,
...
> $error = bless( { 'num1' => 0, 'file' => '', 'message' => 'attributes
> construct error ', 'domain' => 1, 'level' => 3, 'str2' => undef,
> '_prev' => undef, 'str1' => undef, 'str3' => undef, 'num2' => 10,
> 'code' => 65, 'line' => 8 }, 'XML::LibXML::Error' );
>
...
>
> However, in the case of the error I just dumped above, I am surprised
> that int2/num2 seems to actually be wrong. Indeed, 'num2' => 10 where
> the "old" version of libxml2 shows the error around column 15. Is that
> a bug too?
>
>
> My test case is
> http://qa-dev.w3.org/wmvs/HEAD/dev/tests/2689-attribute-no-space.xhtml feel
> free to use it, of course.
>
> Thanks,

Looking at the libxml2 code (error.c), it seems that the generic handler 
computes the column number itself while selecting the piece of source XML 
to print as context (see xmlParserPrintFileContextInternal). The value passed 
to the structured handler is taken from the input buffer and seems to be 
only approximate. 

So I did what xmlParserPrintFileContextInternal does and added two fields:
$@->context() returns a string with the XML surrounding the error and 
$@->column() returns the offset of the error within this string. This is now 
in SVN; I'll make another developer release soon. These values are used in 
the serialization code ("$@"), so XML::LibXML should now produce exactly the 
same error messages as xmllint.
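
For illustration, here is a minimal C sketch of the kind of computation 
described above: walk back from the error position to the start of the line, 
copy that line out as the context, and report the error position as an offset 
into it. It is only an approximation written for this message, not code taken 
from libxml2 or XML::LibXML; the function name error_context and its 
signature are made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not a libxml2 or XML::LibXML function).
 *
 * base     : start of the NUL-terminated input buffer
 * cur      : position in that buffer where the error was detected
 * out      : receives the line of text around cur, NUL-terminated
 * out_size : size of out in bytes (must be at least 1)
 *
 * Returns the 0-based offset of cur relative to the start of the
 * copied context.
 */
size_t error_context(const char *base, const char *cur,
                     char *out, size_t out_size)
{
    const char *start = cur;
    const char *end;
    size_t back = 0;
    size_t len;

    assert(base != NULL && cur != NULL && cur >= base);
    assert(out != NULL && out_size > 0);

    /* If the error position is on a line break, step back onto the
     * preceding line. */
    while (start > base && (*start == '\n' || *start == '\r'))
        start--;

    /* Walk back to the beginning of that line, but no further than
     * the output buffer can hold. */
    while (start > base && back < out_size - 1 &&
           start[-1] != '\n' && start[-1] != '\r') {
        start--;
        back++;
    }

    /* Walk forward to the end of the line, again bounded by the
     * output buffer. */
    end = start;
    while (*end != '\0' && *end != '\n' && *end != '\r' &&
           (size_t)(end - start) < out_size - 1)
        end++;

    /* Copy the selected text and terminate it. */
    len = (size_t)(end - start);
    memcpy(out, start, len);
    out[len] = '\0';

    /* The column is the distance from the start of the context to
     * the original error position. */
    return (size_t)(cur - start);
}
```

Called with a buffer holding a small document and a pointer to the offending 
character, this fills out with the text of that line and returns the offset 
of the error within it, which is enough to print the line followed by a 
caret under the error, as xmllint does. If the line is longer than the 
buffer, the context is truncated and the offset may point past its end.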

Best,

-- Petr

Received on Friday, 6 February 2009 15:35:58 UTC