
Re: Problems validating XML

From: Shane McCarron <shane@aptest.com>
Date: Wed, 30 May 2007 20:21:45 -0500
Message-ID: <465E2329.6060503@aptest.com>
To: olivier Thereaux <ot@w3.org>
CC: Martin Duerst <duerst@it.aoyama.ac.jp>, www-validator@w3.org

It would be trivial to add a checkbox to the current tree that means 
"use the XML parser for this".  I can provide a patch for that if you like.

olivier Thereaux wrote:
>
> Hi Martin,
>
> On May 30, 2007, at 18:22 , Martin Duerst wrote:
>>> you want to submit patches to make it
>>> better in this regard, without being detrimental to its main job,
>>
>> I can definitely submit a patch that goes into XML mode if an
>> XML declaration is present. I don't consider this as being
>> detrimental to the validator's job, quite to the contrary.
>> If that's not what you mean, please tell me.
>
> I meant that in a general way. I don't think that adding a trigger for 
> XML mode when the XML declaration is present is a bad thing; it does 
> look sane. The discussions about XML detection/triggering that I 
> mentioned in my previous message were the following two Bugzilla 
> entries:
> XHTML Detection is over-eager [Bug 14]
> XHTML-sent-as-text/html is parsed as XML [Bug 1500]
>
> The latter has been marked INVALID following a clarification from the 
> XHTML working group, and I don't think the former is actually valid, 
> but it raises interesting questions relevant to this discussion.
> [Bug 14] http://www.w3.org/Bugs/Public/show_bug.cgi?id=14
> [Bug 1500] http://www.w3.org/Bugs/Public/show_bug.cgi?id=1500
>
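As a rough illustration of the XML-declaration trigger discussed above, the check could be as small as the Python sketch below. The name `looks_like_xml` and the regular expression are assumptions made for this example and are not taken from the validator's code. The sketch only handles ASCII-compatible encodings (optionally preceded by a UTF-8 byte order mark), not UTF-16.

```python
import re

# Hypothetical helper, not part of the validator's code base.
# An XML declaration must be the very first thing in the document, so the
# pattern is anchored at the start and allows only an optional UTF-8 BOM.
XML_DECL = re.compile(rb'^(?:\xef\xbb\xbf)?<\?xml[ \t\r\n]+version[ \t\r\n]*=')


def looks_like_xml(document: bytes) -> bool:
    """Return True if the raw document starts with an XML declaration."""
    return XML_DECL.match(document) is not None
```

Requiring whitespace after `<?xml` keeps processing instructions such as `<?xml-stylesheet ...?>` from triggering XML mode by accident.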
>>> I believe you're familiar with the code,
>>
>> Well, that was quite some time ago, and a lot of work has
>> gone into the validator since, but to some extent, yes.
>
> I think the code has indeed changed quite a bit since you last touched 
> it, but its structure should be familiar.
> A few of us here on the list can answer questions, too.
>
>>> We don't do relative SIs. Yet.
>>> http://www.w3.org/Bugs/Public/show_bug.cgi?id=1521
>>
>> If that can be handled in the validator code, I'll try to
>> submit a patch. But it might take a while.
>
> It would be great if you can look into it, but I believe this is a 
> tricky one. Our parser does not validate documents online; rather, the 
> validator retrieves them first and then validates the document's 
> content locally, as a string. In this context, making the validator 
> aware of relative URIs for system identifiers isn't trivial: you'd 
> have to modify the doctype declaration on the fly to add the URI 
> base. Alternatively, a patch to OpenSP to tell it "here is the URI 
> base you should use to dereference relative SYSTEM URIs" could do the 
> job, but I am not familiar enough with its code to tell how hard it 
> would be.
>
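The first option above, rewriting the doctype declaration on the fly, might look something like the following Python sketch. It is only an illustration under stated assumptions: the function `absolutize_system_id` and the regular expression are hypothetical, not validator code, and a regular expression is a simplification that would be fooled by, for example, a doctype declaration inside a comment earlier in the document.

```python
import re
from urllib.parse import urljoin

# Hypothetical pattern: group 1 is everything up to the system identifier,
# group 2 is its quote character, group 3 is the system identifier itself.
DOCTYPE_SI = re.compile(
    r'''(<!DOCTYPE\s+[^\s>\[]+\s+
         (?:SYSTEM|PUBLIC\s+(?:"[^"]*"|'[^']*'))\s+)
        (["'])(.*?)\2''',
    re.IGNORECASE | re.VERBOSE | re.DOTALL,
)


def absolutize_system_id(document: str, base_uri: str) -> str:
    """Resolve a relative system identifier against the document's URI."""
    def fix(match: re.Match) -> str:
        prefix, quote, sysid = match.group(1), match.group(2), match.group(3)
        # urljoin leaves identifiers that are already absolute unchanged.
        return f"{prefix}{quote}{urljoin(base_uri, sysid)}{quote}"

    # Only the first doctype declaration is rewritten.
    return DOCTYPE_SI.sub(fix, document, count=1)
```

Here `base_uri` is assumed to be the URI the document was retrieved from; documents without a system identifier pass through unchanged.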
>>> The charset override was broken in the 0.8.0 beta1. It is now fixed.
>>
>> This would probably explain things, see above.
>> Is there a plan to release a beta2?
>
> Absolutely. Crossing fingers to have it out by the end of the week. In 
> the meantime, the CVS HEAD version on qa-dev.w3.org should 
> systematically have the latest running (or broken, as it happens) 
> code, if you need to check for recent changes.
>
> Thanks!

-- 
Shane P. McCarron                          Phone: +1 763 786-8160 x120
Managing Director                            Fax: +1 763 786-8180
ApTest Minnesota                            Inet: shane@aptest.com
Received on Thursday, 31 May 2007 01:22:17 GMT
