Re: I think I broke CatalogResolver: what is it?

I think I figured it out, and boy was it a bear.

Firstly, I updated to the latest XML Commons Resolver I could find,
but that didn't help.

It seems that while CatalogResolver is successfully resolving some DTD
refs to the local copy, it isn't resolving all of them. So I think
this only happened to work in the past because the unresolved ones
were fetched remotely from www.w3.org, and now those fetches are
failing intermittently.

After a lot of debugging, I discovered that for entities referenced
from DTDs (i.e. not directly from the documents being parsed), the
CatalogResolver gets a system ID like "file://..." and not the
original system ID. Presumably something has helpfully rewritten it.
The CatalogResolver doesn't recognize the rewritten ID and gives up,
and something else in the code then falls back to the original system
ID, which is a remote URL.

Well, I tacked on another class called ExtendedCatalogResolver which
will (I think properly) recognize this system ID and return a stream
from the referenced file.
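
For anyone who wants the idea without digging through CVS, here is a
rough sketch of that fallback. It is not the ExtendedCatalogResolver
I checked in; the class name FileFallbackResolver is made up for
illustration, and it wraps a plain org.xml.sax.EntityResolver rather
than extending CatalogResolver, so it only assumes the standard SAX
API. It tries the wrapped resolver first, and only if that gives up
does it treat a "file:" system ID as a local file.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.net.URI;
import java.net.URISyntaxException;

import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

/** Hypothetical sketch: falls back to the local file for "file:" system IDs. */
class FileFallbackResolver implements EntityResolver {

    private final EntityResolver delegate;

    FileFallbackResolver(EntityResolver delegate) {
        this.delegate = delegate;
    }

    @Override
    public InputSource resolveEntity(String publicId, String systemId)
            throws SAXException, IOException {
        // Give the wrapped resolver (e.g. a catalog-based one) the first try.
        InputSource source = delegate.resolveEntity(publicId, systemId);
        if (source != null) {
            return source;
        }
        // The wrapped resolver gave up; handle rewritten "file:" system IDs here.
        if (systemId != null && systemId.startsWith("file:")) {
            File file;
            try {
                file = new File(new URI(systemId));
            } catch (URISyntaxException | IllegalArgumentException e) {
                return null; // not a usable local file URI; let the parser decide
            }
            if (file.isFile()) {
                InputSource local = new InputSource(new FileInputStream(file));
                local.setPublicId(publicId);
                local.setSystemId(systemId); // keeps relative references resolvable
                return local;
            }
        }
        // Returning null tells the parser to fall back to its default behaviour.
        return null;
    }
}
```

Returning null rather than throwing keeps the parser's default
behaviour for anything this doesn't recognize, which seemed the least
surprising choice for a sketch.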

From there I found that we were missing some additional XHTML Basic
1.1 .mod files and added those. I also found that we need to set the
resolver on the parser when parsing.
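
In case "set the resolver" is unclear, this is roughly what I mean,
using the standard JAXP DocumentBuilder. The class and method names
(ResolverAwareParser, parse) are invented for this example and are not
what is in the tree; the only real API assumed is
DocumentBuilder.setEntityResolver.

```java
import java.io.IOException;
import java.io.StringReader;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

import org.w3c.dom.Document;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

/** Hypothetical helper: parses XML with an explicit entity resolver installed. */
class ResolverAwareParser {

    static Document parse(String xml, EntityResolver resolver)
            throws ParserConfigurationException, SAXException, IOException {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // Without this call the parser uses its default resolution, which
        // for an http:// system ID means going out to the network.
        builder.setEntityResolver(resolver);
        return builder.parse(new InputSource(new StringReader(xml)));
    }
}
```

The resolver has to be set on every builder that does the parsing; a
builder created elsewhere without it will quietly go back to fetching
remotely.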

After all that it seems to work for me even with my network cable
unplugged. CVS update and see if it works, anyone?

Sean

On 9/21/07, Sean Owen <srowen@google.com> wrote:
> Well scratch that, I figured out what CatalogResolver is -- it is part
> of XML Commons too.
>
> It seems to be OK -- does load the DTD catalog and all that. But
> somehow, all of the sudden, it hangs on loading something like a .mod
> file *remotely* from www.w3.org. That doesn't make sense, and it's a
> different file every time. I suspect some kind of thread-safety issue?
>
> Has anyone seen this -- does "ant test" work for anyone now?

Received on Saturday, 22 September 2007 00:14:56 UTC