XML documents with DTDs in zip files

Hello,

I recently received a test case for a bug that involved getting documents out of a zip file. I fixed the bug, but in the course of fixing it, I noticed something. Imagine that you have a zip file that contains a document like this one:

  <!DOCTYPE book SYSTEM "book.dtd">
  <book>…</book>

I don’t think there’s any practical way to make that “book.dtd” reference resolve to the DTD in the zip file (assuming it’s there). If relative-to is provided, we change the base URI entirely. Even if it isn’t provided, we don’t construct a base URI against which a relative reference could be resolved to another entry inside the zip file.[*]
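
To make the resolution problem concrete, here is a minimal Python sketch. The two base URIs are hypothetical, chosen only for illustration; they are not the URIs the implementation actually assigns. The point is just that ordinary relative-reference resolution yields a URI that names a location outside the archive, not an entry inside it.

```python
from urllib.parse import urljoin

# Hypothetical base URIs (assumptions for illustration only).
ZIP_URI = "file:///data/archive.zip"             # the zip file itself
RELATIVE_TO = "http://example.com/docs/book.xml"  # a relative-to style override


def resolve_dtd(base_uri, system_id="book.dtd"):
    """Resolve the DOCTYPE system identifier against a base URI."""
    return urljoin(base_uri, system_id)


# Neither result identifies an entry inside the zip file.
print(resolve_dtd(ZIP_URI))      # file:///data/book.dtd
print(resolve_dtd(RELATIVE_TO))  # http://example.com/docs/book.dtd
```

In both cases the parser would go looking for the DTD somewhere other than the archive, which is why a catalog (or something like it) is needed to redirect the reference.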

This bites especially hard because even if you aren’t doing a (DTD) validating parse, the parser may still try to read the external subset, and that’s going to fail.

I wonder if we need some way for users to control whether or not the parser should read the external subset?
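
As a rough sketch of what such a control looks like elsewhere (this is Python's standard xml.sax parser, used purely for illustration, not the implementation discussed here): the feature_external_ges flag decides whether the parser asks for the external subset at all. The RecordingResolver class and parse_book function are hypothetical names invented for this example, and the resolver hands back an empty subset so that nothing is read from disk.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, EntityResolver, feature_external_ges
from xml.sax.xmlreader import InputSource

DOC = b'<!DOCTYPE book SYSTEM "book.dtd"><book>text</book>'


class RecordingResolver(EntityResolver):
    """Hypothetical helper: records each system identifier the parser requests."""

    def __init__(self):
        self.requested = []

    def resolveEntity(self, publicId, systemId):
        self.requested.append(systemId)
        # Return an empty external subset instead of opening a real file.
        source = InputSource(systemId)
        source.setByteStream(io.BytesIO(b""))
        return source


def parse_book(load_external):
    """Parse DOC and return the system identifiers the parser tried to load."""
    resolver = RecordingResolver()
    parser = xml.sax.make_parser()
    parser.setFeature(feature_external_ges, load_external)
    parser.setEntityResolver(resolver)
    parser.setContentHandler(ContentHandler())
    parser.parse(io.BytesIO(DOC))
    return resolver.requested


print(parse_book(False))  # the external subset is never requested
print(parse_book(True))   # the parser asks the resolver for "book.dtd"
```

With the flag off, the document parses without any attempt to fetch book.dtd; with it on, the reference has to be resolvable somehow, for example by a resolver or catalog.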

                                        Be seeing you,
                                          norm

[*] Since the user didn’t report *this* as a bug, I assume they have a catalog that resolves the DTD reference correctly, to some other, local DTD. 👍

--
Norm Tovey-Walsh <ndw@nwalsh.com>
https://norm.tovey-walsh.com/

> Talent hits a target no one else can hit; Genius hits a target no one
> else can see.--Arthur Schopenhauer

Received on Wednesday, 3 December 2025 18:00:07 UTC