- From: C. M. Sperberg-McQueen <cmsmcq@blackmesatech.com>
- Date: Thu, 18 Aug 2022 10:59:03 -0600
- To: Norm Tovey-Walsh <ndw@nwalsh.com>
- Cc: Michael Kay <mike@saxonica.com>, xmlschema-dev@w3.org, Gerald Oskoboiny <gerald@w3.org>
Norm Tovey-Walsh <ndw@nwalsh.com> writes:

>> From: "C. M. Sperberg-McQueen" <cmsmcq@blackmesatech.com>
>> Subject: Re: XML Schema validation and https redirects
>> Date: 18 August 2022 at 01:12:54 BST
>> To: Gerald Oskoboiny <gerald@w3.org>
>> Cc: xmlschema-dev@w3.org
>> Resent-From: xmlschema-dev@w3.org
>>
>> Gerald Oskoboiny <gerald@w3.org> writes:
>>
>> W3C's main web site https://www.w3.org/ will soon start to redirect
>> all http requests to https. Will this cause issues for XML
>> Schema-related resources hosted on www.w3.org?
>
> Like Micheal (Sperberg-McQueen), I’m inclined to hedge my bets. What’s
> actually going to happen?

I like Norm's analysis of the possible outcomes; I have comments on a
couple of them.

> ...
> 5. If the validator didn’t report an error because it failed to get an
>    XSD file, then it’ll proceed without the schema document. That
>    probably won’t work, but it’s a bit hard to predict how it’ll fail.

For what it's worth, the XSD spec does explicitly say that it's not a
validation error if a schema document cannot be retrieved.

What should normally happen in that case is that the schema used for
validation will lack declarations for some elements and attributes,
which means in turn that errors in those elements and attributes will
not be detected (so validation will be looser than expected), and the
elements and attributes will be marked as having unknown validity.

In principle, this should cause a warning flag of some kind for
downstream consumers expecting to see valid input, but in practice many
validators and users appear to take the absence of error messages as
meaning the input is valid, failing to distinguish between
validity="valid" and validity="notKnown".

> 4d. If the API returns the schema document with the https: URI as the
>     system identifier, then…
>
>     ...
>
>     5b. If the validator looks at the system identifier, I suppose some
>         part of the validator might decide that https:// doesn’t match
>         http:// and conclude that it has the wrong namespace.

Anything is possible, of course, but it should be pointed out that
there is no justification in the XSD spec for behavior 5b in an XSD
validator.

Behavior 5b might be plausible for an automated tool dereferencing a
namespace URI. But the XSD spec is explicit that schema documents for a
given namespace may reside anywhere.

> I have no real intuition about how likely 5b is. My wild guess is “not
> very likely” because once you’ve got the schema, you’re probably more
> concerned about what targetNamespace it claims to validate than what its
> URI was.

That is also my guess. It is certainly what I think the XSD spec
suggests.

> I saw this one in the wild within the last year: (Some of the) XSD for
> XSD Schemas have a doctype declaration, for example this one:
>
>   http://www.w3.org/2001/XMLSchema.xsd
>
> I discovered some bit of software, I forget the exact details, that had
> a cached copy of the XSD but not the DTD so parsing the cached XSD made
> a DTD request to www.w3.org every time…

Yow! Excellent point.

> Yes, XML catalogs help. They allow the application author and/or user to
> configure local resources that can be returned automatically when
> attempts are made to retrieve documents over the web.

Hear, hear. The XSD spec does not explicitly mention XML catalogs, but
I read its discussion of how schema documents are to be found on the Web
as compatible with catalogs and similar measures.

Michael

--
C. M. Sperberg-McQueen
Black Mesa Technologies LLC
http://blackmesatech.com
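
To illustrate the point about 5b above, here is a minimal Python sketch of
checking a retrieved schema document by the targetNamespace it declares
rather than by the URI it was fetched from. It is only a sketch under
stated assumptions, not how any particular validator works: the helper
names (`target_namespace`, `schema_matches_namespace`), the sample schema,
and the example.org URIs are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace of the XSD vocabulary itself (not of the schema being described).
XSD_NS = "http://www.w3.org/2001/XMLSchema"


def target_namespace(schema_text):
    """Return the targetNamespace declared on the xs:schema root, or None."""
    root = ET.fromstring(schema_text)
    if root.tag != "{%s}schema" % XSD_NS:
        raise ValueError("not an XSD schema document")
    return root.get("targetNamespace")


def schema_matches_namespace(schema_text, expected_namespace, retrieved_from):
    """Hypothetical check: does this schema document describe the namespace?

    retrieved_from is deliberately ignored. Where the document was fetched
    from (http:, https:, a local cache, a catalog entry) plays no part in
    the comparison; only the declared targetNamespace does.
    """
    return target_namespace(schema_text) == expected_namespace


# Hypothetical schema document, used only for illustration.
SAMPLE = (
    '<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" '
    'targetNamespace="http://example.org/ns"/>'
)

if __name__ == "__main__":
    for location in ("http://example.org/ns.xsd", "https://example.org/ns.xsd"):
        print(location, schema_matches_namespace(SAMPLE, "http://example.org/ns", location))
```

Under this approach a redirect from http: to https: changes only the
retrieval location, so the outcome of the check is the same either way.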
Received on Thursday, 18 August 2022 17:12:11 UTC