- From: C. M. Sperberg-McQueen <cmsmcq@blackmesatech.com>
- Date: Wed, 17 Aug 2022 18:12:54 -0600
- To: Gerald Oskoboiny <gerald@w3.org>
- Cc: xmlschema-dev@w3.org
Gerald Oskoboiny <gerald@w3.org> writes:

> W3C's main web site https://www.w3.org/ will soon start to redirect
> all http requests to https. Will this cause issues for XML
> Schema-related resources hosted on www.w3.org?

To this top-level question I have no reliable answer. It SHOULD not cause major issues; it probably WILL cause at least some issues, just because it's so easy to put things like this off until something actually breaks.

I may be able to answer some of the questions of detail.

One complication is that it's not clear how many of the schema validators in current use are actively maintained; if the change you are contemplating breaks validation in some tools, it may take a while before people figure out how to push an update.

> We announced this intended change a few weeks ago, ...

> Some questions I have:

> Is it intended that www.w3.org is in the critical path when performing
> XML Schema validation?

Yes and no, at least in my reading of the XSD spec and my recollections of the WG discussions.

Yes in the sense that the XSD spec and other W3C specs for which XSD and other schemas have been defined normally use dereferenceable URIs with www.w3.org as host to name the namespaces they define, and the explicit motivation for that usage was and is that it should be possible to retrieve information about such a namespace by dereferencing its name.

No in the sense that the XSD spec is explicit that it is not required that a fresh copy of a schema document be retrieved from the host named in the namespace name. Several alternative methods of locating XSD schema documents are described in the spec.

- Schema validators may have hard-coded knowledge of the schema they are built to work with.

  As a special case of this, knowledge of the XSD schema can be (and I expect probably is) built in to most schema validators, so that they don't have any pressing need to fetch a copy of the XSD schema for XSD schemas.
- Schema documents are resources on the Web, to be dereferenced like any other resource, and no single strategy for retrieving them will work in all cases. Section 4.3.1 of XSD 1.1 Part 1 [1] says in part

      Note: The variations among server software and web site
      administration policies make it difficult to recommend any
      particular approach to retrieval requests intended to retrieve
      serialized ·schema documents·. An Accept header of
      application/xml, text/xml; q=0.9, */* is perhaps a reasonable
      starting point.

  [1] https://www.w3.org/TR/xmlschema11-1/#schema-repr

  As a special case of this, XSD schemas for any namespace may be cached by a local server or by a schema validator. It is also allowed for user-controlled caching (XML catalogs) to be used to point to local copies of XSD schema documents, but I do not know how widespread support for XML catalogs is among XSD processors.

- One obvious approach to finding a schema for a particular namespace is to dereference the namespace name; this may or may not produce a schema. Section 4.3.2 of XSD 1.1 Part 1 [2] says in part

      it is possible but not guaranteed that a schema is retrievable
      via the namespace name. Accordingly whether a processor's
      default behavior is or is not to attempt such dereferencing, it
      must always provide for user-directed overriding of that
      default.

  [2] https://www.w3.org/TR/xmlschema11-1/#schema-loc

- The user of a schema validation engine can provide a URI at which a suitable schema document can be found; this is formally a hint and processors are not obligated to attempt to dereference that URI.

  However, the schemaLocation information provided in a schema document when importing or including other schema documents is binding on the processor and not a hint. (More on this below.)

> Are .xsd files and/or namespace documents
> retrieved each time a validation is done?

It would not surprise me if some validators operate that way; it is not required by the XSD spec.
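
The lookup strategies listed above can be sketched roughly as follows. This is only an illustration in Python, not a description of how any particular validator behaves: the function names `locate_schema` and `build_schema_request`, and the plain dictionary standing in for an XML catalog or local cache, are hypothetical. The Accept header value is the one suggested in the note quoted from section 4.3.1.

```python
import urllib.request

# Accept header suggested as "a reasonable starting point" in XSD 1.1 Part 1, 4.3.1.
SCHEMA_ACCEPT = "application/xml, text/xml; q=0.9, */*"


def locate_schema(namespace, catalog, hint=None):
    """Decide where to look for a schema document for `namespace`.

    `catalog` is a hypothetical stand-in for an XML catalog or local cache:
    a dict mapping namespace names to local file paths.
    Returns ("local", path) when a local copy is known; otherwise
    ("remote", uri), where uri is the user-supplied hint if there is one,
    else the namespace name itself (which may or may not yield a schema).
    """
    if namespace in catalog:
        return ("local", catalog[namespace])
    return ("remote", hint if hint is not None else namespace)


def build_schema_request(uri):
    """Build (but do not send) a retrieval request for a schema document."""
    return urllib.request.Request(uri, headers={"Accept": SCHEMA_ACCEPT})


# Usage: a namespace with a local copy never touches the network.
catalog = {"http://www.w3.org/XML/1998/namespace": "schemas/xml.xsd"}
print(locate_schema("http://www.w3.org/XML/1998/namespace", catalog))
```

A request built this way would be sent with `urllib.request.urlopen`, which follows redirects by default, including a redirect from http to https. Whether a given validator's HTTP client does the same is exactly the open question.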
> Are there other use cases
> besides validation that might cause automated requests to www.w3.org?

Not common ones (at least, that I know of).

> What are the most popular software packages that might be making these
> requests to www.w3.org? In what contexts do they make these requests?
> Do the latest versions typically have the ability to follow http to
> https redirects? Would XML catalogs help?

I can't help you there.

> If we start redirecting http to https, will that fundamentally break
> compliance with W3C RECs that specify http: in references to .xsd
> files and namespaces? If so, which URIs would we need to continue to
> serve via http?

As far as I know, no spec that came out of the XML Activity ever requires namespace names to be dereferenced as a condition of conformance for any operation, so with respect to namespace names, the change you describe won't break conformance in any way that I can see.

With respect to XSD schema documents and the XSD spec, there is one situation in which conformance may be held to require an attempt to dereference an http URI: namely, when a schema document refers, on an import or include or similar statement, to an http URI, the spec says (as I read it) that the processor should fetch that schema document, which will normally happen by dereferencing that URI.

The authoritative schema for XSD schema documents is currently hosted at

    http[s]://www.w3.org/2001/XMLSchema.xsd

and imports the schema for the XML namespace

    http://www.w3.org/XML/1998/namespace

by pointing to

    http://www.w3.org/2001/xml.xsd

so I believe that conforming processors that haven't cached that document will continue hitting the http URI indefinitely. You could arrange to update the schemaLocation value in the XSD schema for schemas to use https, but that won't change the URI in any cached copies of the schema for schemas.

It may possibly be helpful to continue to serve that schema document with http, but I do not believe this is a condition of conformance.
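
For anyone wanting to see which http URIs a cached schema document will keep asking for, here is a minimal sketch in Python. The function name `http_schema_locations` is hypothetical, and the sample document is a cut-down stand-in built only from the import described above, not the real schema for schemas.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"
# Elements that can carry a schemaLocation pointing at another schema document.
REFERENCING = ("import", "include", "redefine", "override")


def http_schema_locations(schema_text):
    """Return (element name, schemaLocation) pairs whose location uses http."""
    root = ET.fromstring(schema_text)
    found = []
    for name in REFERENCING:
        for element in root.iter(f"{{{XS}}}{name}"):
            location = element.get("schemaLocation", "")
            if location.startswith("http://"):
                found.append((name, location))
    return found


# Illustrative sample only: mirrors the import of the XML namespace schema.
SAMPLE = """\
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:import namespace="http://www.w3.org/XML/1998/namespace"
             schemaLocation="http://www.w3.org/2001/xml.xsd"/>
</xs:schema>
"""

print(http_schema_locations(SAMPLE))
```

Running a check like this over locally cached schema documents would show which http URIs would still need to answer, whether directly or via a redirect.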
Nothing in the spec says that it is non-conforming to follow a redirect, or for retrieval to fail.

That doesn't mean you won't get complaints, but if I were you I would point them to the final paragraph of section 4.3.2 of XSD 1.1 Part 1:

    Improved or alternative conventions for Web interoperability can
    be standardized in the future without reopening this
    specification. For example, the W3C is currently considering
    initiatives to standardize the packaging of resources relating to
    particular documents and/or namespaces: this would be an addition
    to the mechanisms described here for layer 3. This architecture
    also facilitates innovation at layer 2: for example, it would be
    possible in the future to define an additional standard for the
    representation of schema components which allowed e.g. type
    definitions to be specified piece by piece, rather than all at
    once.

The bottom-line meaning of that paragraph, as I understand it, is: the Web is a growing and changing system, and how you retrieve schemas may have to change to align with the Web. No conformance requirement in the XSD spec requires the Web to stop growing or changing.

> Thanks,

Thank you for your inquiry. I hope this helps. And good luck.

-- 
C. M. Sperberg-McQueen
Black Mesa Technologies LLC
http://blackmesatech.com
Received on Thursday, 18 August 2022 01:31:42 UTC