- From: Miles Sabin <miles@milessabin.com>
- Date: Sun, 13 Oct 2002 10:11:53 +0100
- To: www-tag@w3.org
Larry Masinter wrote,

> It would be very bad if the web architecture REQUIRED or even
> ENCOURAGED implementors of widely used media types to actually go off
> and GET the namespace URI. So a web design that had browsers actually
> trying to connect to www.w3.org and "GET /1999/xhtml" whenever they
> tried to open an XHTML document ... well, that would be a bad design.

I've raised this issue several times over the last few years, tho' wrt DTD external subset system ids rather than namespace URIs, and the response has always seemed to be either that the problem is with poor implementations or that caching/catalogs and content distribution are the answer.

The problem here is that the poor (or maybe malicious) implementation is at the client end, so not under the control of the URI publisher. It certainly used to be the case that many off-the-shelf XML parsers would by default attempt to retrieve external subsets when validating, and many developers were unaware of the need to change that default.

We had an illustration of the consequences a while back when Netscape "lost" the RSS DTD (see http://www.oreillynet.com/cs/weblog/view/wlg/263) and lots of people's feeds stopped working. Whilst this was an administrative mistake, from the POV of the clients, poorly implemented or not, it was indistinguishable from a server failure or denial of service.

Even more ridiculous would be a use of XML for locally stored application configuration information where each read implicitly involved network access thanks to an attempt to retrieve DTD or namespace information. Aside from making the application unusable on disconnected machines, we could easily imagine the users of an increasingly popular product eventually slashdotting the vendor. And there's also a privacy issue: each retrieval could be construed as the application "phoning home".

So I agree with Larry: there definitely is an architectural issue here.
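To make the point about parser defaults concrete, here is a minimal sketch, assuming Python's standard `xml.sax` module (my choice for illustration, not a parser discussed above). It sets the external-entity feature explicitly rather than relying on whatever the default happens to be. The names `read_config_offline`, `OptionCollector` and `RecordingResolver`, the `<option>` config format and the example.invalid DTD URL are all hypothetical.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, EntityResolver, feature_external_ges
from xml.sax.xmlreader import InputSource

# Hypothetical local config file whose DOCTYPE names a remote DTD.
SAMPLE_CONFIG = (
    b'<?xml version="1.0"?>\n'
    b'<!DOCTYPE config SYSTEM "http://example.invalid/config.dtd">\n'
    b'<config><option name="offline">true</option></config>\n'
)


class OptionCollector(ContentHandler):
    """Collects <option name="...">value</option> pairs (hypothetical format)."""

    def __init__(self):
        super().__init__()
        self.options = {}
        self._name = None
        self._text = []

    def startElement(self, name, attrs):
        if name == "option":
            self._name = attrs.get("name")
            self._text = []

    def characters(self, content):
        if self._name is not None:
            self._text.append(content)

    def endElement(self, name):
        if name == "option" and self._name is not None:
            self.options[self._name] = "".join(self._text)
            self._name = None


class RecordingResolver(EntityResolver):
    """Records every system id the parser asks for; never touches the network."""

    def __init__(self):
        self.requested = []

    def resolveEntity(self, publicId, systemId):
        self.requested.append(systemId)
        empty = InputSource(systemId)
        empty.setByteStream(io.BytesIO(b""))  # stand-in instead of a real fetch
        return empty


def read_config_offline(xml_bytes):
    """Parse xml_bytes without retrieving the external DTD subset."""
    parser = xml.sax.make_parser()
    # Explicitly switch off external general entities (includes the DTD subset).
    parser.setFeature(feature_external_ges, False)
    handler = OptionCollector()
    resolver = RecordingResolver()
    parser.setContentHandler(handler)
    parser.setEntityResolver(resolver)
    parser.parse(io.BytesIO(xml_bytes))
    return handler.options, resolver.requested
```

Calling `read_config_offline(SAMPLE_CONFIG)` returns the parsed options together with the list of system ids the parser tried to resolve. With the feature switched off that list should stay empty, so the read involves no network access at all.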
It's one thing to say that URIs SHOULD be retriev*able*; it's quite another to say that they SHOULD be retriev*ed*. That said, as has been pointed out more than once, anything which is retrievable probably will be retrieved: whether or not it's retrieved often enough to cause problems is pretty much indeterminate.

Cheers,

Miles
Received on Sunday, 13 October 2002 05:12:30 UTC