- From: Dan Brickley <danbri@danbri.org>
- Date: Tue, 7 Jan 2014 12:51:14 +0000
- To: Markus Lanthaler <markus.lanthaler@gmx.net>
- Cc: Sandro Hawke <sandro@hawke.org>, Gregg Kellogg <gregg@greggkellogg.net>, Dan Brickley <danbri@google.com>, Ramanathan Guha <guha@google.com>, W3C Web Schemas Task Force <public-vocabs@w3.org>, Linked JSON <public-linked-json@w3.org>
On 7 January 2014 10:16, Markus Lanthaler <markus.lanthaler@gmx.net> wrote:
>> W3C's experience with XML parsers that auto-fetch
>> http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd and
>> http://www.w3.org/1999/xhtml when parsing XML is relevant here:
> [...]
>>
>> If JSON is the new XML and JSON-LD is the emerging best practice for
>> interoperable JSON, it isn't unreasonable to expect XML-levels of
>> usage. So let's try to learn from the W3C XML DTD experience.
>
> I think there's a very important difference to that experience. XML namespaces are not links and are thus not *expected* to be dereferenced. Thus, AFAICT, for a long time those URLs returned non-cacheable HTTP error responses. If you know that a document is going to be requested often, you can plan for it (CDN, long cache validity etc.). I know it's important to keep these things in mind but I'm still not convinced that serving a small static file (even if it is requested millions of times) causes much costs. Otherwise, all the free JavaScript library CDNs etc. would have been shut down already a long time ago..

The main lesson from http://www.w3.org/blog/systeam/2008/02/08/w3c_s_excessive_dtd_traffic/ is DTD-related rather than schema-related. Schema-fetching is generally seen as more optional.

The important difference is that (I haven't found the exact spec reference but...) the XML 1.0 spec says that when parsing XML with validation enabled, external references to DTDs must be de-referenced. This is something that anyone learning-through-doing XML handling might not even think about if they're using a spec-compliant XML parser. Many users of such libraries have no idea that their application code is hammering w3.org with repeated HTTP requests.

Let's think about how we can help novice JSON-LD toolkit users avoid finding themselves in the same position.
Perhaps the default behaviour of a JSON-LD toolkit / parser could keep a global fetches-per-minute count, and complain to STDERR if the application is over-fetching? (alongside sensible caching defaults etc.)

Dan
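
P.S. To make the fetches-per-minute idea a bit more concrete, here is a rough Python sketch. It is only an illustration under stated assumptions, not the API of any existing JSON-LD library: the `MonitoredLoader` class, the `fetch` callable and the threshold of 30 fetches per minute are all made up for the example, and the count is kept per loader object rather than truly globally.

```python
import sys
import time
from collections import deque


class MonitoredLoader:
    """Hypothetical document loader: caches results, warns when over-fetching."""

    def __init__(self, fetch, max_per_minute=30, clock=time.monotonic, stream=None):
        self._fetch = fetch            # assumed callable: url -> parsed document
        self._max = max_per_minute     # illustrative threshold, not from any spec
        self._clock = clock            # injectable so the window can be tested
        self._stream = stream          # None means sys.stderr at warning time
        self._cache = {}               # naive in-memory cache, no expiry
        self._recent = deque()         # timestamps of real fetches, last 60 s
        self.warnings = 0

    def load(self, url):
        if url in self._cache:
            return self._cache[url]    # cache hit: no traffic, not counted

        now = self._clock()
        self._recent.append(now)
        # Forget fetches that have dropped out of the one-minute window.
        while self._recent and now - self._recent[0] >= 60:
            self._recent.popleft()

        if len(self._recent) > self._max:
            self.warnings += 1
            out = self._stream if self._stream is not None else sys.stderr
            print(
                f"warning: {len(self._recent)} remote fetches in the last "
                f"minute (limit {self._max}); check your caching",
                file=out,
            )

        doc = self._fetch(url)
        self._cache[url] = doc
        return doc
```

A toolkit could share one such loader across all parser instances so that the count is effectively global. A real implementation would presumably also honour HTTP cache headers rather than caching forever as this sketch does.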
Received on Tuesday, 7 January 2014 12:51:45 UTC