- From: Brett Zamir <brettz9@yahoo.com>
- Date: Mon, 25 May 2009 13:35:57 +0800
Henri Sivonen wrote:
> On May 18, 2009, at 11:50, Brett Zamir wrote:
>
>> Henri Sivonen wrote:
>>> On May 18, 2009, at 09:36, Brett Zamir wrote:
>> Also, as far as heavy server loads for frequent DTDs, entities could
>> be deliberately not defined at a resolvable URL.
>
> There are existing XML doctypes out there with resolvable URIs, so
> you'd need a blacklist to bootstrap such a solution.
>

As you suggest on your site, "If, for legacy reasons, you must process some well-known DTDs, please make your entity resolver retrieve those DTDs from a local catalog." I would think the big browsers would be fully capable of doing this (as XML allows for by distinguishing public and system identifiers), and for any which exploded in popularity before obtaining a public identifier, I would imagine a blacklist could work.

>> The same problems of denial-of-service could exist with stylesheet
>> requests, script requests, etc.
>
> No, styles and scripts are commonly site-specific, so there isn't a
> Web-wide single point of failure whose URI gets copied around as
> boilerplate.
>

Well, again, as mentioned below, they can be of wider use, but I see your point that the effects on other sites would indeed most likely be stronger if the source site went down. I think that's a risk they should be free to take (just as when people choose to share or rely on external scripts), but if there's enough feeling against that, the issue could be addressed by requiring browsers to access only the same domain.

>> Even some sites, like Yahoo, have encouraged referring to their
>> frequently accessed external files to take advantage of caching.
>
> At least the serving infrastructure for those URIs has been designed
> for high load unlike the server for many existing DTD URIs out there.

Again, I say either let them take the risk if they actually make a likely-popular DTD available, allow a blacklist, or, if need really be, limit to the same domain.
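To make the three options above concrete (a local catalog keyed on public identifiers, a blacklist, and a same-domain restriction), here is a minimal Python sketch of how a resolver might choose between them. The catalog entry, the bundled file path, the blacklisted host, and the function names are all assumptions made up for illustration; this does not describe what any browser actually does.

```python
from urllib.parse import urljoin, urlsplit

# Hypothetical local catalog: public identifier -> DTD copy shipped with the client.
LOCAL_CATALOG = {
    "-//W3C//DTD XHTML 1.0 Strict//EN": "catalog/xhtml1-strict.dtd",
}

# Hypothetical blacklist of hosts whose DTD URLs get copied around as boilerplate.
BLACKLISTED_HOSTS = {"www.w3.org"}


def origin(url):
    """Return the (scheme, host, port) triple used for the same-origin check."""
    parts = urlsplit(url)
    port = parts.port or {"http": 80, "https": 443}.get(parts.scheme)
    return (parts.scheme, parts.hostname, port)


def resolve_dtd(document_url, public_id, system_id, same_origin_only=True):
    """Decide how to handle an external DTD reference.

    Returns ("local", path), ("fetch", absolute_url) or ("skip", reason).
    """
    # 1) A well-known public identifier is served from the local catalog.
    if public_id and public_id in LOCAL_CATALOG:
        return ("local", LOCAL_CATALOG[public_id])

    # The system identifier may be relative to the document.
    dtd_url = urljoin(document_url, system_id)

    # 2) Never hit hosts that are known single points of failure.
    if urlsplit(dtd_url).hostname in BLACKLISTED_HOSTS:
        return ("skip", "blacklisted host")

    # 3) Optionally restrict fetching to the document's own origin.
    if same_origin_only and origin(dtd_url) != origin(document_url):
        return ("skip", "cross-origin")

    return ("fetch", dtd_url)
```

With these assumptions, a document at http://example.org/a/doc.xml that references the system identifier locale/en.dtd would be allowed to fetch http://example.org/a/locale/en.dtd, while a reference to a DTD on another host would be skipped unless same_origin_only is turned off.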
> Furthermore, JS libraries have obvious functionality in existing
> browsers, so it's unlikely that authors would reference JS libraries
> as part of boilerplate without actually intending to take the perf hit
> of loading the library.
>

Presumably most XML users will be including doctypes which include a public identifier. Use of lesser-known XML dialects will probably presume some knowledge of what is happening, and even then, the official provider of the dialect will probably know not to provide their DTD directly as a referenceable DTD.

>> The spec could even insist on same-domain, though I don't see any
>> need for that.
>
> Without same-origin (as in not even performing a CORS GET), you'd need
> to blacklist at least w3.org due to existing references out there.

Sounds fine, though I am assuming w3.org references already have a PUBLIC identifier for their DTDs.

> (Note that for security, same-origin/CORS is must-have anyway.)
>

A must-have if you don't trust the origin, yes. But plenty of sites include scripts from other sites for ads or analysis. It would not be such a big loss in the case of DTDs to restrict to same domain, however.

>> I also disagree with throwing our hands up in the air about character
>> entities (or thinking that the (English-based) HTML ones are
>> sufficient).
>
> That's a text input method issue that needs to be solved on the
> authoring side for text input of all kind--not just text input for
> writing XML in a text editor.
>

So, what's wrong with doing it in XML? If you're saying that text editors need to better support Unicode, then sure, but that's not a complete solution, given the cumbersomeness of finding obscure characters, etc., which can more simply be defined once in a DTD and forgotten. It's a nice feature for a text format which can be created across a variety of editors.
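As a small illustration of "defined once in a DTD and forgotten", the Python sketch below declares two character entities and then uses them by name in a document. The entity names (alef, thorn) and the helper function are made up for this example, and the declarations are placed in an internal DTD subset only so that the snippet is self-contained; in the scenario discussed here they would live in an external DTD file.

```python
import xml.etree.ElementTree as ET

# Hypothetical entity declarations. In the scenario discussed above these
# would sit in an external DTD; they are inlined as an internal subset here.
ENTITY_DECLS = """
<!ENTITY alef "&#x5D0;">
<!ENTITY thorn "&#xFE;">
"""


def parse_with_entities(body):
    """Parse an XML body after prepending a DOCTYPE that declares the entities."""
    doc = "<!DOCTYPE doc [" + ENTITY_DECLS + "]>\n" + body
    return ET.fromstring(doc)


root = parse_with_entities("<doc>&alef; and &thorn;</doc>")
# The author typed only ASCII names; the parsed text contains the
# Hebrew letter alef (U+05D0) and the Latin small letter thorn (U+00FE).
print(root.text)
```

The author never has to hunt for the characters themselves, and the same ASCII-only source can be edited in any text editor.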
>> Moreover, the browser with the largest market share offers such
>> support already, and those who depend on it may already view other
>> browsers not supporting the standard as "broken".
>
> IE doesn't support XHTML or SVG which are the popular XML formats one
> might want to load into a browsing context.
>

Again, if there is an offline use, there is a browsing use. Just because not everyone is rushing to use XML in this way does not mean that a lot of people would not like to share especially their document-centric XML in such a fashion (and even data-centric XML). Yes, a Firefox/Opera/Safari user who tries XHTML in IE will find it "broken", while a user of Firefox, etc. visiting an XML file dependent on an external DTD will find it broken. Firefox/Opera/Safari should be free to offer this positive feature to their users, even if IE doesn't come on board (to their eventual detriment, I would think), while I would hope Firefox et al. would implement this one feature on top of their already existing support for showing XML as a tree.

As I said, IE is offering functionality which other browser users will think is broken in their browser--I think that is due to these browsers not having gone far enough, rather than IE having gone too far; just because the spec technically makes it optional doesn't mean entity resolution for at least same-domain, system-only-identified DTDs shouldn't become the de facto standard, given the features it offers.

>>> Loading same-origin DTDs for the purpose of localization is a
>>> semi-defensible case, but it's a lot of complexity for a use case
>>> that is way on the wrong side of 80/20 on the Web scale.
>> How so?
>
> Localized sites are a minority on the Web, and chances that localized
> Web apps would switch to a client-side localization method that relies
> on server-side negotiation of the localization and requires XML to
> work seem dim.
>

Maybe, but it is also very easy to use.
I would hope browsers (and the specs guiding their collective behavior) could consider the convenience for document authors. Firefox developers, for example, are well familiar with such DTDs and some are eager to use them for remote XUL. I've seen an increasing number of .xhtml extension documents already out in the wild, despite a lack of support in IE, and despite such a change (without customizable external DTDs) offering arguably fewer benefits to the document creator than easy localization (though XHTML could also benefit from such DTD localization).

>> Even if it is a niche group which uses TEI, Docbook, etc. or who
>> wants to be able to build say a browser extension which can take
>> advantage of their rich semantics, this is still a use for citizens
>> of the web.
>
> If you need a browser extension for content, you shut out users of
> browsers that don't have the particular extension available. It's like
> using Flash.
>

While I agree that having to use an extension would limit the usefulness (that's why I'm so passionate about seeing browsers implement it), I'm talking about extensions that build interesting optional interfaces to that content--for example to perform an XQuery on the content (I've made a Firefox extension which does this) or to give a simple interface allowing users to highlight content or search only within special semantic tags (e.g., <date/>, <said/>, <bibl/>, etc. tags in TEI). But I very much agree that browsers should all implement the basic infrastructure: 1) XML tree for non-formatted XML, 2) CSS rendering of pure XML, 3) External DTD support, 4) Recognition of dialects like XHTML within larger XML fragments, and they're already almost there.

Beyond this being about open technologies, it is also about being able to innovate. Even Flash can be supplanted over time by open standards, not to mention specialized languages with a much smaller audience.
But using TEI isn't really going to break anything as long as you can at least load and view the document. Yes, there is a concern of babelization of semantics, but that is only a concern for document authors, and again I don't think XHTML can or should fill all semantic markup needs.

>> If people can push forward with backwards-incompatible technologies
>> like the video element, 3d-animation, or whatever, it seems not much
>> to ask to support the humble external entity file... :)
>
> The upside of video and 3D is much more significant than the upside of
> supporting external DTDs.
>

So animation is more important than Shakespeare? A lot of classical literature is richly encoded in XML languages like TEI. No doubt the readers of Shakespeare are fewer than the users of video and 3D, but I don't think that means they are less important, especially when the implementation must, I would imagine, be quite a bit easier as well.

>>> Besides, if the use case for DTDs is localization within an origin,
>>> the server can perform the XML parse and reserialize into DTDless
>>> XML. (That's how I've implemented this pattern in the past without
>>> client-side support.)
>>>
>> That is assuming people are aware of scripting and have access to
>> such resources.
>
> Localization with DTDs but without scripting is already tricky, since
> one would need to tweak conneg.

Sorry, I'm not sure what you mean here. JavaScript scripting could support cases of dynamic localization if DOM methods like document.createEntityReference() were implemented along with external DTD support.

>
>> Wasn't it one of the aims of the likes of XSL, XQuery, and XForms to
>> use a syntax which doesn't require knowledge of an unrelated
>> scripting language (and those are pretty complex examples unlike
>> entities)?
>
> Web browsers don't support XSL-FO, XQuery or XForms.

I for one hope they will. There seems to be a fair amount of interest in XForms at the very least.
But my point is that it seems to be a W3C goal (and a good one) to make technologies which avoid a need for specialized scripting knowledge or services.

> (XSLT support isn't something that can be generalized to feature
> triage policy applicable to new features today.)
>

Sorry, I don't follow.

>> (Btw, you and I discussed this before, though I didn't get a response
>> from you to my last post:
>> https://bugzilla.mozilla.org/show_bug.cgi?id=22942#c109 ; I don't
>> mean to go off-topic but you might wish to consider or respond to
>> some of its points as well...)
>
> Oh. I didn't make the connection. I didn't reply there, because using
> Bugzilla as a discussion forum--particularly when the discussion turns
> to advocacy--is frowned upon.

I thought we were addressing rationales related to the legitimacy of implementing the bug, but all right.

> Are there some particular points that I haven't addressed here that
> you'd like to re-raise?
>

I think we're mostly rehashing it anyways. :)

best wishes,
Brett
Received on Sunday, 24 May 2009 22:35:57 UTC