- From: Joe English <jenglish@crl.com>
- Date: Fri, 28 Mar 1997 17:11:00 -0800
- To: W3C SGML Working Group <w3c-sgml-wg@w3.org>
Michael Sperberg-McQueen <U35395@UICVM.UIC.EDU> wrote: > Alternate character encodings [...] are a good example. > Note which of the following three positions we took: > > 1 fixed encoding: everything in UTF-8 (alternate form: everything > in UCS-2) > 2 anarchy: no requirements, processors do what they like > 3 required minimum which all processors must support; nothing to > prevent them from supporting more. That is a good example. Along those lines, I believe that the required minimum support for PUBLIC identifiers should be: "XML processors must supply to the application the PUBLIC ID of any entities for which one is specified (including the DTD entity)." That's it. No attempt at resolution is required, although processors are of course free to do so if they like. Remember that with <?XML RMD=NONE?> and <?XML RMD=INTERNAL?> there is no need to retrieve the DTD external subset. That's why "no resolution mechanism at all" is perfectly acceptable as a minimum requirement: it would let people put PUBLIC IDs in <!DOCTYPE...> declarations, which is where they are most useful. > If I know that *the processor(s) > I care about* support the encoding I like, I can use it. If I > care about having *all* processors be able to read my docs, I > know what to do, and I can ensure that all processors can handle > my docs. The same holds for PUBLIC IDs. If I want to ensure that all processors can handle my documents, I'll use RMD=NONE or a SYSTEM ID. If I know that the particular processors I care about can resolve the public identifier -- either because they support automatic FPI resolution or because they have a built-in knowledge of the FPI in question, for example a special-purpose CML browser -- that's OK too. > I have a question for the WG members who want PUBLIC to be syntactically > legal, but don't want to specify even a non-exclusive minimum > resolution method. Serious question, not sarcastic (despite the > tone). Why bother? It would allow XML users to supply a name for their document types, instead of just an address where a copy of the formal part of the DTD may be found. The former is useful; with RMD=NONE, the latter is not. Why is an FPI useful? Well, it would let (say) a special-purpose CML or CDF browser know that a document is, in fact, marked up as "Chemical Markup Language" or "Channel Data Format" and not some other DTD that happens to have the same initials. It would allow browsers to recognize that the stylesheet for "-//Davenport//DTD DocBook XMLified//EN" that they downloaded from the O'Reilly website will also work with DocBook XML documents on Sun's website. It would let people move a collection of XML documents from one host to another without having to change all the <!DOCTYPE...> declarations (which would be especially annoying knowing that most browsers are never going to use that SYSTEM ID anyway). Allowing PUBLIC IDs for all entities, not just DTDs, would make it possible to mirror and maintain large collections of XML documents by updating SOCAT files. True, such collections would only be usable by XML processors that have SOCAT support, but for publishers like Sun and the Linux Documentation Project that tradeoff is probably worth it. > If PUBLIC is not allowed, I agree life will be tough for you. You'll > have lots of non-conforming data that would be conforming if only PUBLIC > were legal. You'll have to write your own software to support XML. On > the plus side, you'll be able to write in whatever resolution mechanism > you like, including the ever popular sequential-search-through-the-Web > and the shell-out-and-search-altavista-for-the-FPI method. On the minus > side, you'll have to do it yourself. So I agree, life will be hard, > though perhaps still worth living. > > What I don't understand is this. How would this picture change if > PUBLIC were legal in XML, without a standard resolution mechanism? For starters, all our non-conforming data that uses PUBLIC would be conforming :-) Implementors will have to do a tiny bit more work (to parse and pass on PUBLIC IDs). Those who feel that PUBLIC IDs are worthless without a resolution mechanism can still use SYSTEM. Those of us who use PUBLIC IDs will not be forced to remove important information from our documents in order to make them XML-conformant. Who loses here? --Joe English jenglish@crl.com
Received on Friday, 28 March 1997 20:14:09 UTC