- From: David G. Durand <dgd@cs.bu.edu>
- Date: Tue, 14 Jan 1997 15:32:54 -0500
- To: w3c-sgml-wg@www10.w3.org
As I was going over the RMD stuff again, while thinking over the use of the subset to eliminate PIs, I had some thoughts that might let us simplify the RMD. I noticed that the RMD is required in certain places (i.e. you _must_ rather than _should_ declare RMD to include any attribute defaults). I think that this places an onus on the user to declare things, but does not give much benefit: If the RMD is incorrect, a parser that doesn't need a DTD (the kind the RMD is intended to aid) will silently produce incorrect results. A parser that does use a DTD (and does not need an RMD) cannot ignore it, but must check the locus of declaration of certain things so that it can issue an error message in case the RMD is incorrect.

First, I think we should certainly simplify RMD to two values: INTERNAL and ALL. If we allow multiple attribute list declarations, this will let us solve the "default value problem" in a much more natural way. The RMD could even be left out, and an explicit note made in the standard that DTD-ignoring applications will not see default attribute values declared in the DTD. It will still be possible for DTD-parsing applications to print a warning if the user desires, and this should simplify the author's life a little bit.

Now for PIs. I've thought about the PI thing a lot, since some off-list mail with Michael where he pointed out that PIs are the best way that we have of adding declarations to XML while retaining SGML compatibility. This point is completely correct, and given the PI keyword reservation discipline we have added in XML, this can be well controlled too, so I found myself at a loss as to my continuing discomfort with PIs, since I still believe that they are, at bottom, a gross hack. But being a gross hack is usually incompatible with being the best (as opposed to a workable) solution to a design problem.
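A minimal sketch of the default-value point made above about the RMD. It assumes the syntax of XML 1.0 as it was later published and Python's standard `xml.parsers.expat` binding, neither of which appears in this message; the `memo` element and its attributes are made up for illustration. Two attribute list declarations for the same element sit in the internal subset, and a parser that reads that subset reports both defaults on the element:

```python
import xml.parsers.expat

# Hypothetical document: two separate ATTLIST declarations for one element,
# both in the internal subset, each supplying a default value.
DOC = """<?xml version="1.0"?>
<!DOCTYPE memo [
  <!ELEMENT memo (#PCDATA)>
  <!ATTLIST memo status CDATA "draft">
  <!ATTLIST memo lang   CDATA "en">
]>
<memo>hello</memo>"""


def collect_attributes(document):
    """Parse the document and return {element name: attributes as reported}."""
    seen = {}

    def on_start(name, attrs):
        # attrs holds specified attributes plus any defaults the parser applied
        seen[name] = dict(attrs)

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = on_start
    parser.Parse(document, True)
    return seen


attrs = collect_attributes(DOC)["memo"]
print(attrs)
```

An application that skipped the declarations entirely would see `memo` with no attributes at all, which is the silent difference the note in the standard would have to warn about.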
But I think I have resolved the cognitive dissonance in my own mind, and I'd like to suggest that some simple wording and production changes would make things nicer. PIs in SGML are a way to extend the capabilities of SGML processors in arbitrary ways. As such they are a kind of Pandora's box that can be used to add arbitrary _non-structural_ markup to a document. As a true believer in content markup, this disturbs me.

XML needs the PI syntax, as that is the only way we can add declarations to XML and still remain SGML compatible. So, I would suggest that we split the PI into two constructs: an "extension declaration" and a "Processing Instruction". Extension declarations can occur only in the external or internal subset. I would argue that user-declarations might not even be needed, but if people want them that is OK, I guess. A processing instruction is any <? ?> sequence that occurs in the document instance. Just separating the terminology makes the fact that there are two different functions much clearer. It is even clearer that arbitrary XML extension declarations cannot be part of the document, and that PI usage by users in the document instance is a completely different animal.

So now that I've agreed that PI declarations are good, I'd like to suggest a change to the syntactic requirements (as opposed to their documentation). Let's _require_ the use of a declared notation in the document instance when using PIs. This one thing would make handling random PIs in the instance easier to conceptualize, and would make it much easier to be sure whether it's safe to ignore a PI or not. This will break old documents with TROFF commands wedged into them, but I think the additional structure would be worth it.

So my suggestions are:

A. make two productions for PI as declaration, and PI as inline instruction.

B. require inline instructions (and user declarations if we want them) to declare a notation as well.
[[[ This is more compatible with HyTime, as I understand the current stuff, and moves PI from kludge to "structured kludge." I can be happy with a structured kludge. ]]]

C. RMD should go away, with a warning that some applications will not see declarations outside the internal subset. It will no longer be permissible for an application to ignore the subset, even if it only cares about well-formedness.

D. We should allow multiple attribute declarations to ease the use of the subset for specifying default values.

[[[ I think that this simplifies XML (and removes the need to explain the legality of the odd case where RMD=ALL when there is nothing declared). ]]]

  -- David

I am not a number. I am an undefined character.
_________________________________________
David Durand              dgd@cs.bu.edu  \  david@dynamicDiagrams.com
Boston University Computer Science        \  Sr. Analyst
http://www.cs.bu.edu/students/grads/dgd/      \  Dynamic Diagrams
--------------------------------------------\  http://dynamicDiagrams.com/
MAPA: mapping for the WWW                    \__________________________
Received on Tuesday, 14 January 1997 15:25:50 UTC