- From: Michael Sperberg-McQueen <U35395@UICVM.UIC.EDU>
- Date: Tue, 10 Dec 96 18:26:22 CST
- To: W3C SGML Working Group <w3c-sgml-wg@w3.org>
On Tue, 10 Dec 1996 19:23:25 -0500 Terry Allen said: >As the XML spec says in relevant part: > > Users may extend the ISO 10646 character repertoire, in the rare cases > where this is necessary, by making use of the private use areas. > > ..., and as the Unicode Standard 2.0 says flatly that it defines no >way to do so, I think that ... XML must either define a way to use >the private use area(s?), or say that a future version of the spec >will do so, or say that users must make private agreements to do so. >Or rule out use of the private use area for the Internet (which is >effectively the case now). When the ERB decided not to include SDATA entities in XML 1.0, it placed the topic of non-Unicode characters, glyph identification, and documenting (or making) private agreements for character-set handling on the list of problems to be addressed in future revisions. A posting of 19 October 1996 put it this way: Techniques for dealing with non-Unicode characters, specification of glyphs rather than characters, and related topics (such as possible mechanisms for document private agreements governing the ISO 10646 Private Use Areas) will be addressed in future revisions. I think Jon meant simply that a fully worked out proposal is not our prime business, and could usefully be addressed elsewhere. I hope the SGML Open group Lee Quin is heading will give us something good to work from or adopt. The TEI is another place where discussions of this type might find a home, in the guise of discussing whether and how to revise the TEI Writing System Declaration. I would have no objection to adding a sentence to the spec to observe that using the private use area successfully requires out-of-band agreements between sender and recipient of files, or between user and software. But I thought that was pretty much clear from the start. I also have no objection to publishing, as a separate document, the list of topics we said we want to come back to in version 1.n or 2.0, though I don't think time-dependent information like that belongs in a specification. >And if it is truly contemplated that the private use area (rather than >SDATA entities) are to be used for the purpose under discussion, doesn't >the EBNF need to reflect that? I believe it does -- unless you mean that you think the characters in the private-use area should be allowed in generic identifiers. The EBNF doesn't reflect that, because I believe there is some consensus that private-use characters should be data, not markup. -C. M. Sperberg-McQueen
Received on Tuesday, 10 December 1996 19:47:20 UTC