- From: Rick Jelliffe <ricko@allette.com.au>
- Date: Sun, 26 Jan 1997 18:41:36 +1100
- To: Dave Peterson <davep@acm.org>
- CC: W3C SGML Working Group <w3c-sgml-wg@www10.w3.org>
Dave Peterson wrote: > The interesting question is: How do you deal with a legal SGML character > that your system has no internal representation for? I haven't thought > that one through. In some cases, I suspect your entity manager could > convert the character as stored in the storage representation into a > numeric character reference. But that can break down if the character > is used in markup. Hmmm.... ;-) That was the question that first drove the CJK DOCP's ERCS in 1995. Our solution: * have a public entity set containing all the characters from all registered national character sets (helpfully, this was already collected in ISO 10646) to allow full interconvertability of characters and entity references (i.e. the SPREAD public entity set), and therefore document-character-set -independence; * make recommendation about which characters should be used in native-language markup (i.e. the ERCS "greatest common denominator" rule, that markup characters only use the intersection set of the characters sets of all the systems the document will pass through: in Japan, this would be limited to the characters in Shift JIS, for example: JIS X 208 but not JIS 212, which still could be used in data); * make it explicit in the Extended Naming Rules (ENR) that there are 8 bit systems and there are 16 bit systems, and a 8-bit systems do not need to handle large-repertoire markup. The new public identifier in the SGML and system declaration (the one with "(ENR)" appended, see ISO 8879 Annex J) is there so that people won't use legal SGML characters in markup that a system mightn't have an internal representation for: at least as far as it applies to 8 bit ans 16 bit systems. -- Regards Rick Jelliffe email: ricko@allette.com.au _______________________________________________________________ Allette Systems (Australia) email: info@allette.com.au Level 10, 91 York Street www: http://www.allette.com.au Sydney 2000 NSW Australia phone: +61 2 262 4777 fax: +61 2 262 4774 _______________________________________________________________
Received on Sunday, 26 January 1997 02:38:22 UTC