- From: Rick Jelliffe <ricko@allette.com.au>
- Date: Mon, 07 Apr 1997 15:37:49 +1000
- To: Peter Flynn <pflynn@curia.ucc.ie>
- CC: w3c-sgml-wg@w3.org
Peter Flynn wrote:
>
> At 09:24 04/04/97 -0800, Tim Bray wrote:
> [...]
> >in lots of different ways. But the characters must all be unicode-defined
> >characters. A character reference &#30270; is a number, and that number
> >is *always* a unicode/10646 number.
>
> Did we get rid of the &#u-HHHH; references? What happened to &#DDDD;
> (or did I miss it)?
>
> ///Peter

There never was a &#u-HHHH; form, as far as I know. Two forms were
suggested:

 * an entity reference (from the SPREAD public entity set),
   e.g. &U-HHHH;
 * a hex numeric character reference (from Gavin's suggestion),
   e.g. &#xHHHH;

Because XML uses ISO 10646 (regardless of the transmission character set
or the encodings used on the route from server to browser), there is no
need for an entity reference system: it would only duplicate the numeric
character references.

(But a document that uses numeric character references for everything
above U+00FF, and uses only ISO 8859-1 characters for markup, can still
be processed on an 8-bit SGML system by preprocessing the hex numeric
character references into the SPREAD entity references, e.g.

    sed 's/&#[xX]/\&U-/g' infile > outfile

I suppose.)

A different approach is to make the start delimiter of the hex numeric
character reference "&U-".

Rick Jelliffe
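
The preprocessing step suggested above can also be sketched in Python. This is only an illustration, not part of any tool mentioned in the thread: the function name ncr_to_spread is hypothetical, and it assumes that a SPREAD entity name is simply "U-" followed by the same hex digits that appear in the numeric character reference. Unlike the one-line sed substitution, it rewrites a reference only when it is complete (hex digits followed by a semicolon).

```python
import re

# A complete hex numeric character reference, e.g. &#x763E; (x or X accepted).
HEX_NCR = re.compile(r"&#[xX]([0-9A-Fa-f]+);")


def ncr_to_spread(text: str) -> str:
    """Rewrite each &#xHHHH; as &U-HHHH;, leaving everything else untouched.

    Assumption: the SPREAD entity for a character is named U-HHHH, using
    the same hex digits as the reference; the digits are copied verbatim.
    """
    return HEX_NCR.sub(lambda m: "&U-" + m.group(1) + ";", text)


if __name__ == "__main__":
    print(ncr_to_spread("A character: &#x763E; and a decimal one: &#30270;"))
```

Decimal references such as &#30270; are deliberately left alone, since the suggestion above covers only the hex form.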
Received on Tuesday, 8 April 1997 13:43:31 UTC