- From: Albert Lunde <Albert-Lunde@nwu.edu>
- Date: Sat, 7 Dec 1996 00:11:49 -0600 (CST)
- To: www-international@w3.org
> Say, I want to have a text which will contain the following fragment
> on one line:
> ----------------------------------------------------------
> .. The German word ko"se becomes froma`ge in French, but sy'r
> in Czech yet ... in ....
> ---------------------------------------------------------------------
>
> With current system of static character sets I would need a charset
> which combines all
> Latin-1 and Latin-2 and ...
>
> But, if you reserve ONE special character or tag or even just an
> attribute for this, I can write this one line like this:

You can do something functionally equivalent by using HTML as
specified with HTML 2.0 + the HTML internationalization spec.

If you want to send the whole thing in US-ASCII, use numeric
character references (which refer to ISO-10646), or if you don't
like that, use the UTF8 encoding of ISO-10646.

To solve the problem of giving font hints to software use the
LANG attribute.

ISO-10646 _is_ a character set which includes all the
glyphs/characters of Latin-1, Latin-2, JIS, etc.

But, it's my reading of the HTML specs that it's possible to follow
the internationalization spec without providing graphic
representations of every character in ISO-10646 ... just keep a
mapping to/from ISO-10646 for the glyphs that you have in the fonts
available.

I quote:

"With the document character set being the full ISO 10646, the
possibility that a character cannot be displayed due to lack of
appropriate resources (fonts) cannot be avoided. Because there are
many different things that can be done in such a case, this document
does not prescribe any specific behaviour"

... (it offers suggestions)

Operating in this mode will do the things you want without
reinventing the wheel, and will scale upward better.

(Unfortunately, it's not widely supported, yet, which is what
started this thread.)

--
Albert Lunde Albert-Lunde@nwu.edu
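As a rough illustration of the two transport options mentioned above
(numeric character references in a US-ASCII document, or UTF-8), here is
a minimal Python sketch. It is not part of the original message: the
function name to_ascii_with_ncrs and the sample string are hypothetical,
and the sample simply mixes one letter found in Latin-1 (U+00F6) with
one found in Latin-2 but not Latin-1 (U+010D).

```python
def to_ascii_with_ncrs(text: str) -> str:
    """Replace each non-ASCII character with a decimal numeric
    character reference (&#N;), where N is the character's
    ISO 10646 / Unicode code point."""
    return "".join(ch if ord(ch) < 128 else f"&#{ord(ch)};" for ch in text)


# Hypothetical sample: o-diaeresis (U+00F6) and c-caron (U+010D).
sample = "\u00f6 and \u010d"

# Option 1: pure US-ASCII with numeric character references.
ascii_form = to_ascii_with_ncrs(sample)

# Option 2: the same text sent as UTF-8 bytes.
utf8_form = sample.encode("utf-8")

print(ascii_form)
print(utf8_form)
```

Python's built-in sample.encode("ascii", "xmlcharrefreplace") produces
the same decimal references as bytes, so the hand-written function is
only there to make the mapping explicit.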
Received on Saturday, 7 December 1996 01:11:37 UTC