- From: Gavin Nicol <gtn@ebt.com>
- Date: Tue, 10 Sep 1996 14:45:41 GMT
- To: tbray@textuality.com
- CC: w3c-sgml-wg@w3.org
>1. Document *data* is (mostly) for people to read, and thus of course
> has to support the languages they write in. Document *markup* is
> (mostly) for computer programs to read, plus the occasional unfortunate
> document designer. Given that these things are already monocased,
> and by industry habit that I doubt XML will break, short, it's not
> clear that expressing GI's & attribute names in Cyrillic or Chinese is all
> that important to the market.

For document designers, my experience has been that about 50% of the
Japanese people I talk to wish for Japanese markup. The people who are
happy with ASCII markup usually feel that it is better for
interoperability. However, the people who want native-language markup
usually cite usability as the prime reason: it is much more
understandable to have "bunsho" in a Japanese document, and in
stylesheets it becomes even more desirable, they say.

For Japanese, it is not an overly large problem, because there is a
phonetic spelling of Japanese that uses ASCII (romaji), but for other
languages, ASCII phonetics as markup don't win.

>2. Supporting bigger & more complex encodings in markup brings the benefit
> of making life easier & friendlier for document designers who want to
> use them. Restricting the markup character set down to 7 bits brings
> the benefit of making it quicker & easier to generate software that
> processes such markup. If I didn't already think that the second
> of these two incompatible benefits was more important, I wouldn't
> be working on XML.

This is a fallacy. If you are going to support native-language content,
you will have to have some way of decoding the octet stream in order to
parse the document correctly (otherwise you run into problems with bits
of character codes that could be mistaken for markup). If you have a
decoding module on the stream (or a bit combination transformation
filter), you will also be able to support native-language markup.
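
The point about octets being mistaken for markup can be illustrated with a minimal Python sketch. It assumes ISO-2022-JP as the content encoding (the message does not name one), and the function names are hypothetical, chosen only for this illustration. It looks for a kanji whose encoded form contains the octet 0x3C, the ASCII code for "<", and then compares a scan of the raw octets with a scan of the decoded characters.

```python
# Illustration only: ISO-2022-JP is an assumed encoding, and all
# function names here are hypothetical.

def find_ambiguous_char(encoding="iso2022_jp"):
    """Return the first kanji whose encoded octets include 0x3C ('<')."""
    for code_point in range(0x4E00, 0xA000):
        ch = chr(code_point)
        try:
            encoded = ch.encode(encoding)
        except UnicodeEncodeError:
            continue  # character is not in this encoding's repertoire
        if b"<" in encoded:
            return ch
    raise LookupError("no ambiguous character found")


def count_markup_bytes(raw):
    """Naive scan: count every 0x3C octet as a markup delimiter."""
    return raw.count(b"<")


def count_markup_chars(raw, encoding="iso2022_jp"):
    """Decode first, then count real '<' characters."""
    return raw.decode(encoding).count("<")


if __name__ == "__main__":
    ch = find_ambiguous_char()
    raw = f"<p>{ch}</p>".encode("iso2022_jp")
    print("octet scan sees '<' this many times:", count_markup_bytes(raw))
    print("decoded scan sees '<' this many times:", count_markup_chars(raw))
```

The document contains two real "<" delimiters, which is what the decoded scan reports; the octet scan reports more, because part of the kanji's code looks like a delimiter. Once the decoding step exists, the same decoded character stream could in principle carry non-ASCII element names as well, which is the argument made above.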
Received on Tuesday, 10 September 1996 10:46:45 UTC