RE: Fwd: I18N Last call comments on Schema Part 2

From: Larry Masinter <masinter@attlabs.att.com>
Date: Fri, 14 Jul 2000 18:45:36 -0700
To: "C. M. Sperberg-McQueen" <cmsmcq@acm.org>, "Martin J. Duerst" <duerst@w3.org>, <www-xml-schema-comments@w3.org>
Message-ID: <NDBBKEBDLFENBJCGFOIJIEEPDBAA.masinter@attlabs.att.com>
Re coding systems for strings:

I don't understand your lack of sympathy. Every 'string' in every
programming language I know potentially can contain control characters
that are excluded from XML. (Actually, I can think of *one* programming
language, MOO, that at one time didn't allow control characters in
strings.) Many database systems are set up so that most strings allow
control characters in them that are excluded from XML.

What a schema language *could* do would be to invent or select a
public convention for including such characters. For example, 
RFCs 2047 and 2231 include ugly encodings that would work for this
purpose. Perhaps you can do better, but it seems unreasonable to
do nothing. The point is that a program should be able to RELIABLY
translate from internal strings to XML data, without having to raise
exceptions about 'unencodable string character'.
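
As a rough illustration of the kind of convention meant here, the
following Python sketch escapes every character outside the XML 1.0
Char production so that any internal string round-trips. The '%XXXX'
escape syntax and the function names are invented for this example;
they are not taken from RFC 2047, RFC 2231, or any schema draft.

```python
import re

ESCAPE = "%"  # hypothetical escape character, chosen for this sketch only


def is_xml_char(cp: int) -> bool:
    """True if code point cp matches the XML 1.0 Char production."""
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def encode_for_xml(s: str) -> str:
    """Replace excluded characters (and the escape character itself)
    with ESCAPE followed by four uppercase hex digits."""
    out = []
    for ch in s:
        cp = ord(ch)
        if ch == ESCAPE or not is_xml_char(cp):
            # Every excluded code point is <= 0xFFFF, so four digits suffice.
            out.append("%s%04X" % (ESCAPE, cp))
        else:
            out.append(ch)
    return "".join(out)


_ESCAPED = re.compile(re.escape(ESCAPE) + r"([0-9A-F]{4})")


def decode_from_xml(s: str) -> str:
    """Inverse of encode_for_xml."""
    return _ESCAPED.sub(lambda m: chr(int(m.group(1), 16)), s)


if __name__ == "__main__":
    internal = "a\x00b\x1f%c"
    wire = encode_for_xml(internal)
    print(wire)
    print(decode_from_xml(wire) == internal)
```

This only deals with characters that XML excludes outright; markup
characters such as '<' and '&' would still be handled by the usual
XML escaping when the value is serialized.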
