Re: SDATA, again

Michael writes:
| When the ERB decided not to include SDATA entities in XML 1.0, it
| placed the topic of non-Unicode characters, glyph identification,
| and documenting (or making) private agreements for character-set
| handling on the list of problems to be addressed in future revisions.

Fair enough.

| I think Jon meant simply that a fully worked out proposal is not our
| prime business, and could usefully be addressed elsewhere.  I hope

Perhaps, but what he said is that it's ANSI's business, and the
existing Unicode spec does not inspire confidence that they consider
it so.

| the SGML Open group Lee Quin is heading will give us something good
| to work from or adopt.  The TEI is another place where discussions of
| this type might find a home, in the guise of discussing whether and
| how to revise the TEI Writing System Declaration.

Okay.

| I would have no objection to adding a sentence to the spec to observe
| that using the private use area successfully requires out-of-band
| agreements between sender and recipient of files, or between user and
| software.  But I thought that was pretty much clear from the start.

Only if you have read the Unicode spec; the (ahem) naive reader of
XML will assume that some mechanism already exists.

| I also have no objection to publishing, as a separate document, the list
| of topics we said we want to come back to in version 1.n or 2.0, though
| I don't think time-dependent information like that belongs in a
| specification.

Surely a list of what is deferred to future consideration is appropriate?
It's not properly time-dependent but version-dependent (except in
the sense that all things are time-dependent).  But so far as I
can see, this is the only point on which I'd like such a comment.

| >And if it is truly contemplated that the private use area (rather than
| >SDATA entities) is to be used for the purpose under discussion, doesn't
| >the EBNF need to reflect that?
| 
| I believe it does -- unless you mean that you think the characters in
| the private-use area should be allowed in generic identifiers.  The

Goodness, no.

| EBNF doesn't reflect that, because I believe there is some consensus
| that private-use characters should be data, not markup.

I'm tempted to ask, which private-use characters?  But I won't.


Regards,
    Terry Allen    Fujitsu Software Corp.    tallen@fsc.fujitsu.com
"In going on with these experiments, how many pretty systems do we build,
 which we soon find ourselves obliged to destroy?" - Benjamin Franklin
  A Davenport Group Sponsor:  http://www.ora.com/davenport/index.html

Received on Tuesday, 10 December 1996 20:07:04 UTC