Re: B.1 and B.2 results from Gavin Nicol on 1996-10-22 (w3c-sgml-wg@w3.org from October 1996)

From: Gavin Nicol <gtn@ebt.com>
Date: Tue, 22 Oct 1996 17:53:12 -0400
To: dgd@cs.bu.edu
CC: U35395@UICVM.UIC.EDU, w3c-sgml-wg@w3.org
Message-Id: <199610222153.RAA02839@nathaniel.ebt>

>You cannot recognize the PI, _without having a list of the magic numbers
>for legal PI definitions_. If a user attempts to use a PI that does not
>exactly match one of the "the magic number formulas," then the processor
>may not even be able to recognize that a PI was present. So the apparent
>_self-descriptive_ aspect of the data is _not_ there. 

Thank you, David. This is a point I have felt, but have been unable to
articulate.
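
To make it concrete, here is a rough sketch of what "recognizing" the
PI amounts to. Everything in it is my own assumption for illustration
(the table of labels, the function names, the choice of Python codecs,
and the "EBCDIC-US" label); none of it comes from any draft. The
processor can only compare the leading bytes against strings it has
precomputed:

```python
# Hypothetical sketch: the "PI" is matched as an opaque byte string.
DECL_TEMPLATE = '<?XML-ENCODING "{}">'

# Assumed mapping from label to a Python codec, for illustration only.
KNOWN_LABELS = {
    "UTF-8": "utf-8",
    "SHIFT-JIS": "shift_jis",
    "UCS2": "utf-16-be",
}

def build_magic_table(known=KNOWN_LABELS):
    """Precompute the exact bytes of each legal declaration."""
    table = {}
    for label, codec in known.items():
        table[DECL_TEMPLATE.format(label).encode(codec)] = label
    return table

def sniff_encoding(data, table=None):
    """Return the label whose magic bytes start the data, else None."""
    if table is None:
        table = build_magic_table()
    for magic, label in table.items():
        if data.startswith(magic):
            return label
    return None  # not in the table: we cannot even tell a PI was there
```

A declaration for a label outside the table, written in an encoding
that is not ASCII-compatible, simply fails to match anything, which is
David's point: the data is not self-describing.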

>This is true only for all the character sets that _we precode into XML_. It
>does not work for any new character set names. The PI looks like it has a
>parameter, but in fact the PI, and its parameter, constitute a magic string
>of bytes with no internal structure. This is a bit counterintuitive.

As is explaining to people that you can do:

   <?XML-ENCODING "SHIFT-JIS">
   .....

but not

  <?XML-ENCODING "SHIFT-JIS">
  ....
  <?XML-ENCODING "UCS2">
  ....
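
A small sketch of why the second case fails, under my own assumptions
(the function name, the one-entry table, and the use of Python's
shift_jis and utf-16-be codecs are all hypothetical): once the leading
label has picked a decoder, everything after it is run through that
decoder, so a later label in another encoding is never seen as a label.

```python
# Hypothetical sketch: the leading label selects one decoder for the
# whole file; nothing later in the stream can change it.
LEADING_LABELS = {b'<?XML-ENCODING "SHIFT-JIS">': "shift_jis"}

def decode_with_leading_label(data):
    for magic, codec in LEADING_LABELS.items():
        if data.startswith(magic):
            # errors="replace" keeps the sketch from raising on bytes
            # that are not valid in the selected encoding.
            return data.decode(codec, errors="replace")
    raise ValueError("no recognized label at start of data")

first = '<?XML-ENCODING "SHIFT-JIS">abc'.encode("shift_jis")
second = '<?XML-ENCODING "UCS2">def'.encode("utf-16-be")
text = decode_with_leading_label(first + second)
```

Here text still begins with the first label, but the second label does
not appear in it as a recognizable string, because its bytes were
interpreted as Shift-JIS.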

>I do not advocate losing the notion. But if it gets intolerable enough,
>maybe we can do the right thing after all!

I agree with the notion of keeping "in-file" labels, but simply
cannot accept "in-data" labelling, or anything that pretends to
be like it.

Received on Tuesday, 22 October 1996 17:55:04 UTC