Re: Element Structure for XML (Clause 7)

To: w3c-sgml-wg@w3.org
Subject: Re: Element Structure for XML (Clause 7)
From: paul@arbortext.com (Paul Grosso)
Date: Wed, 11 Sep 96 16:04:36 CDT
From w3c-sgml-wg-request@www10.w3.org Wed Sep 11 17:14:45 1996


> To: w3c-sgml-wg@w3.org
> From: Martin Bryan <mtbryan@sgml.u-net.com>
> Subject: Element Structure for XML (Clause 7)
> Date: 	Tue, 10 Sep 1996 18:48:27 +0100
> 
> . . .
>
> 7.6 (4th bullet)
> PIs don't make sense on a network. Their use should be restricted to the
> prolog, where they could be useful in setting up the parser (e.g. to
> identify a relevant SGML declaration). The thought of allowing a processing
> instruction within a document instance that is going to be displayable by
> many different plug-ins scares the hell out of me.

I don't understand the anti-PI sentiment, and I disagree with it.

What is so scary about PIs?  They're easy to recognize as PIs and easy to
parse.  And they're as simple as comments to handle:  ignore them (unless
you recognize them as being something you can and want to handle).
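
To make "ignore them unless you recognize them" concrete, here is a minimal
sketch.  It assumes Python's standard xml.sax module and the <?target data?>
form of a PI, neither of which comes from this discussion; the target name
"x-cursor" and the helper parse_with_pis are hypothetical, chosen only for
illustration.

```python
import xml.sax
from xml.sax.handler import ContentHandler


class PIAwareHandler(ContentHandler):
    """Collects character data and dispatches only the PIs it recognizes."""

    def __init__(self, known_targets):
        super().__init__()
        # Maps a PI target name to a callable taking the PI data string.
        # The target names are hypothetical, not any real tool's PIs.
        self.known_targets = known_targets
        self.text = []
        self.handled = []

    def characters(self, content):
        self.text.append(content)

    def processingInstruction(self, target, data):
        handler = self.known_targets.get(target)
        if handler is not None:
            self.handled.append(handler(data))
        # Any other PI is simply skipped, exactly like a comment.


def parse_with_pis(doc, known_targets):
    """Return (document text, results from recognized PIs)."""
    handler = PIAwareHandler(known_targets)
    xml.sax.parseString(doc.encode("utf-8"), handler)
    return "".join(handler.text), handler.handled
```

A tool that knows only "x-cursor" would pass {"x-cursor": some_function} and
never notice PIs written by anyone else.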

PIs are the only good way to encode information in the SGML document that
was not anticipated when the DTD was written, and there are lots of good
reasons why this might be important.

For example, Adept uses PIs to:

1.  remember the position of the cursor when you last saved the file;
2.  remember the "detail status" (whether this "division type" element
    was set to view as "collapsed" or "expanded" last time the file was
    saved) of elements;
3.  insert certain instance-specific formatting information such as one-off
    inserted page breaks.

Note that there is nothing scary about ignoring any of these PIs, but if
such PIs are forbidden in XML, then there is no good way to encode this
information in the document.  

Let's remember that 95% of the time, a given document instance is going
to be read by the same tool that last wrote it.  Sure, interchange is
important, but allowing for user-friendly tools is too, and if we throw
out the only safe, unscary way to support tool-specific amenities, we
are really dooming ourselves to lowest common denominator standards of
usability.

Let's agree that the result of parsing must be unchanged if all PIs
within the document are ignored, but let's not forbid the existence of
PIs in document instances.
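
One way to state that requirement as a check, sketched under assumptions:
parse a document while ignoring its PIs, parse the same text with the PIs
deleted, and compare the element and text events.  This uses Python's
xml.sax and the <?target data?> PI form, which are my choices for
illustration; the regular expression is a rough stand-in that would also hit
PI-like text inside comments or CDATA sections, and all names here are
hypothetical.

```python
import re
import xml.sax
from xml.sax.handler import ContentHandler


class EventRecorder(ContentHandler):
    """Records start/end/text events; PIs are ignored (base class no-op)."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(("start", name, sorted(attrs.items())))

    def endElement(self, name):
        self.events.append(("end", name))

    def characters(self, content):
        # Merge adjacent text chunks so a PI in mid-text leaves no trace.
        if self.events and self.events[-1][0] == "text":
            self.events[-1] = ("text", self.events[-1][1] + content)
        else:
            self.events.append(("text", content))


def events_ignoring_pis(doc):
    recorder = EventRecorder()
    xml.sax.parseString(doc.encode("utf-8"), recorder)
    return recorder.events


# Rough pattern for a PI; an assumption for this sketch, not a full parser.
PI_PATTERN = re.compile(r"<\?.*?\?>", re.DOTALL)


def strip_pis(doc):
    return PI_PATTERN.sub("", doc)
```

If events_ignoring_pis(doc) equals events_ignoring_pis(strip_pis(doc)), then
the PIs in doc carry nothing that a PI-blind reader depends on.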

> 
> . . .
> 
> 8(c) 
> If PIs are retained then PIC should be changed so that it is not > as this
> is required for many processing instructions. (An alternative may be to
> allow a character reference to be entered within a PI, but this would make
> XML incompatible with SGML.)

I don't really understand what Martin is suggesting here.  I realize that
SGML's lack of reasonable escaping mechanisms (e.g., there is no way to 
put a PIC character in a PI) is problematic, but I don't see how changing
the PIC from that in the RCS would really help anything.  Most tools that
make use of PIs have figured out some way to handle the PIC problem, so 
why should we address the issue?
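
As an illustration of the kind of tool-private workaround I mean, here is a
sketch in Python.  The percent-style encoding is purely hypothetical (I am
not describing what any particular tool does); it assumes the PIC is ">" as
in the reference concrete syntax, and the function names are made up.

```python
def escape_pi_data(data):
    """Encode PI data so that it contains no '>' character."""
    # Encode '%' first so the escape sequences stay unambiguous.
    return data.replace("%", "%25").replace(">", "%3E")


def unescape_pi_data(escaped):
    """Reverse escape_pi_data."""
    # Decode '>' first, then restore literal '%'.
    return escaped.replace("%3E", ">").replace("%25", "%")
```

The writing tool escapes before emitting the PI and unescapes after reading
it; any other tool just sees opaque PI data and ignores it.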

paul
