PIFLE (was Re: Are PIs useful?)

> From: Alex Milowski <lex@www.copsol.com>

> Because processing instructions are "single use" I consider them to be a
> hack.  I don't find them useful in SGML.  IMHO, the use of processing 
> instructions in XML to convey encodings and other information is rather
> borderline but I understand why they are used for that purpose.

One reason PIs are not as useful as they should be is that they have no
standard syntax.  So they languish as second-class SGML markup: they
insert unportable syntaxes inside SGML documents, and vendors and users
have to write proprietary parsing routines.  This means that elements
must be used to do the job of any generic processing instruction.

We need to treat PIs more as in-band instructions to any part of the
SGML/XML system, rather than just as the foot-in-the-door of the stinky
format-devil.  I have proposed to ISO WG8 that:

1) all PIs should start with the identifier of their notation (e.g. "XML"),
so that applications (and parsers, for pragmas, and storage managers, for
encoding PIs) can at least look at the first name in a PI and know
whether the PI is of interest or not (rather than the
PIs-embedded-in-marked-sections system, which requires a different grove
for each different application, and so is a little wrong-headed);

and, further, that 

2) all PIs should allow attributes (and have entity dereferencing).  So a
PI looks more like an element start-tag, where the notation name takes
the GI's position.
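
To make the two points concrete, here is a minimal sketch in Python of
what a consumer of such PIs might do.  The exact syntax is my assumption
for illustration only (a name followed by quoted name="value" pairs, as
in a start-tag); the function names pi_notation and parse_pifle are
hypothetical, and entity dereferencing is left out.

```python
import re

# Assumed, simplified lexical rules (not taken from any standard).
NAME = r"[A-Za-z_][A-Za-z0-9._-]*"
VALUE = r'''(?:"[^"]*"|'[^']*')'''
ATTR = re.compile(r"(" + NAME + r")\s*=\s*(" + VALUE + r")")
PI = re.compile(
    r"<\?(" + NAME + r")((?:\s+" + NAME + r"\s*=\s*" + VALUE + r")*)\s*\?>"
)


def pi_notation(pi):
    """Point 1: return the first name in the PI, or None if there is none.

    This works even when the rest of the PI is in some private syntax.
    """
    m = re.match(r"<\?(" + NAME + r")", pi.strip())
    return m.group(1) if m else None


def parse_pifle(pi):
    """Point 2: return (notation, attributes) for a start-tag-like PI.

    Returns None when the PI body is not a plain list of attributes.
    """
    m = PI.fullmatch(pi.strip())
    if m is None:
        return None
    # Strip the surrounding quote characters from each attribute value.
    attrs = {name: quoted[1:-1] for name, quoted in ATTR.findall(m.group(2))}
    return m.group(1), attrs


if __name__ == "__main__":
    print(parse_pifle('<?XML version="1.0" encoding="UTF-8"?>'))
    print(pi_notation("<?TeX \\newpage ?>"))
```

An application interested only in "XML" PIs can call pi_notation and
skip everything else without knowing any other notation's syntax; only
the PIs it does care about need the fuller parse.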

I would like XML to adopt PIFLE ("Processing Instructions, Formal, Like
Elements"?) also, at least as far as requiring that the initial name token
in a processing instruction is an identifier of the notation for the PI.
I think the general principle (which underlies much of XML, IMHO) that
everything should be self-labelling requires it.

Without it, processing instructions are just "single-use", and may not
contribute much value to XML.

(For anyone interested, I am also proposing that marked sections can begin
with a notation name, and have attributes.  So you could write <p>See
character <![Unicode radix="hex" script="Zh" [ABCD1234ABCD1234ABCD1234ABCD]]>,
or <![TeX [c=a+b]]>.</p>  That way element structure is not polluted by
notation or PI structures, if that will serve your system's purpose.  PIs,
subdoc and marked sections are currently, if not stillborn, then certainly
failing to thrive in ISO 8879:1986.)
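
As a rough illustration of the marked-section idea, the following Python
sketch splits such a section into its notation name, attributes and
content.  The grammar is only my reading of the two examples above, and
parse_marked_section is a hypothetical helper, not part of any existing
parser.

```python
import re

# Assumed, simplified lexical rules, inferred from the two examples only.
NAME = r"[A-Za-z_][A-Za-z0-9._-]*"
VALUE = r'''(?:"[^"]*"|'[^']*')'''
ATTR = re.compile(r"(" + NAME + r")\s*=\s*(" + VALUE + r")")
MARKED_SECTION = re.compile(
    r"<!\[\s*(" + NAME + r")((?:\s+" + NAME + r"\s*=\s*" + VALUE + r")*)"
    r"\s*\[(.*?)\]\]>",
    re.DOTALL,
)


def parse_marked_section(text):
    """Return (notation, attributes, content), or None if text does not fit."""
    m = MARKED_SECTION.fullmatch(text.strip())
    if m is None:
        return None
    # Strip the surrounding quote characters from each attribute value.
    attrs = {name: quoted[1:-1] for name, quoted in ATTR.findall(m.group(2))}
    return m.group(1), attrs, m.group(3)


if __name__ == "__main__":
    print(parse_marked_section("<![TeX [c=a+b]]>"))
    print(parse_marked_section(
        '<![Unicode radix="hex" script="Zh" [ABCD1234ABCD1234ABCD1234ABCD]]>'
    ))
```

The content is handed back untouched, so whatever understands the named
notation (TeX, say) can interpret it, and everything else can skip it.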

Rick Jelliffe

Received on Wednesday, 14 May 1997 03:02:21 UTC