- From: Leif Halvard Silli <lhs@malform.no>
- Date: Wed, 22 Jul 2009 03:07:07 +0200
- To: HTMLWG <public-html@w3.org>
Summary: Please add back support for SGML processing instructions as
they are well supported and much used.
Suppose one wants to validate a page that embeds PHP, and wants to do so
/prior/ to the execution of the PHP script:
<?php Print "Hello, World!"; ?>
Validator.nu will then tell you that the page is _invalid_ as an HTML5
page, but that it is valid as an XHTML5 page ... Also, checking the HTML
5 draft, you will find nothing about the <?> syntax.
And this despite the fact that <?php ... ?> is valid both as an SGML/HTML
processing instruction [1][2] and as an XML/XHTML [3] processing
instruction. Thus the code is also valid as an HTML 4 or an XHTML 1
fragment. User agents (IE, Opera, Firefox, Webkit, Lynx etc) support
processing instructions ("PI") too: just as HTML 4 and XML require them
to do, they ignore them. In Firefox and Webkit, a PI simply disappears
(it is not reflected in the DOM at all). In IE, PIs are treated as
comments. In Opera, they are treated as either "unnamed" (<?>) or named
(e.g. <?php >, <?biferno >) comment nodes. See the Live DOM viewer
demo[4].
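
The XML half of this claim can be checked with any XML parser. Below is
a minimal Python sketch using the standard library's xml.dom.minidom.
The wrapping <p> element and the variable names are assumptions made for
illustration only. It shows that an XML parser accepts the fragment as
well-formed and reports a processing instruction node; it says nothing
about how a particular browser behaves.

```python
from xml.dom import minidom

# Hypothetical wrapper: the PHP line from above placed inside a <p>
# element so that the fragment has a single root element.
source = '<p><?php Print "Hello, World!"; ?></p>'

doc = minidom.parseString(source)
node = doc.documentElement.firstChild

# An XML parser exposes the PI as its own node type, with the target
# ("php") kept separate from the rest of the instruction.
is_pi = node.nodeType == node.PROCESSING_INSTRUCTION_NODE
print(is_pi, node.target, node.data.strip())
```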
There is a slight difference between the minimal SGML/HTML syntax, which
is <? ... >, and the XML/XHTML syntax, which is <?instruction ?>. HTML
User Agents accept the SGML syntax. However, as we can see from the
Hello World example above, PHP prefers the XML syntax ( "?>" ), probably
because it is valid both as XML and as HTML.
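
To make the difference concrete, here is a small Python sketch built on
the standard library's html.parser, which applies the SGML rule: a PI
runs from "<?" to the first ">". The class name PICollector, the helper
collect_pis and the <p> wrappers are assumptions for illustration; this
is not how any browser or Validator.nu is implemented.

```python
from html.parser import HTMLParser


class PICollector(HTMLParser):
    """Collects the raw content of every processing instruction seen."""

    def __init__(self):
        super().__init__()
        self.instructions = []

    def handle_pi(self, data):
        # Called with everything between "<?" and the first ">".
        self.instructions.append(data)


def collect_pis(markup):
    """Parse the markup and return the list of PI contents found."""
    parser = PICollector()
    parser.feed(markup)
    parser.close()
    return parser.instructions


# SGML-style PI (closed by ">") and XML-style PI (closed by "?>").
sgml_style = collect_pis("<p><?biferno ></p>")
xml_style = collect_pis('<p><?php Print "Hello, World!"; ?></p>')

# Under the SGML rule, the "?" of the XML closing "?>" is simply the
# last character of the instruction's content.
print(sgml_style)
print(xml_style)
```

Both forms are recognized as a single PI; the only visible difference is
the trailing "?" that the XML form leaves inside the collected content.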
Due to HTML 5's failure to specify the <? > syntax (and due to XML
paranoia ... ?), Validator.nu currently throws an error, blaming XML, as
soon as it sees a "<?":
"Error: Saw <?. Probable cause: Attempt to use an XML processing
instruction in HTML. (XML processing instructions are not supported in
HTML.)"
PHP is not the only language that takes advantage of HTML's support for
processing instructions, however. Another example is Biferno[5], and
there most certainly are others. Of course, "real" XML processing
instructions are an example of their own - by allowing PIs it would
become easier to create pages that are valid both as XHTML 5 and HTML 5.
If the HTML 5 spec ends up not defining the PI syntax, then one will
have to rely on the HTML 4 specifications for information on it.
Additionally, it becomes difficult to use an HTML validator as an
authoring tool if it throws an error as soon as it sees a "<?". More
fundamentally, PIs are one of the extension mechanisms of HTML 4 - not
including them in HTML 5 makes HTML less extensible than HTML 4 and
removes an important non-UA interface of HTML.
[1] http://www.w3.org/TR/html401/conform#h-4.2
[2] http://www.w3.org/TR/html401/appendix/notes#h-B.3.6
[3] http://www.w3.org/TR/REC-xml/#sec-pi
[4] http://software.hixie.ch/utilities/js/live-dom-viewer/saved/180
[5] http://www.tabasoft.it/biferno/index.bfr
--
Leif Halvard Silli
Received on Wednesday, 22 July 2009 01:07:49 UTC