Re: comments on the polyglot spec

Henri Sivonen, Wed, 19 May 2010 03:20:30 -0700 (PDT):
> "Daniel Glazman" <> wrote:
>> Le 19/05/10 10:56, Henri Sivonen a écrit :
>>> Strings that look like processing instructions (including <?php ...
>>> ?>) are non-conforming in text/html and don't cause PI DOM nodes to be
>>> created in text/html. Thus, the polyglot guide would have been wrong if it
>>> didn't say that PIs don't belong in the polyglot subset.
>>> As for <?php ... ?> being bogus in text/html in the first place, the
>>> HTML5 spec deals with what travels over the public network, so
>>> server-side pre-processor syntaxes are out of scope. Thus editing
>>> environments that want to preserve pre-processor syntax can't even
>>> follow the HTML5 spec proper when it comes to that syntax.
>> Henri... Nobody's editing html with php inside through a http pipe.
>> It's local storage and your comment does not stand here.
>> From a web author's perspective, the fact that an HTML instance
>> containing a php PI won't be editable as a polyglot document
>> is a severe limiting factor to the usefulness of polyglot documents.
> My point is that an HTML-only non-polyglot document edited using an 
> HTML-only DOM-based editor that implements parsing and serialization 
> per spec won't round-trip PHP. It's not a polyglot issue. Polyglot is 
> always more constrained than HTML-only.

As such, it could also be possible to define a subset of PIs that can 
be served and edited both as text/html and as application/xhtml+xml.

> It's an issue of the PHP 
> processing layer outputting stuff that's in scope for HTML5 but the 
> PHP layer itself not being in scope. Furthermore, adding PHP (or JSP 
> or ASP) round tripping would be incompatible with legacy 
> Web-compatible text/html parsing, because bogus comments terminate on 
> the first > as opposed to ?> or %>.

At least Opera doesn't treat ?> and %> the same: the latter is 
rendered, while the former isn't. This is not about "PHP syntax" but 
about PI syntax.
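
To make the termination rule Henri describes concrete, here is a minimal 
Python sketch of how a text/html tokenizer treats input that starts 
with "<?": it becomes a bogus comment that ends at the first ">", not at 
"?>". The function name split_bogus_comment is made up for 
illustration, it is not part of any real parser API, and it covers only 
the "<?" case.

```python
def split_bogus_comment(markup: str) -> tuple[str, str]:
    """Return (comment_data, remainder) for markup that starts with '<?'.

    Hypothetical helper: mimics the text/html bogus-comment rule, where
    the comment runs up to the first '>' (or to end of input), no matter
    whether a '?>' appears later on.
    """
    if not markup.startswith("<?"):
        raise ValueError("expected markup starting with '<?'")
    end = markup.find(">", 1)
    if end == -1:
        # No '>' at all: everything after '<' is comment data.
        return markup[1:], ""
    # The '?' after '<' is part of the comment data; the '>' is consumed.
    return markup[1:end], markup[end + 1:]


# A PHP block whose code happens to contain a '>' character.
sample = '<?php echo "a > b"; ?><p>hi</p>'
data, rest = split_bogus_comment(sample)
print(repr(data))  # comment data stops before the first '>'
print(repr(rest))  # the tail of the PHP code leaks out as ordinary content
```

With this input the rest of the PHP code, including the "?>", falls 
outside the comment, which is why a per-spec text/html parser and 
serializer cannot round-trip such a block.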

> If an editor wants to pretend that a .php file is an HTML file with 
> PHP nodes in it when it really is a PHP program that might output an 
> HTML file when executed, the editor needs to patch its HTML parser 
> with a willful violation of the HTML5 spec (regardless of the 
> polyglot guide) and also adjust its serializer to recognize what the 
> parser emits.

I suppose that Daniel is thinking of a polyglot document being authored 
in application/xhtml+xml mode. Why shouldn't PIs be permitted in such a 
document? It is when it is parsed as a text/html document that PI 
eventually becomes PITA. Smyler just said: [1]

"(The HTML5 spec may say that it isn't valid to serve some unprocessed 
PHP over the web while claiming it is HTML, but since nobody wishes to 
do that, it isn't relevant.)"

And so, if one has an XHTML-based spec where PIs are allowed, this 
doesn't mean that these PIs turn up in served documents.


leif halvard silli

Received on Wednesday, 19 May 2010 10:58:00 UTC