Re: comments on the polyglot spec

Daniel Glazman writes:

> On 19/05/10 10:56, Henri Sivonen wrote:
> > Strings that look like processing instructions (including <?php ... ?>)
> > are non-conforming in text/html and don't cause PI DOM nodes to be
> > created in text/html. Thus, the polyglot guide would have been wrong if
> > it didn't say that PIs don't belong in the polyglot subset.
> >
> > As for <?php ... ?> being bogus in text/html in the first place, the
> > HTML5 spec deals with what travels over the public network, so
> > server-side pre-processor syntaxes are out of scope. Thus editing
> > environments that want to preserve pre-processor syntax can't even
> > follow the HTML5 spec proper when it comes to that syntax.
> Henri... Nobody's editing HTML with PHP inside through an HTTP pipe.
> It's local storage.

Quite. So that's outside the jurisdiction of any HTML spec. What matters --
so far as an HTML spec is concerned -- is that the output of processing
the PHP yields valid HTML. Any server-side processing is permitted. (The
HTML5 spec may say that it isn't valid to serve some unprocessed PHP
over the web while claiming it is HTML, but since nobody wishes to do
that, it isn't relevant.)

A PHP source file doesn't need to be valid HTML.

The same applies to files subject to any kind of server-side processing,
such as Smarty, C#, Apache server-side includes, or Perl's Template
Toolkit (which permits defining custom delimiters to denote templated
regions). It doesn't make sense to parse any of these as though they
were 'just' HTML.

For each of them you could specify a syntax which is 'on top' of HTML5,
permitting everything that HTML5 permits plus the special syntax of that
server-side language. Then an editor would be creating documents to that
spec, not HTML5 itself. HTML5's 'other applicable specifications'
extension point would apply.

(However, such a syntax would be a subset of what is generally permitted
by most server-side languages, many of which operate on source text
rather than the DOM. For example, it is valid PHP to have a tag opened
by something like <?='<h1>'?> and closed by </h1>; processing that with
PHP will correctly result in balanced <h1> ... </h1> tags being sent to
the browser. But reading the raw PHP into a DOM will encounter only
the closing </h1>, with no opening tag to match it.

So it may be that actually defining server-side languages on top of
HTML5 is not permissive enough to be useful.)
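
To make the <h1> example concrete, here is a minimal sketch in Python.
It uses the standard library's html.parser as a stand-in for an HTML
parser; that is an assumption for illustration only, since it is not an
HTML5-conforming parser, though like the HTML5 tokenizer it ends a '<?'
construct at the first '>'. TagCollector and collect_tags are
hypothetical names, and the 'processed' string is written out by hand
rather than produced by actually running PHP.

```python
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Record start and end tags in the order the parser reports them."""

    def __init__(self):
        super().__init__()
        self.start_tags = []
        self.end_tags = []

    def handle_starttag(self, tag, attrs):
        self.start_tags.append(tag)

    def handle_endtag(self, tag):
        self.end_tags.append(tag)


def collect_tags(markup):
    """Parse markup and return (start_tags, end_tags)."""
    parser = TagCollector()
    parser.feed(markup)
    parser.close()  # flush any buffered text
    return parser.start_tags, parser.end_tags


# Raw template source, using the construct described above.
raw_source = "<?='<h1>'?>Heading</h1>"

# What PHP would send to the browser for that source
# (hand-written here, not generated by running PHP).
processed_output = "<h1>Heading</h1>"

print(collect_tags(raw_source))        # ([], ['h1'])
print(collect_tags(processed_output))  # (['h1'], ['h1'])
```

The raw source yields an end tag with no start tag, because the '<?'
construct swallows the text up to the first '>', which is the one that
belongs to the quoted <h1>. The processed output parses as a balanced
pair.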

> From a web author's perspective, the fact that an HTML instance
> containing a PHP PI won't be editable as a polyglot document is a
> severe limiting factor to the usefulness of polyglot documents.

Why would that be a limit? It could be edited just fine as 'a polyglot
document with PHP tags' or similar -- in what way would it inconvenience
a web author that such a file is not valid for serving raw to browsers?

And why do web authors need PHP special-casing -- what about all the
other server-side languages that web authors use, with syntaxes that
don't resemble PIs?



Received on Wednesday, 19 May 2010 09:48:07 UTC