- From: Leif Halvard Silli <lhs@malform.no>
- Date: Thu, 23 Jul 2009 02:06:07 +0200
- To: Thomas Broyer <t.broyer@ltgt.net>
- CC: HTMLWG <public-html@w3.org>
Thomas Broyer On 09-07-22 23.45:

> On Wed, Jul 22, 2009 at 7:50 PM, Leif Halvard Silli wrote:
>> What matters is that both <p> and <? > are supported with a
>> predictable parsing in UAs [1].
>
> The parsing is not predictable [2]

Thanks for that demo. You are right - there are some glitches in IE and WebKit, as you said. The trouble seems to be related to the presence of unpaired quote characters (" or ' or [in IE] `) inside the <?...> construct. (I.e. the ?> in the first paragraph is outside the PI construct.)

>> Is "bogus comment" used about other things than the (specified) effect of
>> "<? comment >" ? In other words, is it anything but a negative word for the
>> effect of <? comment > ?
>
> It is also triggered when "</" is followed by neither [A-Za-z>] or EOF
> when the content model flag is set to the PCDATA state (§9.2.4.4), and
> when "<!" is followed by neither "--", "DOCTYPE" (case-insensitive
> match) or "[CDATA[" (case-sensitive match, only if the insertion mode
> is "in foreign content" and the current node is not an element in the
> HTML namespace) (§9.2.4.17).

Thanks. So <? ... > is not treated as a construct in its own right.

>> The Live DOM viewer do not detect any comments for Firefox and Webkit. Is
>> Live DOM Viewer wrong? Or do Firefox and Webkit not fulfill the spec yet?
>> The way I read Live DOM Viewer, we have 3 different interpretation of <? >
>> when we consider the result in the DOM (but one result if we consider result
>> to the user).
>
> Actually, three results if we consider result to the user [1].

This can be easily fixed with a requirement to be careful with how one inserts quotes.

>>> You'll note that in WebKit and IE, it ends at the "?>", not the first
>>> ">" (even a "-->" wouldn't end the "bogus comment" in these UAs)
>> May be you are colored by your attitude here: I am unable to verify your
>> claim. All I see is that IE and Webkit - in text/html mode - ends the PI at
>> the first ">". In other words, I don't see the behavior that you describe.
>> E.g. see this Live DOM viewer demo [2].
>
> See this Live DOM viewer demo [1] (compare the second and first
> paragraphs, in WebKit; this sample doesn't demo this behavior in IE)

Your demo [1] confirms that it is the unpaired quote character that is the problem, both in IE and in WebKit. Both IE and WebKit expect the PI to end at the first ">". However, an unpaired quote character gets IE and WebKit to postpone looking for the ">" and sends them searching for the matching quote character instead. Thus they do not, as I think you said somewhere earlier, prefer "?>" over ">". For instance, this explains the treatment of the 2nd and 3rd paragraphs in IE.

(Btw, please always include a <body> tag in such demos, or else the UA, especially IE, may place bits of the elements inside the <head> element.)

>>> - HTML Tidy as explicit, limited support for ASP (<% %>), JSTE (<#
>>> #>) and PHP (<?php ?> only, not the <? ?> syntax)
>> Using e.g. an online version of TIDY [3], I am unable to confirm that it
>> doesn't accept the <? ... ?> syntax. When configured to output XHTML, then
>> it will correct <? ... > to <? ... ?>. Otherwise, it doesn't touch it. (But
>> HTML Tidy is very configurable.)
>
> So it's a documentation omission (the doc only deals with the <?php
> ... ?> syntax when talking about PHP)

OK - I see.

>>> Given that on the 4 main browser engines (Gecko, Trident, Presto,
>>> WebKit), some parse it as a comment and others ignore it altogether
>>> (and this depends on the content of the PHP code too: both IE and
>>> WebKit seem to look for paired quotes with the <?php > construct);
>> If you give an example, then perhaps I'll understand what you refer to
>> w.r.t. IE and Webkit ...
>
> Compare the 1st and 2nd, and 3rd and 4th paras in [1] (in IE, beware,
> the third <p> is actually parsed as part of the comment from the 2nd
> paragraph, so the forth <p> ends up being the third paragraph in the
> DOM).

Yup - as noted above.
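To make the quote-pairing observation concrete, here is a minimal Python sketch of the two ways of finding the end of a <? ... > construct. It is only a model of the behaviour discussed in this thread, not code from any browser: the function names and the sample markup are made up for illustration, and treating a backtick as a quote character is optional because it was only observed in IE.

```python
def end_at_first_gt(html: str, start: int) -> int:
    """Spec-like behaviour: the construct ends just past the first '>'."""
    end = html.find(">", start)
    return len(html) if end == -1 else end + 1


def end_quote_aware(html: str, start: int, quotes: str = "\"'") -> int:
    """Hypothetical model of the IE/WebKit glitch: a '>' inside an open
    quote is skipped, so an unpaired quote delays the end of the construct.
    Pass quotes="\"'`" to also treat the backtick as a quote (seen in IE)."""
    open_quote = None
    for i in range(start, len(html)):
        ch = html[i]
        if open_quote is not None:
            if ch == open_quote:      # matching quote found, leave quoted run
                open_quote = None
        elif ch in quotes:            # a quote opens: stop looking for '>'
            open_quote = ch
        elif ch == ">":
            return i + 1
    return len(html)                  # never closed: swallows the rest


# Made-up samples: one with paired quotes, one with an unpaired quote.
paired = '<p><?php echo "a>b" ?></p>'
unpaired = '<p><?php echo "x ?></p><p>second</p>'

print(paired[3:end_at_first_gt(paired, 3)])      # <?php echo "a>
print(paired[3:end_quote_aware(paired, 3)])      # <?php echo "a>b" ?>
print(unpaired[3:end_at_first_gt(unpaired, 3)])  # <?php echo "x ?>
print(unpaired[3:end_quote_aware(unpaired, 3)])  # everything up to the end
```

With paired quotes the two strategies differ only when a ">" sits inside the quoted text; with an unpaired quote the quote-aware scan runs on into the following markup, which is the kind of effect the demo shows when a later <p> ends up inside the comment.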
>>> I don't understand how you could say there is any "UA support".
>> Because you can insert <? > into your code and be certain that, as long as
>> you do not place another ">" in between, then UAs will not render the
>> content to the user, *and* they will parse them as the W3 validator does.
>
> Hopefully my simple example [1] proves it wrong.

Unfortunately, IE and WebKit have a quote character bug, yes. How commonly one will run into this error is another issue - usually one will pair one's quotes (except when one writes "one's", but then one should write "oneʼs" ...). It would be better if a validator eventually only warned whenever one failed to pair a quote, rather than giving the current error message for any presence of a PI.

>> As for whether it is correct, according to HTML 4, to render <?...> as some
>> kind of comment as Opera and IE do, or if it is correct to ignore them
>> entirely, as Firefox/Webkit do, that I am not certain of. This is of the
>> things that HTML 5 could specify.
>>
>>> (it seems like Opera 9.6 parse it as a ProcessingInstruction !?)
>> Opera renders <?php > as a node named "php", and inserts the content as a
>> comment, is that what you mean?
>
> No, I mean a ProcessingInstruction node [2] (also change it to end the
> PI with "?>" and notice that there's no difference). Tested in Opera
> 9.64.

OK, interesting.

>>> ...and HTML Tidy as explicit support for it besides HTML, as has been
>>> suggested by others.
>> Again, your interpretation of HTML Tidy seems here to be quite colored by
>> your attitude to the issue - Tidy doesn't treat <?php ?> in any special way.
>> You can even write <?whatever ... >.
>
> I was confused by the documentation.

OK.

>>> On Wed, Jul 22, 2009 at 1:48 PM, Leif Halvard Silli wrote:
>>>> Anne van Kesteren On 09-07-22 12.53:
>>>>> I agree with Simon that if you want stuff like this to work
>>>>> dedicated editor support is needed (and there is to some
>>>>> extent) and potentially modified validators.
>>> +1 (see above, this is the case for HTML Tidy)
>>
>> Again, it isn't the case for HTML Tidy w.r.t. PI. (How it treats <% %> etc,
>> is another issue.) And to repeat, once more: It is PHP and Biferno that use
>> HTML syntax - not the other way around. Hence, future HTML specifications,
>> such a HTML 5, are responsible for not breaking things that other languages
>> and tools depends on.
>
> PHP is text-based (byte-based actually, unfortunately), not
> HTML-based. W.r.t HTML it is a *pre*processor, there's no real
> relation between PHP and HTML. The fact that PHP uses a PI-like
> construct is to accommodate (some) existing tools

I don't think this contradicts my standpoint. It is also not clear to me how "real" the relationship between HTML and PI is in the HTML 4.01 spec. There is a relationship - whether "real" or not.

> (w.r.t. XML, it
> allows generating XHTML+PHP with XSLT using
> <xsl:processing-instruction/> rather than <xsl:text
> disable-output-escaping="yes" />)
> ...but that's another debate...

Certainly interesting to those that are into XSLT ... !

> [1] http://software.hixie.ch/utilities/js/live-dom-viewer/saved/182
> [2] http://software.hixie.ch/utilities/js/live-dom-viewer/saved/183

--
leif halvard silli
Received on Thursday, 23 July 2009 00:06:49 UTC