- From: Michael[tm] Smith <mike@w3.org>
- Date: Thu, 8 Sep 2016 20:48:15 +0900
- To: David Dorward <david@dorward.me.uk>
- Cc: "www-validator@w3.org" <www-validator@w3.org>
- Message-ID: <20160908114815.be3uof7h7f2ynbqa@sideshowbarker.net>
David Dorward <david@dorward.me.uk>, 2016-09-02 18:25 +0100:

> Archived-At: <http://www.w3.org/mid/FD07AD3A-EABE-46EE-8B9E-524900FA0427@dorward.me.uk>
>
> On 2 Sep 2016, at 18:14, Michael[tm] Smith wrote:
> ...
> > The HTML parser treats markup inside iframe elements as text.
> >
> > So if the HTML source of a document has `<iframe><p>Test</p></iframe>`,
> > an HTML parser puts all of that `<p>Test</p>` part into the DOM as text;
> > that is, there is no `p` element in the DOM for that.
>
> That’s what the HTML parser does, but the spec makes additional requirements
> about how that text should be structured.

Yeah, but I don’t think those requirements should still be in the spec at this point to begin with. And actually checking them would take more implementation work than I personally care to spend on a requirement that I think isn’t necessary or even helpful in practice.

That said, if somebody else wanted to take the time to submit a patch for checking that requirement, I would probably accept it. What would be needed is a new condition for `iframe` elements added here:

https://github.com/validator/validator/blob/master/src/nu/validator/checker/TextContentChecker.java

That condition would need to feed the text content to the HTML parser and get a parsed HTML tree back, which would then need to be walked and checked to make sure it doesn’t include any non-phrasing elements.

> Shouldn’t the validator report on
> violations of those rules as it does (for example) for invalid URLs (like
> `<p><a href="http://example .com/foo">test</a></p>`)?

That’s something completely different, both in terms of the value it provides to authors in catching mistakes that actually cause real problems for users, and in terms of how it’s implemented in the checker.

> > Anyway, I think the only reason the spec actually ever allowed any
> > content at all in `iframe` elements was for fallback in very old
> > browsers.
>
> It is there as a fallback, but “very old browsers” is incorrect.
>
> Lynx, for example, still makes use of it.

Yeah, somebody other than me might take the opportunity here to have a debate about whether Lynx is actually a browser to begin with.

> ![](cid:111EAF68-E0EF-49E0-A4F7-FDA3F8B90591@dorward.me.uk "lynx.png")
> https://lists.w3.org/Archives/Public/www-validator/2016Sep/att-0002/01-part

So looking at that, I see that the real problem there is that Lynx is failing to do anything useful with that iframe. What I mean is, in the image I see that Lynx already shows this:

    IFRAME: http://example.com

What Lynx should also be doing is actually fetching the iframe contents and trying to do something at least minimally useful with them, such as showing the title of the http://example.com document in addition to the URL.

So I don’t find that example very compelling at all as an argument for providing support in the checker for checking the iframe-text-content requirement in the HTML spec. In practice the vast majority of iframe elements in real documents on the Web don’t have text content, so applications like Lynx aren’t showing users anything useful for them except the URL.

So if we really wanted to solve the problem that Lynx users actually have here in practice, it would be better done by fixing Lynx.

> > Notice that for the case of HTML documents served with an XML mime type,
> > the spec says:
> >
> > > The iframe element must be empty in XML documents.
>
> Yes, I thought that was a very odd requirement (as was removing support for
> block elements as alternative content in the HTML serialisation).

Well, to me what would seem a lot odder in the XML world is the current requirement in the spec that implies an application needs to take the text content of an element and re-parse it as markup.
> > We should probably change the spec to say the same for text/html
> > documents,
> > because at this point I don’t think we have any people any more using
> > browsers that don’t have iframe support.
>
> Lynx still has a loyal following. There are certain cases where I find it
> partially useful (such as when wanting to access a URL from another network
> to which I have an SSH session open).

Same here: I use Lynx relatively often (and elinks), for similar use cases. But from that it doesn’t follow that I support keeping requirements in the HTML spec forever if the only use case we can imagine is for web-pager applications like Lynx, especially if the developers of those applications are not already even trying to do something minimally adequate with iframe for their users.

  -Mike

-- 
Michael[tm] Smith https://people.w3.org/mike
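
For anyone considering the patch described earlier in this message, the check (feed the `iframe` text content to an HTML parser, then flag any non-phrasing elements) might look roughly like the sketch below. It is written in Python rather than the checker's Java and is not the validator's code. The names `PHRASING`, `NonPhrasingCollector`, and `non_phrasing_elements` are hypothetical, the element list is an approximation of the spec's phrasing-content category, and the standard-library `html.parser` module is an event-based parser rather than a spec-conforming tree builder, so the sketch inspects start tags as they are reported instead of walking a parsed tree.

```python
from html.parser import HTMLParser

# Approximate set of phrasing-content element names. This list is an
# assumption for illustration; the HTML spec is the authoritative source.
PHRASING = frozenset("""
a abbr area audio b bdi bdo br button canvas cite code data datalist del
dfn em embed i iframe img input ins kbd label map mark math meter noscript
object output picture progress q ruby s samp script select small span
strong sub sup svg template textarea time u var video wbr
""".split())


class NonPhrasingCollector(HTMLParser):
    """Records the name of every start tag that is not in PHRASING."""

    def __init__(self):
        super().__init__()
        self.offenders = []

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag names before calling this method.
        if tag not in PHRASING:
            self.offenders.append(tag)


def non_phrasing_elements(iframe_text):
    """Return the names of non-phrasing elements found in iframe text content."""
    collector = NonPhrasingCollector()
    collector.feed(iframe_text)
    collector.close()
    return collector.offenders
```

With this sketch, `non_phrasing_elements("<p>Test</p>")` returns `["p"]`, while text containing only elements such as `a` or `em` returns an empty list. A real implementation in the checker would presumably reuse the validator's own HTML parser so that the result matches what browsers build.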
Received on Thursday, 8 September 2016 11:48:41 UTC