Re: Iframe alternative content from Michael[tm] Smith on 2016-09-08 (www-validator@w3.org from September 2016)

From: Michael[tm] Smith <mike@w3.org>
Date: Thu, 8 Sep 2016 20:48:15 +0900
To: David Dorward <david@dorward.me.uk>
Cc: "www-validator@w3.org" <www-validator@w3.org>
Message-ID: <20160908114815.be3uof7h7f2ynbqa@sideshowbarker.net>
David Dorward <david@dorward.me.uk>, 2016-09-02 18:25 +0100:
> Archived-At: <http://www.w3.org/mid/FD07AD3A-EABE-46EE-8B9E-524900FA0427@dorward.me.uk>
> On 2 Sep 2016, at 18:14, Michael[tm] Smith wrote:
> ...
> > > The HTML parser treats markup inside iframe elements as text.
> > 
> > So if the HTML source of a document has `<iframe><p>Test</p></iframe>`,
> > an HTML parser puts all of that `<p>Test</p>` part into the DOM as text
> > —that is, there is no `p` element in the DOM for that.
> 
> That’s what the HTML parser does, but the spec makes additional requirements
> about how that text should be structured.

Yeah but I don’t think those requirements should still be in the spec at
this point to begin with. And to actually check them would necessitate more
implementation work than I personally care to spend for a requirement that
I think isn’t necessary or even helpful in practice.

That said, if somebody else wanted to take the time to submit a patch for
checking that requirement I would probably accept it.

What would be needed is a new condition for `iframe` elements added here:

  https://github.com/validator/validator/blob/master/src/nu/validator/checker/TextContentChecker.java

And what that would need to do is to have it feed the text content to the
HTML parser and get a parsed HTML tree back that would then need to be walked
and checked to make sure it doesn’t include any non-phrasing elements.
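
A rough sketch of that walk, written in Python rather than the checker's Java, and using the standard-library `html.parser` as a stand-in for a real spec-conformant HTML parser. The `PHRASING` set here is abbreviated, and the `non_phrasing_elements` helper is made up for illustration; neither is part of the validator:

```python
from html.parser import HTMLParser

# Assumption: an abbreviated list of phrasing-content element names.
# The full list in the HTML spec is longer than this.
PHRASING = frozenset({
    "a", "abbr", "b", "bdi", "bdo", "br", "cite", "code", "data", "dfn",
    "em", "i", "img", "kbd", "mark", "q", "s", "samp", "small", "span",
    "strong", "sub", "sup", "time", "u", "var", "wbr",
})


class _StartTagCollector(HTMLParser):
    """Records the (lowercased) name of every start tag it sees."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def non_phrasing_elements(iframe_text):
    """Re-parse an iframe's text content as markup and return, in source
    order, the names of any elements that are not in PHRASING."""
    collector = _StartTagCollector()
    collector.feed(iframe_text)
    collector.close()
    return [tag for tag in collector.tags if tag not in PHRASING]


# The text content of <iframe><p>Test</p></iframe> is the string "<p>Test</p>".
print(non_phrasing_elements("<p>Test</p>"))
```

An empty list would mean nothing needs reporting; anything else is a list of element names that a real check would turn into error messages.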

> Shouldn’t the validator report on
> violations of those rules as it does (for example) for invalid URLs (like
> `<p><a href="http://example .com/foo">test</a></p>`)?

That’s something completely different both in terms of the value it
provides to authors in catching mistakes that actually cause real problems
for users and also in terms of how it’s implemented in the checker.

> > Anyway, I think the only reason the spec actually ever allowed any
> > content at all in `iframe` elements was for fallback in very old
> > browsers.
> 
> It is there as a fallback, but “very old browsers” is incorrect.
> 
> Lynx, for example, still makes use of it.

Yeah somebody other than me might take the opportunity here to have a
debate about whether Lynx is actually a browser to begin with.

> ![](cid:111EAF68-E0EF-49E0-A4F7-FDA3F8B90591@dorward.me.uk "lynx.png")
> https://lists.w3.org/Archives/Public/www-validator/2016Sep/att-0002/01-part

So looking at that I see that the real problem there is that Lynx is failing to
do anything useful with that iframe.

What I mean is, in the image I see that Lynx already shows this:

  IFRAME: http://example.com

What Lynx should also be doing is actually fetching the iframe contents and
trying to do something at least minimally useful with them, like also
showing the title of the http://example.com document in addition to the
URL.

So I don’t find that example very compelling at all as an argument for
providing support in the checker for checking the iframe-text-content
requirement in the HTML spec. Because in practice the vast majority of iframe
elements in real documents on the Web don’t have text content, so applications
like Lynx aren’t showing users anything useful for them except just the
URL. So if we really wanted to solve the problem in practice that Lynx
users actually have here, it would be better done by fixing Lynx.

> > Notice that for the case of HTML documents served with an XML mime type,
> > the spec says:
> > 
> > > The iframe element must be empty in XML documents.
> 
> Yes, I thought that was a very odd requirement (as was removing support for
> block elements as alternative content in the HTML serialisation).

Well to me what would seem a lot odder in the XML world is the current
requirement in the spec that implies an application needs to take the text
content of an element and re-parse it as markup.

> > We should probably change the spec to say the same for text/html
> > documents, because at this point I don’t think we have any people any
> > more using browsers that don’t have iframe support.
> 
> Lynx still has a loyal following. There are certain cases where I find it
> partially useful (such as when wanting to access a URL from another network
> to which I have an SSH session open).

Same here—I use Lynx relatively often (and elinks), for similar use cases.
But from that it doesn’t follow that I support keeping requirements in the
HTML spec forever if the only use case we can imagine is for web-pager
applications like Lynx, especially if the developers of those applications
are not even trying to do something minimally adequate with iframe for
their users.

  —Mike

-- 
Michael[tm] Smith https://people.w3.org/mike
Received on Thursday, 8 September 2016 11:48:41 UTC