[Bug 12235] Make <xmp> conforming

https://www.w3.org/Bugs/Public/show_bug.cgi?id=12235

--- Comment #21 from Carl Smith <carl.input@gmail.com> ---
(In reply to Aryeh Gregor from comment #20)
> (In reply to Carl Smith from comment #17)
> > The output must be converted to HTML, which involves preserving all
> > whitespace, including tabs (think `ls -la`).
> > 
> > Converting every space to &nbsp; and every new line to <br> and then
> > converting tabs into HTML tables, doesn't actually cover all the edge cases,
> > and it takes ages, and roughly doubles the size of the output.
> 
> You want to escape only < and &, as &lt; and &amp; respectively, and wrap in
> <pre>.  This should only increase the size of the output slightly, unless
> you have an extremely large number of < or &.

That doesn't handle tabs, and there are other problems with it.

> (What does "it takes ages" mean?)

It takes a long time on the server to do all the CGI escaping and the
`output = output.replace('<', '&lt;')` style substitutions. Tabulated strings
also have to be converted to HTML tables. It's a lot of unnecessary work in a
time-critical part of the code. When you hit Enter in a shell, you expect
output with no delay. This can't take tenths of a second without making things
feel crappy.
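
For reference, the escaping step under discussion can be sketched as follows.
This is a minimal sketch, assuming Python; the helper name `to_pre` is
hypothetical and is not taken from the application described here. It escapes
only `&` and `<`, as the quoted reply suggests, and passes all whitespace
(tabs and newlines included) through untouched.

```python
def to_pre(output: str) -> str:
    """Escape '&' and '<' in raw shell output and wrap it in <pre>.

    Hypothetical helper for illustration only.
    """
    # '&' must be replaced first; otherwise the '&' introduced by '&lt;'
    # would itself be escaped on the second pass.
    escaped = output.replace("&", "&amp;").replace("<", "&lt;")
    return "<pre>" + escaped + "</pre>"


# Example input resembling tab-separated shell output (made up for this sketch).
sample = "name\tsize\nfoo<1>\t4 & 5\n"
html_fragment = to_pre(sample)
```

The order of the two `replace` calls is the one detail that is easy to get
wrong: swapping them would turn `<` into `&amp;lt;`.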

> > output = '<xmp>'+output+'</xmp>'; // works perfectly
> 
> Only until your output happens to contain the string "</xmp>" (or any
> equivalent).  Then it will break.  If your application accepts untrusted
> input, moreover, you've created a very easily exploitable XSS vulnerability.

This doesn't apply to the application I'm working on, but it's probably best
to look at the general case. XSS is an ever-present concern. Removing <xmp>
didn't make the Web more secure, so restoring it wouldn't make it less so.

> > It's been pointed out that there are ways to hack the same effect by
> > combining a bunch of other tags, but is that really what we want in HTML5?
> 
> Yes, this is the normal way to do things in web programming.  <xmp> doesn't
> really help much, because as soon as "</xmp>" occurs your solution breaks
> and you have to fall back to <pre> and escaping anyway.  <xmp> is mostly
> only useful for hand-authoring.

The likelihood of </xmp> occurring in output is pretty low, and I'd rather
handle that one edge case than constantly deal with a number of others, some
of them much more awkward.
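
Handling that edge case could look something like the sketch below. It is a
minimal sketch, assuming Python; `wrap_output` and `XMP_END` are hypothetical
names, and the pattern assumes a closing tag is `</xmp` followed by whitespace,
`/` or `>`, matched case-insensitively. Output that contains no such sequence
is wrapped in <xmp> unchanged; anything else falls back to escaping and <pre>.

```python
import re

# Assumption: this pattern covers every form of the closing tag a browser
# would honour inside <xmp>. It may over-match slightly, which only causes
# an unnecessary fallback, never a broken page.
XMP_END = re.compile(r"</xmp[\s/>]", re.IGNORECASE)


def wrap_output(output: str) -> str:
    """Wrap raw output in <xmp>, falling back to escaped <pre> if needed."""
    if XMP_END.search(output) is None:
        # Common case: no escaping work at all.
        return "<xmp>" + output + "</xmp>"
    # Rare case: the output contains a closing tag, so escape '&' and '<'
    # ('&' first) and use <pre> instead.
    escaped = output.replace("&", "&amp;").replace("<", "&lt;")
    return "<pre>" + escaped + "</pre>"
```

The scan for the closing tag still touches every character of the output, so
this trades the escaping pass for a search pass rather than removing the cost
entirely.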

Dirty hacks have been the normal way to do web programming for years, but
that's not the way forward. We'll end up with a third-party <x-xmp> tag before
too long, or more than one.

Please reconsider.

Received on Sunday, 13 October 2013 13:02:59 UTC