[whatwg] Allowing ">" in attribute values

From: Benjamin M. Schwartz <bmschwar@fas.harvard.edu>
Date: Fri, 25 Jun 2010 15:34:05 -0400
Message-ID: <4C2504AD.4090908@fas.harvard.edu>
On 06/25/2010 11:50 AM, Boris Zbarsky wrote:
> It seems like what you want here is for browsers to parse as they do
> now, but a particular subset of browser-accepted syntax to be enshrined
> so that when defining your restrictions over content you control you can
> just say "follow the spec" instead of "follow the spec and don't put '>'
> in attribute values", right?

That's more or less how I feel.  The spec places requirements on how "user
agents, data mining tools, and conformance checkers" must handle
non-conforming input, but there are many other things in the world that
process HTML.  In other applications, it may be acceptable to have
undefined behavior on non-conforming input, like in ISO C.

HTML5 has a very clear specification of conformance, and a validator is
widely available.  If I build a tool that guarantees correct behavior only
on conforming inputs, then users can easily check their documents for
conformance before using my tool.  If my tool has additional restrictions,
then I need to write my own validator, and answer a lot of questions.

I was inspired to suggest this restriction after using mod_layout for
Apache, which inserts a banner at the top of a page.  It works by doing a
wildcard search for "<body*>".  There are a number of obvious ways to
break this [1]; one of them is by having ">" in an attribute value.  I'm
sure there are many thousands of such programs around the world.
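
To make the failure concrete, here is a minimal Python sketch. It is not mod_layout's actual code: the pattern NAIVE_BODY and the helper insert_banner_naive are hypothetical stand-ins for a "<body*>" wildcard search that stops at the first ">".

```python
import re

# Hypothetical stand-in for a "<body*>" wildcard search (an assumption,
# not taken from mod_layout): match up to the first ">" after "<body".
NAIVE_BODY = re.compile(r"<body[^>]*>", re.IGNORECASE)


def insert_banner_naive(html: str, banner: str) -> str:
    """Insert banner immediately after the first naive "<body...>" match."""
    match = NAIVE_BODY.search(html)
    if match is None:
        return html
    return html[:match.end()] + banner + html[match.end():]


simple = '<body class="main"><p>Hi</p></body>'
tricky = '<body onload="if (a > b) go()"><p>Hi</p></body>'

# The banner lands after the tag in the simple case...
print(insert_banner_naive(simple, "[BANNER]"))
# ...but inside the onload attribute value in the tricky case.
print(insert_banner_naive(tricky, "[BANNER]"))
```

In the second document the match ends at the ">" inside the onload value, so the banner is spliced into the middle of the attribute.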

It sounds like most experts here would prefer to allow ">" in attribute
values in conforming documents, and that's fine.  I don't fully understand
the advantage, but I won't argue against consensus.

--Ben

[1] A JavaScript line like "width<bodywidth && height>bodyheight" would
also break it, as would an appropriately constructed comment.  It might be
possible to construct a regexp for this that functions correctly on all
conforming HTML5 documents.  Such a regexp would be considerably simpler
if ">" were disallowed in attribute values.
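
As a rough illustration of the extra complexity, here is a Python sketch of a quote-aware pattern. The name QUOTE_AWARE_BODY is hypothetical, and this is only a partial fix: it skips over quoted attribute values and rejects names that merely begin with "body", but it does not recognize comments or script content, so it is not a regexp that is correct on all conforming documents.

```python
import re

# Assumed pattern for illustration only. After "<body" it requires
# whitespace, "/" or ">" next, then consumes double-quoted values,
# single-quoted values, or any other character except ">" and quotes.
QUOTE_AWARE_BODY = re.compile(
    r"""<body(?=[\s/>])(?:"[^"]*"|'[^']*'|[^>"'])*>""",
    re.IGNORECASE,
)

tricky = '<body onload="if (a > b) go()"><p>Hi</p></body>'
script_line = "if (width<bodywidth && height>bodyheight) {}"

match = QUOTE_AWARE_BODY.search(tricky)
print(match.group(0))                         # the whole start tag, quoted ">" included
print(QUOTE_AWARE_BODY.search(script_line))   # no match: "bodywidth" is not a body tag
```

If ">" were disallowed in attribute values, the two quoted alternatives could be dropped and the pattern would reduce to "<body" followed by a run of non-">" characters.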

Received on Friday, 25 June 2010 12:34:05 UTC
