Re: simple shorthand syntax proposal from Jonas Sicking on 2009-02-06 (public-html@w3.org from February 2009)

From: Jonas Sicking <jonas@sicking.cc>
Date: Fri, 6 Feb 2009 02:34:50 -0500
To: Robert J Burns <rob@robburns.com>
Cc: Alexey Feldgendler <alexeyf@opera.com>, public-html@w3.org
Message-ID: <63df84f0902052334p68760882la35ec6b3937e46d5@mail.gmail.com>

On Thu, Feb 5, 2009 at 4:37 PM, Robert J Burns <rob@robburns.com> wrote:
>
> Hi Håkon,
>
> On Feb 5, 2009, at 1:23 PM, Alexey Feldgendler wrote:
>
>>
>> On Thu, 05 Feb 2009 15:32:16 +0100, Håkon Wium Lie <howcome@opera.com>
>> wrote:
>>
>>> And to summarize that discussion: some people think the proposal goes
>>> against the requirement for backwards compatibility, others think it's
>>> too late to introduce it, and some think the syntax is so compelling
>>> that the costs could be worth it.
>>>
>>> I'm in the last category, but -- after reading the discussion -- it
>>> seems that consensus will be out of reach for now. Let's keep it in
>>> mind for HTML6.
>>
>> When time comes to develop HTML6, exactly the same argument will be valid,
>> and exactly the same considerations would lead to it being kept in mind for
>> HTML7. Therefore, the only two reasonable choices are now or never.
>
> Perhaps this could be something brought into the HTML5 parser. It could also
> be a separate parser, with a corresponding separate serialization algorithm,
> and a separate serialized form definition.
>
> For example the HTML parsing does not now prohibit "." or "#" from tag names
> (not that I think there is a legacy content support issue worth raising here
> though). So it is something that could be brought into the HTML5 parser
> without breaking legacy content, but it does require blocking such tag names
> (which seems fine to me). It's not backwards compatible only in the weak
> sense that current parsing isn't handled that way, so using it for actually
> deployed content would have to wait until all targeted UAs supported the
> syntax.

One concern would be that if something like:

<script#foo>doEvil();</script>

is interpreted by browsers as a script element, it's quite likely that
this would introduce XSS vulnerabilities in many sites.
Many sites today use primitive HTML parsers combined with blacklists
of element names to filter out dangerous content from things like blog
comments and MySpace pages. It's a very unfortunate practice, but
unfortunately also quite widespread.
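
To make the concern concrete, here is a minimal sketch of the kind of
primitive filter described above. Everything in it is an assumption made
for illustration: the blacklist contents, the regex tokenizer, the choice
to inspect only start tags, and the rule that the shorthand element name
ends at the first "#" or "." are not taken from any real site or from the
proposal text.

```python
import re

# Hypothetical blacklist, as a naive comment filter might define it.
BLACKLIST = {"script", "iframe", "object", "embed"}

# Naive tokenizer (assumption): a start-tag name is everything after "<"
# up to whitespace, "/", or ">". End tags are not inspected.
START_TAG_RE = re.compile(r"<([^\s/>]+)")


def naive_filter_allows(html: str) -> bool:
    """Return True if no start-tag name in `html` is on the blacklist."""
    names = (m.group(1).lower() for m in START_TAG_RE.finditer(html))
    return not any(name in BLACKLIST for name in names)


def shorthand_element_name(tag_name: str) -> str:
    """Element name a browser would see if the shorthand were adopted
    (assumed rule: the name ends at the first '#' or '.')."""
    return re.split(r"[#.]", tag_name, maxsplit=1)[0].lower()
```

Under these assumptions the filter rejects "<script>doEvil();</script>"
but lets "<script#foo>doEvil();</script>" through, because it compares the
whole token "script#foo" against the blacklist, while a shorthand-aware
browser would resolve that same token to a script element.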

We could introduce some sort of blacklist of elements for which the #
or . syntax is not supported, but we'd have to do some research into
which elements sites are commonly trying to filter out.
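
A rough sketch of what such a parser-side blacklist might look like
follows. The names NO_SHORTHAND and parse_tag_name are hypothetical, the
list contents are placeholders pending the research mentioned above, and
the fallback behaviour (keep the whole token as a literal tag name, as
parsers do today) is only one possible choice.

```python
import re

# Hypothetical deny list; the real contents would need research.
NO_SHORTHAND = {"script", "style", "iframe", "object", "embed"}


def parse_tag_name(raw):
    """Split e.g. 'div#main.note' into (name, id, classes).

    If the token is not well-formed shorthand, or the element is on the
    deny list, the whole token is kept as a literal tag name.
    """
    m = re.match(r"([^#.]+)((?:[#.][^#.]+)*)$", raw)
    if m is None or m.group(1).lower() in NO_SHORTHAND:
        return raw.lower(), None, []
    element_id = None
    classes = []
    for marker, value in re.findall(r"([#.])([^#.]+)", m.group(2)):
        if marker == "#":
            element_id = value  # if repeated, the last id wins
        else:
            classes.append(value)
    return m.group(1).lower(), element_id, classes
```

With this sketch, "div#main.note" becomes a div with id "main" and class
"note", while "script#foo" stays an unknown element literally named
"script#foo" and so never runs as a script.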

In general I'm not convinced that the major change in HTML syntax is
worth the convenience for authors.

/ Jonas

Received on Friday, 6 February 2009 07:35:26 UTC