[whatwg] On separation of code and data

In this email I sketch my personal view on the future of HTML. I do not
claim I have invented a solution to all security issues on the web; I merely
hope that this email will be the starting point of a real discussion on web
security instead of browser security. Since this is my first email to any of
these mailing lists, please correct me if I sent this to the wrong one.

The current design of the integration of JS and HTML is fundamentally flawed.
One of the major lessons which should be learned from the long history of
buffer overflows etc. is that one should never mix up data and code.
The consequences of this design flaw show themselves in XSS exploits
on a daily basis. In my view the browser/standards developers can do 2 things:
  - They can ignore it and state that it is a problem of the web developers.
  - They can fix it in a manner similar to how "No Exec" is implemented in
    CPU architectures.

The first approach will never work: the average web developer simply cannot
foresee all possible XSS exploits. This method was attempted for numerous years
in the buffer overflow world... IMHO the average web developer is less
skilled than the average application developer.

The second solution might seem a utopia, but I think it is realizable. I
hope that in 2 to 5 years browsers will have a special "Secure Mode" which
assures web designers that their website is not vulnerable to XSS in these
browsers.

The first step in implementing this "No Exec" strategy is that there needs
to be a clear distinction between JS and HTML. This can be achieved quite
easily by preventing any JS operation inside an HTML/CSS file. Everything
related to events and code execution should be in JS files.

Thus instead of creating

 <a href=# onclick="DoFunction()" id=123 >

we write the markup without the handler and attach the event from a JS file:

 <a href=#  id=123 >
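
A minimal sketch of what the JS file for this link could contain, assuming
the handler is attached with the standard addEventListener call. The file
name events.js, the wireEvents helper and the body of DoFunction are
assumptions made for illustration; only the id and the handler name come
from the example.

```javascript
// events.js (hypothetical file name): all event wiring lives here, none in the HTML.

// DoFunction is the handler named in the example; its body is a placeholder.
function DoFunction() {
  return "clicked";
}

// Attach the click handler to the link with id=123.
// `doc` is passed in so the sketch can be exercised outside a browser;
// in a page you would call wireEvents(document).
function wireEvents(doc) {
  const link = doc.getElementById("123");
  if (link === null) {
    return false; // nothing to wire on this page
  }
  link.addEventListener("click", DoFunction);
  return true;
}

// In a browser, run once the markup has been parsed.
if (typeof document !== "undefined") {
  document.addEventListener("DOMContentLoaded", () => wireEvents(document));
}
```

The HTML file then only carries the id, and the decision about what runs on
a click is made entirely on the JS side.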


JS functions found in HTML files should not be executed; the JS code should be
rendered visually instead.
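
To make the rule concrete, here is a rough sketch of a check that flags JS
embedded in markup. The function name findInlineJs and the three patterns are
my own assumptions; a regex scan only approximates what a real HTML parser
sees (an attribute such as "one=" would be flagged wrongly), so this is an
illustration of the rule and not a security filter.

```javascript
// Rough patterns for three places where JS can hide in markup.
// No /g flag, so RegExp.prototype.test stays stateless between calls.
const INLINE_JS_PATTERNS = [
  { kind: "event-handler attribute", re: /<[^>]*\son[a-z]+\s*=/i },
  { kind: "inline <script> block", re: /<script(?![^>]*\ssrc\s*=)[^>]*>/i },
  { kind: "javascript: URL", re: /(?:href|src)\s*=\s*["']?\s*javascript:/i },
];

// Return the kinds of embedded JS found in an HTML string (empty array if none).
function findInlineJs(html) {
  return INLINE_JS_PATTERNS.filter((p) => p.re.test(html)).map((p) => p.kind);
}
```

Applied to the two links above, the first one (with onclick) is flagged as
an event-handler attribute and the second one is not flagged at all.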

The next step is that we make a clear distinction in our DOM tree between what
is executable and what is not. We might even consider creating 2 trees, one
with the executable code in it (retrieved from JS files), the other one with
the data (HTML, CSS files). None of the objects in the data tree should ever be
executed. At this moment I cannot foresee the exact consequences of, and
limitations on, the communication between these 2 trees.
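
One possible shape for the two trees, purely as a thought experiment: the
data tree holds frozen records that refuse anything executable, the code
tree holds handlers loaded from JS files, and a single lookup is the only
bridge between them. All names here (dataNode, codeTree, registerHandler,
dispatch) are hypothetical, and a real browser design would have to settle
far more than this sketch does.

```javascript
// Data tree: plain, frozen records. Functions and on* attributes are rejected,
// so nothing stored here can ever be invoked.
function dataNode(tag, attrs = {}, children = []) {
  for (const [name, value] of Object.entries(attrs)) {
    if (typeof value === "function" || /^on/i.test(name)) {
      throw new TypeError(`executable attribute "${name}" not allowed in data tree`);
    }
  }
  return Object.freeze({
    tag,
    attrs: Object.freeze({ ...attrs }),
    children: Object.freeze([...children]),
  });
}

// Code tree: handlers from JS files, keyed by the id of the data node they act on.
const codeTree = new Map();

function registerHandler(id, eventType, fn) {
  if (typeof fn !== "function") {
    throw new TypeError("handler must be a function");
  }
  codeTree.set(`${id}:${eventType}`, fn);
}

// The only bridge between the trees: look up a handler by id and event type
// and pass it the frozen data node as an argument.
function dispatch(node, eventType) {
  const fn = codeTree.get(`${node.attrs.id}:${eventType}`);
  return fn ? fn(node) : undefined;
}
```

With this split, dataNode("a", { onclick: "DoFunction()" }) throws, while
dataNode("a", { href: "#", id: "123" }) plus a registerHandler("123",
"click", ...) call reproduces the link example with code and data kept apart.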

Most XSS exploits might be prevented by this design, but there are still
debatable issues:
   - What if JS uses eval on a data block?
     * Prohibit the use of eval?
   - What if the JS file contains some user-generated content?
     e.g. a PHP script will generate the JS code and write a line like

     If I select the username <alert(hello)> then this script is still
     vulnerable to XSS

     * A solution might be that only specially signed scripts can be run in
       "Secure Mode". One can get this signature for free, but has to wait
       2 hours for it. In practice this will prevent people from doing these
       nasty server-side things.

     The solution to this practical username example would be to include a
     special div <div id=username>$PHPusername</div> and then read the
     content of the div in the JS file.
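
The div approach and the eval question can be sketched together. The
helper names readUsername and parseDataBlock are assumptions for
illustration; textContent and JSON.parse are the standard DOM and JS
facilities. The idea is that the JS file stays static and only ever treats
user input as text or as parsed values, never as code.

```javascript
// Read the user-supplied name as plain text from the data side.
// `doc` is passed in so the sketch runs outside a browser; use `document` in a page.
function readUsername(doc) {
  const el = doc.getElementById("username");
  return el === null ? "" : el.textContent; // textContent yields text, not markup
}

// Structured data can be parsed with JSON.parse, which only builds values;
// unlike eval, it cannot run code embedded in the input.
function parseDataBlock(text) {
  try {
    return JSON.parse(text);
  } catch (err) {
    return null; // not valid JSON: treat as "no data" rather than executing it
  }
}
```

In today's browsers the server would still have to HTML-escape the value it
writes into the div; under the proposed "Secure Mode" that content would sit
in the data tree and never be executed.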

One might wonder how to integrate this into the currently available websites.
I think an opt-in option would be sufficient; this might be taken into account
in HTML5, but we have to give the web developer some candy for putting this
effort into his website. The new tool we will give him is cross-domain XHR.
Since we can exclude all XSS attacks, most of the fundamental problems with
cross-site XHR are automatically solved.

Separating code and data will simplify the parsers for both filetypes and
improve render performance and maybe even decrease render bugs...

Received on Wednesday, 6 June 2007 23:33:24 UTC