
[whatwg] base64 entities

From: And Clover <and-py@doxdesk.com>
Date: Fri, 27 Aug 2010 00:28:59 +0200
Message-ID: <4C76EAAB.9090207@doxdesk.com>
On 08/26/2010 10:56 PM, Aryeh Gregor wrote:

> I don't know of any general-purpose way to have
> "</script>" in a string literal (or anywhere else),

The simple approach is to use JavaScript string literal escapes: 
`"\x3C/script>"`.

A JSON encoder may offer an option to keep HTML-special characters out 
of string literals by emitting them as escapes like `\u003C`. The 
resulting literal can be included in a JavaScript block whether or not 
that block is a CDATA element (an HTML4 `<script>` is, an XHTML one is 
not), and so whether or not it would otherwise need HTML-encoding.
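
As a minimal sketch of such an option, assuming the encoding is done in 
JavaScript (the helper name `jsonForScript` is hypothetical, not part of 
any library), the HTML-special characters can be swapped for `\uXXXX` 
escapes after ordinary JSON serialization. In JSON output these 
characters can only occur inside string literals, so the replacement 
does not change the value:

```javascript
// Hypothetical helper: serialize to JSON, then replace the HTML-special
// characters <, > and & with \uXXXX escapes so the result is safe to embed
// in a script block without any HTML-encoding.
function jsonForScript(value) {
  return JSON.stringify(value).replace(/[<>&]/g, function (ch) {
    const hex = ch.charCodeAt(0).toString(16).toUpperCase();
    return '\\u' + hex.padStart(4, '0');
  });
}

const original = '</script><b>Tom & Jerry</b>';
const literal = jsonForScript(original);

// The literal contains no <, > or &, and still parses back to the input.
console.log(literal);
console.log(JSON.parse(literal) === original);
```

The same literal is valid both as JSON and as a JavaScript string 
literal, which is why this style of escaping is convenient here.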

> other than splitting it up like "</scr" + "ipt>".

This is a common but wrong idiom that should be avoided; it won't 
validate because in HTML4 the `</` sequence itself (ETAGO), not the 
complete `</script>` tag, ends a script block.
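
To illustrate, here is a rough sketch that looks for the first ETAGO in 
a script body. The function name `firstEtago` is made up for this 
example, and the regular expression (`</` followed by an ASCII letter) 
is a simplification of the SGML rule, not a validator:

```javascript
// Return the index of the first ETAGO ("</" followed by a letter) in the
// given script source, or -1 if there is none. Simplified illustration only.
function firstEtago(scriptBody) {
  const match = /<\/[A-Za-z]/.exec(scriptBody);
  return match ? match.index : -1;
}

// The split idiom still contains "</s", so an ETAGO is found.
const splitIdiom = 'document.write("</scr" + "ipt>");';

// The escaped form contains no "</" sequence at all in its source text.
const escapedForm = 'document.write("\\x3C/script>");';

console.log(firstEtago(splitIdiom) !== -1);  // true
console.log(firstEtago(escapedForm));        // -1
```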

> elmt.innerHTML = 'Hi there<?php echo htmlspecialchars($name) ?>.';

This is a common error, and a security hole.

Encoding text for use in a JavaScript string literal (`\`-escaping) is 
an entirely different proposition to encoding text for use in HTML 
(entity/character references).
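
A small sketch of why HTML-encoding does not protect a JavaScript string 
literal. The `htmlEscape` function below is only a rough stand-in for 
PHP's `htmlspecialchars` (assumed here to escape `&`, `<`, `>` and both 
quote characters), and `new Function` is used purely to show how a 
JavaScript parser would read the emitted literal:

```javascript
// Rough stand-in for htmlspecialchars (an assumption, not PHP's exact rules).
function htmlEscape(text) {
  const map = { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#039;' };
  return text.replace(/[&<>"']/g, function (ch) { return map[ch]; });
}

// Attacker-supplied name containing no HTML-special characters at all,
// only JavaScript escape sequences written out as plain text.
const name = '\\x3Cimg src=x onerror=alert(1)\\x3E';

// What the server would emit inside the script block.
const emitted = "'Hi there" + htmlEscape(name) + ".'";

// What the JavaScript parser makes of that literal: the \x3C and \x3E
// escapes decode to < and >, so innerHTML would receive live markup.
const decoded = new Function('return ' + emitted + ';')();

console.log(emitted);
console.log(decoded);
```

The HTML encoder has nothing to escape in this input, yet the decoded 
string handed to `innerHTML` contains a tag.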

PHP offers no JS-string-literal-escape function. `addslashes` is very 
close, but won't handle some cases with non-ASCII characters correctly. 
Better to use `json_encode` to transfer the string, then write as text:

     elmt.textContent = <?php echo json_encode('Hi there, ' . $name, JSON_HEX_TAG); ?>;

(assuming an innerText or text node fallback for IE/older browsers.)
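
One way such a fallback might look, as a sketch only. The helper name 
`setText` is hypothetical, and the feature tests are an assumption about 
how one would choose between the three mechanisms:

```javascript
// Hypothetical helper: write plain text into an element without ever
// parsing it as HTML, preferring textContent and falling back as needed.
function setText(element, text) {
  if ('textContent' in element) {
    element.textContent = text;
  } else if ('innerText' in element) {
    // Older IE: no textContent, but innerText is available.
    element.innerText = text;
  } else {
    // Last resort: replace the children with a single text node.
    while (element.firstChild) {
      element.removeChild(element.firstChild);
    }
    element.appendChild(element.ownerDocument.createTextNode(text));
  }
}
```

In the page, the JSON-encoded literal emitted by the server would be 
passed as the `text` argument instead of being assigned to `innerHTML`.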

A 'magic' escaping feature that will somehow guess what sort of encoding 
the author means is wishful (impossible) thinking. A base64-encoded 
entity reference could do nothing for JavaScript, CSS or other nested 
string context.

-- 
And Clover
mailto:and at doxdesk.com
http://www.doxdesk.com/
Received on Thursday, 26 August 2010 15:28:59 UTC
