[whatwg] Proposal: Add HTMLElement.innerText

On Sat, 14 Aug 2010 20:03:30 -0400, Mike Wilcox <mike at mikewilcox.net>  
wrote:

> Wow, I was just thinking of proposing this myself a few days ago.
>
> In addition to Adam's comments, there is no standard, stable way of  
> *getting* the text from a series of nodes. textContent returns  
> everything, including tabs, white space, and even script content.

Well, you can do stuff like this:

------
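(function() {
     // Strip leading and trailing whitespace.
     function trim(s) {
         return s.replace(/^\s\s*/, '').replace(/\s\s*$/, '');
     }
     // Setting is the easy half: textContent already does the job.
     function setInnerText(v) {
         this.textContent = v;
     }
     // Walk every descendant text node and join the non-empty ones
     // with single spaces.
     function getInnerText() {
         var iter = this.ownerDocument.createNodeIterator(this,
         NodeFilter.SHOW_TEXT, null, null);
         var ret = "";
         var first = true;
         for (var node; (node = iter.nextNode()); ) {
             // Turn line breaks and tabs into spaces rather than deleting
             // them, so "foo\nbar" doesn't come out as "foobar".
             var fixed = trim(node.nodeValue.replace(/\r|\n|\t/g, " "));
             if (fixed.length > 0) {
                 if (!first) {
                     ret += " ";
                 }
                 ret += fixed;
                 first = false;
             }
         }
         return ret;
     }
     // Expose both functions as an accessor property on every HTML element.
     HTMLElement.prototype.__defineGetter__('myInnerText', getInnerText);
     HTMLElement.prototype.__defineSetter__('myInnerText', setInnerText);
})();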
------

and adjust how you handle spaces and build the string etc. as you see fit.  
Then, it's just alert(el.myInnerText).

NodeIterator is standard (DOM Level 2 Traversal). __defineGetter__ and
__defineSetter__ are de-facto standard (and Object.defineProperty is the
standard equivalent in browsers that support it). How newlines, tabs and
spaces are stripped or normalized just isn't standardized in this case,
but that might differ depending on the application anyway.
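
For browsers that do have Object.defineProperty, the same accessor could
be installed roughly like this. It's only a sketch: installMyInnerText is
a made-up helper name, the whitespace handling is just one possible
choice, and SHOW_TEXT is written out as its numeric value so the snippet
also loads outside a browser.

```javascript
// Numeric value of NodeFilter.SHOW_TEXT.
var SHOW_TEXT = 4;

// Getter: collect the descendant text nodes, normalize each one, and
// join the non-empty results with single spaces.
function getInnerText() {
    var iter = this.ownerDocument.createNodeIterator(this, SHOW_TEXT, null);
    var parts = [];
    for (var node; (node = iter.nextNode()); ) {
        // One possible normalization (an assumption, adjust to taste):
        // collapse each whitespace run to a space, then trim the ends.
        var fixed = node.nodeValue.replace(/\s+/g, ' ').replace(/^ | $/g, '');
        if (fixed.length > 0) {
            parts.push(fixed);
        }
    }
    return parts.join(' ');
}

// Setter: textContent already does what is needed.
function setInnerText(v) {
    this.textContent = v;
}

// Hypothetical helper: defines the accessor on whatever prototype it is
// given. In a page that would normally be HTMLElement.prototype.
function installMyInnerText(proto) {
    Object.defineProperty(proto, 'myInnerText', {
        get: getInnerText,
        set: setInnerText,
        configurable: true
    });
}

// Only touch HTMLElement where it actually exists.
if (typeof HTMLElement !== 'undefined') {
    installMyInnerText(HTMLElement.prototype);
}
```

Usage is the same as before: alert(el.myInnerText).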

Or, just run a regex on textContent.
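
Something along these lines, for instance. collapseText is a made-up
name, and treating every run of whitespace as a single space is just an
assumption about what you want:

```javascript
// Hypothetical helper: collapse every run of whitespace (spaces, tabs,
// newlines) into one space, then strip a leading or trailing space.
function collapseText(text) {
    return text.replace(/\s+/g, ' ').replace(/^ | $/g, '');
}

// In a page this would be used as:
//     alert(collapseText(el.textContent));
```

Note that this still leaves any script text in the result, since
textContent includes it.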

-- 
Michael

Received on Sunday, 15 August 2010 05:41:58 UTC