
[whatwg] Proposal: Add HTMLElement.innerText

From: Michael A. Puls II <shadow2531@gmail.com>
Date: Sun, 15 Aug 2010 11:37:13 -0400
Message-ID: <op.vhhkcbg51ejg13@sandra-svwliu01>
On Sun, 15 Aug 2010 11:17:43 -0400, Mike Wilcox <mike at mikewilcox.net>  
wrote:

> Michael, good try, but I've been down that road; it's pretty hard to do.  
> You left in the script text,

Yeh, forgot about that. I'm grabbing text nodes from anything.

> spaces were missing, and there were no line breaks.

Yes, I did that on purpose because I thought that's what you wanted  
judging by "textContent returns everything, including tabs, white space.."

But, either way, it is indeed more complicated than my example.
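
For what it's worth, here is a rough sketch of a walker that at least skips script and style content and collapses whitespace to single spaces. The names collectText and getInnerTextSketch are made up for illustration, and it still ignores CSS visibility and block-level line breaks, so it is nowhere near a real innerText. It only relies on nodeType, nodeName, nodeValue and childNodes, so it can be tried on plain objects as well as DOM nodes.

```javascript
// Hypothetical helpers for illustration only; not part of any standard.
// Walks a DOM-like tree and gathers text, skipping SCRIPT and STYLE subtrees.
function collectText(node, parts) {
    parts = parts || [];
    if (node.nodeType === 3) {                       // text node
        // Collapse runs of whitespace to one space, then trim the ends.
        var fixed = node.nodeValue.replace(/\s+/g, " ").replace(/^ | $/g, "");
        if (fixed.length > 0) {
            parts.push(fixed);
        }
    } else if (node.nodeType === 1 || node.nodeType === 9 || node.nodeType === 11) {
        // Element, document, or document fragment.
        var name = String(node.nodeName).toUpperCase();
        if (node.nodeType === 1 && (name === "SCRIPT" || name === "STYLE")) {
            return parts;                            // do not descend into these
        }
        var kids = node.childNodes || [];
        for (var i = 0; i < kids.length; i++) {
            collectText(kids[i], parts);
        }
    }
    return parts;
}

function getInnerTextSketch(node) {
    // Join the text of separate text nodes with a single space.
    return collectText(node).join(" ");
}
```

In a browser that would be called as getInnerTextSketch(el). Joining every text node with a space is an assumption; it will put a space inside a word that is split across inline elements.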

> On Aug 15, 2010, at 7:41 AM, Michael A. Puls II wrote:
>
>> On Sat, 14 Aug 2010 20:03:30 -0400, Mike Wilcox <mike at mikewilcox.net>  
>> wrote:
>>
>>> Wow, I was just thinking of proposing this myself a few days ago.
>>>
>>> In addition to Adam's comments, there is no standard, stable way of  
>>> *getting* the text from a series of nodes. textContent returns  
>>> everything, including tabs, white space, and even script content.
>>
>> Well, you can do stuff like this:
>>
>> ------
>> (function() {
>>    function trim(s) {
>>        return s.replace(/^\s\s*/, '').replace(/\s\s*$/, '');
>>    }
>>    function setInnerText(v) {
>>        this.textContent = v;
>>    }
>>    function getInnerText() {
>>        var iter = this.ownerDocument.createNodeIterator(this,
>>        NodeFilter.SHOW_TEXT, null, null);
>>        var ret = "";
>>        var first = true;
>>        for (var node; (node = iter.nextNode()); ) {
>>            var fixed = trim(node.nodeValue.replace(/\r|\n|\t/g, ""));
>>            if (fixed.length > 0) {
>>                if (!first) {
>>                    ret += " ";
>>                }
>>                ret += fixed;
>>                first = false;
>>            }
>>        }
>>        return ret;
>>    }
>>    HTMLElement.prototype.__defineGetter__('myInnerText', getInnerText);
>>    HTMLElement.prototype.__defineSetter__('myInnerText', setInnerText);
>> })();
>> ------
>>
>> and adjust how you handle spaces and build the string etc. as you see  
>> fit. Then, it's just alert(el.myInnerText).
>>
>> NodeIterator is standard. __defineGetter__/__defineSetter__ are de facto  
>> standard (and you have Object.defineProperty as the standard for those  
>> that support it). How newlines and tabs and spaces are  
>> stripped/normalized just isn't standardized in this case. But that  
>> might differ depending on the application.
>>
>> Or, just run a regex on textContent.
>>
>> --
>> Michael
>
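
On the Object.defineProperty route mentioned in the quoted message, a minimal sketch might look like the following. FakeElement is a hypothetical stand-in for HTMLElement so the snippet runs outside a browser, and the getter just normalizes whitespace on textContent rather than walking text nodes.

```javascript
// Hypothetical stand-in for HTMLElement; in a browser you would target
// HTMLElement.prototype instead (an assumption, adjust as needed).
function FakeElement(text) {
    this.textContent = text;
}

// One accessor property replaces the __defineGetter__/__defineSetter__ pair.
Object.defineProperty(FakeElement.prototype, "myInnerText", {
    get: function () {
        // Collapse whitespace runs to one space and trim the ends.
        return this.textContent.replace(/\s+/g, " ").replace(/^ | $/g, "");
    },
    set: function (v) {
        this.textContent = String(v);
    },
    enumerable: true,
    configurable: true
});
```

Reading el.myInnerText then runs the getter and assigning to it runs the setter, same as with the __defineGetter__ version.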

-- 
Michael
Received on Sunday, 15 August 2010 08:37:13 UTC
