- From: Arnaud Desitter <arnaud02@users.sourceforge.net>
- Date: Tue, 22 Jan 2008 13:56:18 +0000
- To: "John Snelson" <john.snelson@oracle.com>
- Cc: html-tidy@w3.org
On 22/01/2008, John Snelson <john.snelson@oracle.com> wrote:
> Arnaud Desitter wrote:
> > On 22/01/2008, John Snelson <john.snelson@oracle.com> wrote:
> >> Is there a better way to do what I want? I would be quite happy to
> >> implement a new API method to do this if that's required - does anyone
> >> else think this would be useful?
> >
> > Please refer to http://tidy.sf.net/issue/1636028.
> > Your contribution to a new API would be welcome. Please post it using the
> > tidy patch tracker.
>
> Thanks for the pointer. From the bug report linked, it's not obvious
> what the correct way to fix this is. Should I change tidyNodeGetText()
> to return the unescaped value of the node, or should I add a new method?

From the bug reports, please add a new function.

> Here's what I propose - I'll add a new method:
>
> Bool tidyNodeGetValue( TidyDoc tdoc, TidyNode tnod, TidyBuffer* buf );
>
> For attribute, text, comment, and processing instruction nodes this
> method will fill the buffer with the value of the node. The value will
> be unescaped, and not serialized (no "<!--" or "<?" etc.).
>
> Some questions:
>
> 1) Are there other node types the method should work for?
> 2) Should I respect the specified output encoding, or use UTF-8? (For
> instance, the tidyNodeGetName() function always returns UTF-8)

Could you add that to include/tidy.h please?

> 3) What should I do about unrepresentable characters?

IMO, UTF8 is a good choice. Bjorn or others may comment.
Because it is a new function, there is no backward compatibility issue
so it can be modified until it feels right.

Regards,

> John
>
> --
> John Snelson, Oracle Corporation http://snelson.org.uk/john
> Berkeley DB XML: http://www.oracle.com/database/berkeley-db/xml
> XQilla: http://xqilla.sourceforge.net
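
To make the "unescaped, and not serialized" distinction concrete, here is a rough standalone sketch in C. It does not use libtidy at all: SketchKind and sketch_node_value() are hypothetical names invented for this illustration, it only handles text, comment and processing instruction input, and it only decodes the five predefined entities. It is meant to show the kind of result the proposed tidyNodeGetValue() would put in the buffer, not how it would be implemented.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical node kinds for this sketch; not libtidy's TidyNodeType. */
typedef enum { SK_TEXT, SK_COMMENT, SK_PROCINS } SketchKind;

/* Copy at most outsz-1 bytes of src[0..n) into out and NUL-terminate. */
static void copy_span(const char *src, size_t n, char *out, size_t outsz)
{
    if (n >= outsz) n = outsz - 1;
    memcpy(out, src, n);
    out[n] = '\0';
}

/* Decode the five predefined entities; anything else is copied verbatim. */
static void unescape_text(const char *src, char *out, size_t outsz)
{
    static const struct { const char *ent; char ch; } map[] = {
        { "&amp;", '&' }, { "&lt;", '<' }, { "&gt;", '>' },
        { "&quot;", '"' }, { "&apos;", '\'' }
    };
    size_t o = 0;
    while (*src != '\0' && o + 1 < outsz) {
        size_t i, len = 0;
        char ch = *src;
        for (i = 0; i < sizeof map / sizeof map[0]; ++i) {
            size_t n = strlen(map[i].ent);
            if (strncmp(src, map[i].ent, n) == 0) {
                ch = map[i].ch;
                len = n;
                break;
            }
        }
        out[o++] = ch;
        src += len ? len : 1;
    }
    out[o] = '\0';
}

/* Hypothetical helper: derive a node "value" from its serialized form.
 * Returns 1 on success, 0 if the expected delimiters are missing. */
int sketch_node_value(SketchKind kind, const char *ser, char *out, size_t outsz)
{
    const char *open, *close;
    size_t len, lo, lc;

    assert(ser != NULL && out != NULL && outsz > 0);
    if (kind == SK_TEXT) {
        unescape_text(ser, out, outsz);
        return 1;
    }
    open  = (kind == SK_COMMENT) ? "<!--" : "<?";
    close = (kind == SK_COMMENT) ? "-->"  : "?>";
    len = strlen(ser);
    lo = strlen(open);
    lc = strlen(close);
    if (len < lo + lc || strncmp(ser, open, lo) != 0 ||
        strcmp(ser + len - lc, close) != 0)
        return 0;
    /* Drop the delimiters and keep only the content between them. */
    copy_span(ser + lo, len - lo - lc, out, outsz);
    return 1;
}
```

With this sketch, the serialized text "a &lt; b" yields the value "a < b", and the serialized comment "<!-- note -->" yields " note " with the delimiters removed. A real implementation would presumably read the value from the parsed node rather than re-parse serialized output.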
Received on Tuesday, 22 January 2008 13:58:07 UTC