- From: Alex Muir <alex.g.muir@gmail.com>
- Date: Wed, 23 Feb 2011 16:03:55 +0000
- To: XProc Dev <xproc-dev@w3.org>
- Message-ID: <AANLkTimo++DkSNrownJqDuKs7bte=yAt+WpedcCmwTtB@mail.gmail.com>
Hi,

With the following p:exec call to w3m, the result comes back as wrapped lines with the newlines ignored, so all the formatting that w3m adds, whether in the screen output or in > out.txt, is lost.

    <p:viewport match="/SECDocument/html/content/chunk/html">
      <p:exec name="exexHTML2Text" command="/usr/bin/w3m"
              source-is-xml="false" result-is-xml="false"
              wrap-result-lines="true" />
    </p:viewport>

I tried adjusting the code below to work with a Scanner rather than a BufferedReader. It seems there are no newlines in the text returned from the InputStream "is", since the following code produces one line rather than the many that exist in the document.

    if (wrapLines) {
        Scanner s = new Scanner(is);
        s.useDelimiter(System.getProperty("line.separator"));
        // s.useDelimiter("\\r\\n|\\n");
        String line;
        while (s.hasNext()) {
            line = s.next();
            if (showLines) {
                System.err.println(line);
            }
            // Wrap each line of output in its own line element
            tree.addStartElement(c_line);
            tree.startContent();
            tree.addText(line);
            tree.addEndElement();
            tree.addText("\n");
        }
    }

So I'm wondering whether p:exec isn't preserving newlines or whether I'm doing something wrong. Do I need some other option?

Regards
--
Alex
-----
Currently: Freelance Software Engineer 6+ yrs exp <http://www.facebook.com/pages/Bafila/125611807494851>
Previously: https://sites.google.com/a/utg.edu.gm/alex/
A Bafila, is two rivers flowing together as one: http://www.facebook.com/pages/Bafila/125611807494851
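
A minimal standalone check, separate from the processor code, can help narrow down whether the Scanner delimiter or the stream contents are at fault. The sketch below assumes the stream is UTF-8 and splits on explicit newline characters rather than the platform line separator; the class name LineSplitCheck and the method splitLines are hypothetical and exist only for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

// Hypothetical helper for illustration only; not part of any XProc processor.
class LineSplitCheck {

    // Splits the stream on "\n" or "\r\n", independent of the platform's
    // line.separator property. Assumes the bytes are UTF-8 text.
    static List<String> splitLines(InputStream is) {
        List<String> lines = new ArrayList<>();
        try (Scanner s = new Scanner(is, "UTF-8")) {
            s.useDelimiter("\\r?\\n");
            while (s.hasNext()) {
                lines.add(s.next());
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        // Sample input mixing Unix and Windows line endings.
        byte[] sample = "first\nsecond\r\nthird".getBytes(StandardCharsets.UTF_8);
        List<String> lines = splitLines(new ByteArrayInputStream(sample));
        System.out.println(lines.size() + " lines: " + lines);
    }
}
```

If a check like this yields a single line when fed the command's raw output, the newlines were probably already gone before the Scanner saw the stream; if it yields many lines, the delimiter choice is the more likely culprit.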
Received on Wednesday, 23 February 2011 16:04:38 UTC