- From: David Birnbaum <djbpitt@gmail.com>
- Date: Thu, 13 Feb 2025 17:23:06 -0500
- To: Gunther Rademacher <grd@gmx.de>
- Cc: public-ixml@w3.org
- Message-ID: <CAP4v81qY-PZpcxHyOBCWE4QABrF=JLpGCRx2Z-KU0Q7O5hVF4g@mail.gmail.com>
Thanks, Gunther; this is very helpful. I just rechecked the ixml step in the pipeline and both Markup Blitz and xmq do, indeed, escape the ampersand with an &amp; character entity, and not with a CDATA marked section. Now to figure out where the CDATA came from!

On Thu, Feb 13, 2025 at 5:13 PM Gunther Rademacher <grd@gmx.de> wrote:

> David,
>
> the result serialization of Markup Blitz never creates a CDATA section,
> here is the code:
>
> https://github.com/GuntherRademacher/markup-blitz/blob/9dbba6ecff48d489449dd04d87beda1ac98cbf35/src/main/java/de/bottlecaps/markup/blitz/Parser.java#L360-L402
>
> So running this command:
>
> java -jar markup-blitz.jar '!title : ~[]*.' '!"Wynken, Blynken & Nod"'
>
> creates this output:
>
> <?xml version="1.0" encoding="utf-8"?><title>"Wynken, Blynken &amp; Nod"</title>
>
> Could it be that some tool further processed the serialized result before
> you got to see it?
>
> Best regards,
> Gunther
>
> Sent: Thursday, 13 February 2025 at 16:57
> From: "David Birnbaum" <djbpitt@gmail.com>
> To: ixml <public-ixml@w3.org>
> Subject: Reserved characters in input?
>
> Dear public-ixml,
>
> Is there an ixml idiom for ingesting reserved characters (ampersand, angle
> brackets) and replacing them with XML entities? When I parse a plain-text
> input document that contains an ampersand using Markup Blitz or xmq, the
> output element creates a CDATA marked section for the entire content, so
> that, for example, when:
>
> "Wynken, Blynken & Nod"
>
> matches the production for a <title> element, it emerges as
>
> <title><![CDATA["Wynken, Blynken & Nod"]]></title>
>
> What I'd prefer is:
>
> <title>"Wynken, Blynken &amp; Nod"</title>
>
> Thanks in advance for any advice!
>
> Sincerely,
>
> David
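
The two serializations compared in this thread carry the same character data as far as an XML parser is concerned, which is why a later pipeline step can switch between them without changing the content. The following is only a minimal sketch of that point, assuming Python's standard xml.etree.ElementTree module; it is not one of the tools used in the thread, and the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

# The two forms discussed above: a CDATA marked section versus an
# escaped ampersand.
cdata_form = '<title><![CDATA["Wynken, Blynken & Nod"]]></title>'
entity_form = '<title>"Wynken, Blynken &amp; Nod"</title>'

# After parsing, both forms yield the same text content.
cdata_text = ET.fromstring(cdata_form).text
entity_text = ET.fromstring(entity_form).text
same_text = cdata_text == entity_text

# Re-serializing the CDATA form with ElementTree escapes the ampersand
# instead of emitting a CDATA section.
reserialized = ET.tostring(ET.fromstring(cdata_form), encoding="unicode")

print(same_text)      # True
print(reserialized)   # <title>"Wynken, Blynken &amp; Nod"</title>
```

Which form appears in a file therefore depends on whichever serializer wrote it last, so checking the serialization settings of each step after the ixml parse is one way to look for the source of the CDATA section.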
Received on Thursday, 13 February 2025 22:23:22 UTC