- From: Bethan Tovey-Walsh <bytheway@linguacelta.com>
- Date: Thu, 13 Feb 2025 17:53:43 +0000
- To: David Birnbaum <djbpitt@gmail.com>
- Cc: ixml <public-ixml@w3.org>
Might it be worth asking whether Gunther would be willing to add the option to produce escaped characters instead of CDATA?

I don't think there's any iXML-internal way of fixing this, because even if you use insertions to replace reserved characters with their entity references, presumably MarkupBlitz would still interpret the inserted string "&amp;" as a string starting with a reserved character, and stick it in a CDATA section. The only thing you could do would be to fudge it: insert some placeholder instead of the reserved character, and post-process to replace the placeholder with an entity reference. In which case, you might just as well save yourself the trouble and post-process the original CDATA with XSLT or similar.

BTW

***
Dr. Bethan Tovey-Walsh
linguacelta.com <http://linguacelta.com/>
Golygydd | Editor
http://geirfan.cymru <http://geirfan.cymru/>
You are welcome to write to me in Welsh.

> On 13 Feb 2025, at 16:38, David Birnbaum <djbpitt@gmail.com> wrote:
>
> Thanks, Fredrik, and John, for the quick responses. Getting rid of the CDATA marked section (in favor of &amp;) downstream isn't a problem, but I was wondering whether it was possible within ixml, and I understand why ixml might reasonably consider that type of control out of scope. Perhaps a candidate for a pragma, should an ixml processor opt to put that decision under user control?
>
> On Thu, Feb 13, 2025 at 11:33 AM John Lumley <john@saxonica.com> wrote:
> My processor (https://johnlumley.github.io/jwiXML.xhtml) uses fn:serialize() in SaxonJS as the serializer of the XML parse result, so
>
>     S: ~[].
>
> with & as input, produces
>
>     <S>&amp;</S>
>
> John Lumley
> Sent from my iPad
>
>> On 13 Feb 2025, at 15:57, David Birnbaum <djbpitt@gmail.com> wrote:
>>
>> Dear public-ixml,
>>
>> Is there an ixml idiom for ingesting reserved characters (ampersand, angle brackets) and replacing them with XML entities? When I parse a plain-text input document that contains an ampersand using Markup Blitz or xmq, the output element creates a CDATA marked section for the entire content, so that, for example, when:
>>
>>     "Wynken, Blynken & Nod"
>>
>> matches the production for a <title> element, it emerges as
>>
>>     <title><![CDATA["Wynken, Blynken & Nod"]]></title>
>>
>> What I'd prefer is:
>>
>>     <title>"Wynken, Blynken &amp; Nod"</title>
>>
>> Thanks in advance for any advice!
>>
>> Sincerely,
>>
>> David
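
For anyone following the thread, the downstream clean-up discussed above (re-serializing the processor's output so that the CDATA section becomes escaped character data) might look roughly like the sketch below. It is only an illustration, not anything Markup Blitz or xmq provides: it uses Python's standard xml.etree.ElementTree in place of the XSLT mentioned in the thread, and the function name cdata_to_escaped is made up for this example.

```python
import xml.etree.ElementTree as ET


def cdata_to_escaped(xml_text: str) -> str:
    """Re-serialize XML so CDATA content comes out as escaped character data.

    Hypothetical helper for illustration; not part of any ixml processor.
    """
    # The parser treats a CDATA section as ordinary text, so the element's
    # text here is just: "Wynken, Blynken & Nod" (with a literal ampersand).
    root = ET.fromstring(xml_text)
    # ElementTree never writes CDATA sections; it escapes &, < and > in text.
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    source = '<title><![CDATA["Wynken, Blynken & Nod"]]></title>'
    print(cdata_to_escaped(source))
    # prints: <title>"Wynken, Blynken &amp; Nod"</title>
```

Because CDATA sections and escaped text are equivalent to an XML parser, any parse-and-reserialize step with a serializer that does not emit CDATA should give the same effect; an XSLT identity transform would be the analogous approach.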
Received on Thursday, 13 February 2025 17:54:06 UTC