Re: non-XML characters (e.g. #1)

On Mon, 3 Jan 2022 at 16:31, Norm Tovey-Walsh wrote:

> > Then what are you saying above?
> > I provide C0 char in, "it doesn't end up in the output"
> > IMHO that is modifying my data as given to the application?
> But modifying data is what ixml *is for*.

I think we have different interpretations of "modifying".

> You write a grammar that translates some non-XML format into XML. Along
> the way, you decide what items in the non-XML format get turned into
> attributes, what items get turned into elements, what items get output
> as characters, and what items get omitted.

Agreed. But changing Fred into Jane (I think) is not part of the bargain?

OK, I'm being more crude than you, but I hope you can see my point.

> All Steven is saying is that if you write a grammar that accepts input
> that contains C0 control characters, you better make sure all the C0
> control characters get omitted if you’re going to make XML at the end of
> the day.

Which is at the heart of my objection.

> Consider this grammar for amounts of money in GBP (written on the fly
> and untested, YMMV):
>
>     cost: "£"? digit+ ("." digit+)? .
>     -digit: ["0"-"9"] .
>
> If you parse “£1234.56” with that grammar, you get
>
>     <cost>£1234.56</cost>

No problem, you've not messed with my input data.

> Suppose for the sake of argument that “£” was not a valid XML character.
> Then that XML output would be invalid. And that would be because *you*
> wrote a grammar that generated something invalid!

Halt, reset and load.
I *want* this to be wrong, in error, prohibited: i.e. in the
'should not happen' category (or reported as an error, etc.).
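
A minimal sketch of the "reported as an error" behaviour, in Python. It
assumes the check runs on the text a processor is about to serialize; the
helper name check_xml_text is hypothetical and not part of any ixml
implementation. The character ranges are those of the Char production in
XML 1.0.

```python
import re

# Any character outside the XML 1.0 Char production:
# #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
NON_XML_CHAR = re.compile(
    r"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def check_xml_text(text: str) -> str:
    """Return text unchanged, or raise ValueError for the first non-XML character.

    Hypothetical helper: it never drops or rewrites a character, it only
    refuses input that cannot be serialized as XML 1.0.
    """
    match = NON_XML_CHAR.search(text)
    if match:
        codepoint = ord(match.group())
        raise ValueError(
            f"non-XML character #x{codepoint:X} at offset {match.start()}"
        )
    return text
```

With this, check_xml_text("£1234.56") returns the string untouched, while a
string containing #1 raises an error instead of being silently altered.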

> You could instead have written the grammar like this:
>
>     cost: -"£"? digit+ ("." digit+)? .
>     -digit: ["0"-"9"] .
>
> And then you’d get
>
>     <cost>1234.56</cost>
>
> That logic applies for all characters (actually) not valid in XML.
> Does that help?

No, Norm, because it seems you're accepting the spec, with non-XML
characters as 'good to go'. I'm saying it needs changing.
If I'm interpreting Michael's comments correctly, it would make your job
easier too?


Dave Pawson
Docbook FAQ.

Received on Monday, 3 January 2022 16:41:00 UTC