Re: Round-tripping ixml? from Norm Tovey-Walsh on 2022-10-18 (public-ixml@w3.org from October 2022)

From: Norm Tovey-Walsh <norm@saxonica.com>
Date: Tue, 18 Oct 2022 09:24:59 +0100
To: Michal Měchura <michmech@lexiconista.com>
Cc: public-ixml@w3.org
Message-ID: <m2a65ty1dr.fsf@saxonica.com>

> A newbie here with a newbie question. Is there an ixml processor
> anywhere that supports round-tripping? That is, not only parsing from
> “not XML” into XML, but also generating/linearizing/flattening from
> XML into “not XML”.

Not that I’m aware of, though several folks on the CG have expressed an
interest, I believe.

> I don’t seem to be able to find anything like that anywhere. All the
> implementations seem to be parsers, and only parsers. Even the ixml
> specification talks only about parsing, no mention of the other
> direction anywhere. This surprises me.

In the fully general case, the problem is intractable.

Grammars can lose information. Consider:

S = 'a', -'.'
  | 'a', -'?'
  | 'a', -'!' .

Given <S>a</S>, it’s impossible to know what the input was.

That’s a toy example, but the same kind of thing happens in “real”
grammars. In the iXML grammar for iXML, for example, a string doesn’t
preserve whether it was delimited by single or double quotes.
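
To make the information loss in the toy grammar for S concrete, here is a
minimal Python sketch. It is not an ixml processor: the table
ALTERNATIVES and the functions parse_s and candidate_inputs are
hypothetical names invented for this illustration, and the model only
covers the three alternatives of S shown above.

```python
# Toy model of the grammar for S above; not a real ixml processor.
# Each alternative is a list of (terminal, deleted) pairs, where
# deleted=True mirrors the "-" mark that drops the terminal from the output.
ALTERNATIVES = [
    [("a", False), (".", True)],
    [("a", False), ("?", True)],
    [("a", False), ("!", True)],
]


def parse_s(text):
    """Return the XML produced for text, or None if no alternative matches."""
    for alt in ALTERNATIVES:
        if text == "".join(term for term, _ in alt):
            kept = "".join(term for term, deleted in alt if not deleted)
            return "<S>" + kept + "</S>"
    return None


def candidate_inputs(xml):
    """List every input string that parses to the given XML."""
    inputs = ["".join(term for term, _ in alt) for alt in ALTERNATIVES]
    return [text for text in inputs if parse_s(text) == xml]


print(candidate_inputs("<S>a</S>"))  # prints ['a.', 'a?', 'a!']
```

All three inputs map to the same output, so a serializer going the
other way would have to pick one of them arbitrarily, or be told which
one to prefer.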

I still think it’s an interesting problem, and my guess is that for a
lot of grammars you could do a plausible job for a lot of inputs. But I
haven’t had time to think seriously about implementing it.

                                        Be seeing you,
                                          norm

--
Norm Tovey-Walsh
Saxonica

Received on Tuesday, 18 October 2022 08:36:33 UTC