- From: Norm Tovey-Walsh <norm@saxonica.com>
- Date: Thu, 06 Jan 2022 09:58:38 +0000
- To: "C. M. Sperberg-McQueen" <cmsmcq@blackmesatech.com>
- Cc: ixml <public-ixml@w3.org>
Received on Thursday, 6 January 2022 10:05:15 UTC
> since multiple raw parse trees may turn into the same XML. And
> since it’s not easy or cheap, detecting ambiguity maybe needs to be
> downgraded to a SHOULD or MAY.

Assuming that we took the position “if it has multiple parses, it’s ambiguous”, I was thinking about the problem of parses that produce the same XML. I was imagining an option on my implementation for “show me all the parses” vs an option for “show me all the different XML parses”.

I got 54 different parses out of one of my first test cases (perhaps erroneously, given my continuing frustrations with the attempt to use someone else’s parser), but I’m reasonably sure that there’s only one XML result. Constructing 54 trees and doing deep-equal on them (for some definition of “deep” and “equal”) seems like it might be expensive.

Then I wondered: if you just walked over each tree, constructing a cryptographic hash from the names of start tags and the character output, I think identical hashes would be indicative of identical XML output, and computing them would probably be cheaper than materializing all the trees and comparing them.

But I was washing dishes at the time, so it doesn’t count as careful consideration, just an idea.

Be seeing you,
norm

--
Norm Tovey-Walsh
Saxonica
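
The hashing idea above might look roughly like the following Python sketch. Everything in it is an assumption made for illustration: the `Node` class, its `hidden` flag (standing in for an ixml “-” mark, where the element is dropped but its content kept), and the function name `xml_fingerprint` are all hypothetical and are not taken from any implementation. It also hashes an end-of-element marker in addition to the start-tag names and character output mentioned in the message, since without one `<a><b/><c/></a>` and `<a><b><c/></b></a>` would hash alike. Attributes are left out for brevity.

```python
import hashlib
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Node:
    """Hypothetical raw parse-tree node (not from any real ixml processor)."""
    name: str
    children: List[Union["Node", str]] = field(default_factory=list)
    hidden: bool = False  # assumed stand-in for an ixml "-" mark


def xml_fingerprint(root: Node) -> str:
    """Hash the serialization events of a tree without building the XML."""
    h = hashlib.sha256()
    pending = []  # text is buffered so "1" + "2" hashes the same as "12"

    def emit(kind: bytes, payload: str) -> None:
        data = payload.encode("utf-8")
        # Length-prefix each event so adjacent events cannot run together.
        h.update(kind + len(data).to_bytes(8, "big") + data)

    def flush() -> None:
        text = "".join(pending)
        pending.clear()
        if text:
            emit(b"T", text)

    def walk(item: Union[Node, str]) -> None:
        if isinstance(item, str):
            pending.append(item)
            return
        if not item.hidden:
            flush()
            emit(b"S", item.name)  # start tag
        for child in item.children:
            walk(child)
        if not item.hidden:
            flush()
            emit(b"E", "")  # end tag (an addition to the idea as stated)

    walk(root)
    flush()
    return h.hexdigest()


# Two raw parses that differ only in hidden nonterminals, and one that
# really does produce different XML.
a = Node("date", [Node("digits", [Node("d", ["1"], hidden=True),
                                  Node("d", ["2"], hidden=True)])])
b = Node("date", [Node("digits", [Node("dd", ["12"], hidden=True)])])
c = Node("date", [Node("digits", ["1"]), Node("digits", ["2"])])

distinct = {xml_fingerprint(t) for t in (a, b, c)}
```

Here `a` and `b` should share a fingerprint and `c` should not, so `distinct` would hold two entries. Collecting fingerprints in a set in this way is one possible route to “show me all the different XML parses”. Note that the sketch still visits every node of every parse tree; what it avoids is building the XML trees and comparing them pairwise.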