Towards a default processing model for XML

1. Extracts from XProc minutes

Richard: Use cases would be a good place to start. I've long imagined that one such use case is to answer the question "what does a web browser do with an XML document?"
Richard: In XML Core yesterday, when we were talking about when xml:id processing occurs, that's the sort of thing that I thought this model might help us describe.
Paul: So things like when XInclude processing occurs by default.
Paul: So, the default processing model would define some default processing that you do on a document and you end up with an infoset and that infoset is special. It's the more official or default infoset. And because that's the more official one, that's the one that establishes "meaning".
Henry: The relatively neutral term that the TAG uses for this is the elaborated infoset.
Henry: Murray Maloney raised an objection to GRDDL going forward because it didn't answer the question of whether it operated on the pre-XIncluded document or the post-XIncluded document.
Henry: One way to think about this is that defining the elaborated infoset would allow specs to say, other things being equal, start here.
PG: Surprised to see you mention XInclude, XML Sig, XML Encryption, but not xml:id and xml:base. HST: You're right, an oversight, consider them added.
Henry: For a long time I have wanted to include decryption and signature checking, because I think the world would be a better place if use of the XML security technologies was much more widespread. But I've finally given up: of necessity, decryption and signature verification involve out-of-band appeal to key files and passphrases. Without those, the data just isn't secure. And you may need more than one set of them for a given document. This just doesn't fit well with a notion of default processing model which is pervasive, simple, and often unattended. Good news: Without these, since XInclude is itself recursively specified, we don't have to implement fixed-point detection for the DXPM in XProc. So maybe we write an XProc pipeline which implements the DXPM.
Straw man: An XProc pipeline consisting of an XInclude step (modulo some uncertainties wrt xml:id).
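
As a rough illustration of the straw man (not an XProc pipeline, and not anything the minutes specify), here is a minimal Python sketch of a single XInclude pass producing an "elaborated" tree. It assumes Python 3.9 or later, where the standard library's xml.etree.ElementInclude also expands includes nested inside included documents. The document names, the DOCS table, and the loader and elaborate functions are all hypothetical, made up for this example.

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

# Hypothetical in-memory documents, keyed by made-up hrefs, so the
# sketch needs no files or network access.
DOCS = {
    "main.xml": (
        '<doc xmlns:xi="http://www.w3.org/2001/XInclude">'
        '<xi:include href="chapter.xml"/>'
        "</doc>"
    ),
    "chapter.xml": (
        '<chapter xmlns:xi="http://www.w3.org/2001/XInclude">'
        "<title>One</title>"
        '<xi:include href="note.txt" parse="text"/>'
        "</chapter>"
    ),
    "note.txt": "plain text",
}


def loader(href, parse, encoding=None):
    """Resolve an include target from DOCS instead of the file system."""
    if parse == "xml":
        return ET.fromstring(DOCS[href])
    return DOCS[href]


def elaborate(href):
    """Parse a document and run one XInclude step over it (assumed DXPM stand-in)."""
    root = ET.fromstring(DOCS[href])
    # A single call suffices: nested includes are expanded as part of
    # the same pass, so no outer fixed-point loop is written here.
    ElementInclude.include(root, loader=loader)
    return root
```

Calling elaborate("main.xml") returns a tree in which no xi:include elements remain. The sketch does nothing about xml:id or xml:base, which is where the "modulo some uncertainties" caveat applies.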
There's a chicken and egg problem. Imagine two stages: we publish a DXPM spec; we publish a new edition of XInclude which references the new DXPM spec. We won't get everything we want until the second step [because XInclude as written doesn't support xml:id].
[Would 3023-bis solve this problem for us?]
PG: Do we have to worry about schemas? HT: Yes, because of the way we wrote XPointer wrt IDness.
HT: External entities. PG: They get expanded, don't they? HT: Not by all the browsers [there's a thread running wrt HTML5 on this very topic]
HT: Open question: What about the flexibility in the XML spec itself? Do we want to require the 'full' well-formedness parse?