Towards a default processing model for XML
1. Extracts from XProc minutes
- Richard: Use cases would be a
good place to start. I've long imagined that one such use case
is to answer the question "what does a web browser do with an
XML document?"
- Richard: In XML Core yesterday,
when we were talking about when xml:id processing occurs,
that's the sort of thing that I thought this model might help
us describe.
- Paul: So things like when
XInclude processing occurs by default.
- Paul: So, the default processing
model would define some default processing that you do on a
document and you end up with an infoset and that infoset is
special. It's the more official or default infoset. And because
that's the more official one, that's the one that establishes
"meaning".
- Henry: The relatively neutral
term that the TAG uses for this is the elaborated
infoset.
- Henry: Murray Maloney raised an objection to GRDDL going forward
because it didn't answer the question of whether it operated on
the pre-XIncluded document or the post-XIncluded
document.
- Henry: One way to think about this is that defining the elaborated
infoset would allow specs to say, other things being equal,
start here.
- PG: Surprised to see you mention
XInclude, XML Sig, XML Encryption, but not xml:id and
xml:base. HST: You're right, an oversight, consider them added.
- Henry: For a long time I have wanted to include decryption and
signature checking, because I think the
world would be a better place if use of the XML security technologies was much
more widespread. But I've finally given up: of necessity, decryption and
signature verification involve out-of-band appeal to key files and passphrases.
Without those, the data just isn't secure. And you may need more than one set
of them for a given document. This just doesn't fit well with the notion of a
default processing model which is pervasive, simple, and often unattended.
Good news: Without these, since XInclude is itself recursively specified,
we don't have to implement fixed-point detection for the default XML
processing model (DXPM) in XProc. So maybe we write an XProc pipeline which
implements the DXPM.
- Straw man: An XProc pipeline consisting of an XInclude
step (modulo some uncertainties wrt xml:id).
- There's a chicken-and-egg problem. Imagine two stages: we publish a DXPM spec; then we publish a
new edition of XInclude which references the new DXPM
spec. We won't get everything we want until the second stage [because
XInclude as written doesn't support xml:id].
- [Would 3023-bis solve this problem for us?]
- PG: Do we have to worry about
schemas? HT: Yes, because of the way we
wrote XPointer wrt IDness.
- HT: External entities. PG: They get expanded, don't
they? HT: Not by all the browsers [there's a thread running wrt HTML5 on
this very topic]
- HT: Open question: What about the flexibility in the XML
spec itself? Do we want to require the 'full' well-formedness
parse?
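
The straw man above (a pipeline consisting of a single XInclude step) can be roughly illustrated outside XProc. What follows is only a sketch, not the DXPM itself: it uses Python's standard-library xml.etree.ElementInclude module, which supports a limited subset of XInclude and does no xml:id or xml:base processing. The document names (book.xml, chapter.xml, para.xml), the DOCS table, and the elaborate() helper are hypothetical and exist only for this illustration. It also assumes Python 3.9 or later, where nested inclusions are expanded recursively by a single call, which mirrors the point that no separate fixed-point loop is needed.

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

# Hypothetical in-memory documents standing in for files or web resources.
DOCS = {
    "book.xml": (
        '<book xmlns:xi="http://www.w3.org/2001/XInclude">'
        '<xi:include href="chapter.xml"/>'
        '</book>'
    ),
    "chapter.xml": (
        '<chapter xmlns:xi="http://www.w3.org/2001/XInclude">'
        '<xi:include href="para.xml"/>'
        '</chapter>'
    ),
    "para.xml": '<para>Hello</para>',
}


def loader(href, parse, encoding=None):
    """Resolve an href against DOCS instead of the file system."""
    text = DOCS[href]
    if parse == "xml":
        return ET.fromstring(text)
    return text  # parse="text": return the raw string


def elaborate(name):
    """Parse the named document and expand its xi:include elements in place."""
    root = ET.fromstring(DOCS[name])
    # One call; on Python 3.9+ included documents are themselves expanded.
    ElementInclude.include(root, loader=loader)
    return root


result = elaborate("book.xml")
serialized = ET.tostring(result, encoding="unicode")
print(serialized)
```

Here the included chapter itself contains an inclusion, and the single include() call resolves both levels. A real default processing model would still have to settle the open questions above (xml:id, external entities, schema-determined IDness), none of which this sketch addresses.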