RE: XML Processing Model DSDL Use Case

From: Martin Bryan <martin.bryan@csw.co.uk>
Date: Sun, 19 Mar 2006 11:13:23 +0000
Message-ID: <989E1989F3130042879B902D17584216EF0291@msxmigration.cswgroup.local>
To: "Norman Walsh" <ndw@nwalsh.com>
Cc: <dsdl-discuss@dsdl.org>, <public-xml-processing-model-wg@w3.org>




Norm

Thanks for the comments. Some responses to your points:

>I'm not sure (speaking personally) that I think the DSDL use case is
one that we could expect to solve using only the "core component" of the
pipeline language. I think the initial box in your picture "Use NVDL to
separate namespace streams" may have to be a custom component of some
sort. And that's the component that can stitch things back together
again.

NVDL is definitely a "custom component" in W3C terms, but it's one that
creates multiple PSVI sets as input to the pipeline, and these somehow
need to be recombined in an agreed order. What we have yet to work out
is how best to do this. I'd like to see ordering of multiple streams on
the WG agenda.
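
To illustrate the ordering problem, here is a minimal Python sketch. It
is not NVDL: it assumes a flat document whose top-level children each
belong to one namespace, whereas real NVDL sections can nest. The
function names (split_by_namespace, group_streams, recombine) are
hypothetical. The point is only that each fragment must carry its
original position if separately processed streams are to be stitched
back together in an agreed order.

```python
import xml.etree.ElementTree as ET


def split_by_namespace(root):
    """Return (position, namespace, element) for each child of root, in document order."""
    sections = []
    for position, child in enumerate(root):
        # ElementTree spells qualified names as "{namespace-uri}local-name".
        if child.tag.startswith("{"):
            namespace = child.tag[1:].split("}", 1)[0]
        else:
            namespace = ""
        sections.append((position, namespace, child))
    return sections


def group_streams(sections):
    """Group the sections into one stream per namespace, keeping each position."""
    streams = {}
    for position, namespace, element in sections:
        streams.setdefault(namespace, []).append((position, element))
    return streams


def recombine(streams, root_tag):
    """Merge all streams into a new root, ordered by the recorded positions."""
    ordered = sorted(
        (item for items in streams.values() for item in items),
        key=lambda item: item[0],
    )
    merged = ET.Element(root_tag)
    for _, element in ordered:
        merged.append(element)
    return merged
```

Each stream could be validated or transformed on its own between
group_streams and recombine, provided the position stays attached to
whatever the step produces.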

>I guess the part I still find confusing is the way the arrow that comes
out of "Transform MathML to SVG" connects to the arrow that comes out of
"Check SVG subset with Schematron". I think I can imagine how arrows and
boxes are hooked together, but what's the significance of the
"arrow-to-arrow" join?

The join is just there to highlight that there are fewer output types
(two) produced by the (four) different types of input. This is a
"requirement" - you can't presume that each input type will produce its
own output type. Note that we have a similar situation for text (which
isn't shown by joining arrows): we pass the HTML through with only some
minor checks on the metadata, but we also convert DocBook elements to
HTML.

>Another problem I have with your diagram is the box labelled "Validate
character sets" on the right hand side. I had thought we were in an
XML-to-XML world and therefore character sets were no longer relevant.
Am I missing something there too?

This again is a DSDL-specific thing. We provide a language (CRVL) that
allows you to control which subsets of UTF are valid for a given element
type (so you can, for example, restrict elements with particular
xml:lang values to characters that are valid for that language). We
need to invoke this custom process between the parse and the
transformation.
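
As a rough illustration of this kind of check (not CRVL syntax), the
Python sketch below restricts element text to a set of code point ranges
chosen by the xml:lang value in scope. The ALLOWED table, its ranges and
the function names are assumptions made up for the example.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under the XML namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical repertoires: inclusive code point ranges allowed per language.
ALLOWED = {
    "en": [(0x0020, 0x007E)],                    # printable ASCII
    "el": [(0x0020, 0x007E), (0x0370, 0x03FF)],  # ASCII plus Greek and Coptic
}


def invalid_chars(text, ranges):
    """Return the characters of text that fall outside every allowed range."""
    return [ch for ch in text if not any(lo <= ord(ch) <= hi for lo, hi in ranges)]


def check_repertoire(elem, inherited_lang=None):
    """Yield (tag, lang, offending characters) for elements that break their repertoire.

    Only each element's leading text is checked; tail text is ignored in
    this sketch, and languages missing from ALLOWED are not checked at all.
    """
    lang = elem.get(XML_LANG, inherited_lang)
    ranges = ALLOWED.get(lang)
    if ranges is not None and elem.text:
        bad = invalid_chars(elem.text, ranges)
        if bad:
            yield elem.tag, lang, bad
    for child in elem:
        yield from check_repertoire(child, lang)
```

Run after the parse and before the transformation, a step like this
would report offending elements without changing the document.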

>Heh. I think I can see how to do individual "peephole" steps, but the
idea of building a dynamic set of peepholes (nested within each other in
effect, I think) of unbounded cardinality strikes me as a bit ambitious.

It is! DSDL is nothing if not ambitious :-)

Peepholing seems about right, if we can work out how to nest them in the
longer term. This is why you need a component, such as NVDL, that allows
different namespaces to be processed separately.

>You may find it difficult to convince the XProc WG that this use case
should reasonably be addressed by the V1.0 processing language using
only core components. :-)

I expect that. What I am trying to highlight at this stage is that to
meet all pipelining requirements you need to go further than considering
a single-stream processing model.

Martin
Received on Monday, 20 March 2006 10:28:34 GMT
