XML Processing use cases/scenarios

Hi,

Here are some use cases/scenarios on XML processing, to be discussed
next Thursday.

Until then,


Rui

----------------------------


Use Case: Document Production Framework (DPF)
---------------------------------------------

Processing sequence:

1. An initial non-XML document is fed to the DPF;
2. A special-purpose tagging task is applied, generating
    a well-structured XML document (i.e. sectioning);
3. A table of contents (ToC) is extracted from the document;
    anchors and links are created to specify the document navigation;
4. Pagination is performed, splitting the document
    structure through some configuration (e.g., 1 sub-section
    per page, 3 paragraphs per page, etc.), updating the ToC
    at each pagination step;
5. Each page is transformed into some output language (e.g.,
    XHTML), with the ToC re-integrated.


            non-XML document
                    |
                (tagging)
                    |
                    |   toc_extract.xsl
                    |          |
                    |          V
            XML document -> (XSLT) -> table of contents
                    |                        ^    |
                    V                        |    |
  paginate.xsl -> (XSLT) <-------------------     |
                    |                             |
                    V                             |
            page 1 ... page n                     |
                    |                             |
                    V                             |
                (for-each)                        |
                    |                             |
                    V                             |
doc2xhtml.xsl -> (XSLT) <------------------------
                    |
                    V
               XHTML page
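
To make the sequence above concrete for discussion, here is a minimal
Python sketch of the five steps. It is only an illustration under
assumptions: plain functions stand in for the three stylesheets
(toc_extract.xsl, paginate.xsl, doc2xhtml.xsl), the tagger is a
hypothetical one that treats blank-line-separated blocks as sections,
and the element names (doc, section, title, para) and page file names
(pageN.html) are invented for the example.

```python
import xml.etree.ElementTree as ET


def tag(text):
    """Step 2 (hypothetical tagger): each blank-line-separated block
    becomes a section; its first line is the title, the rest the body."""
    doc = ET.Element("doc")
    for i, block in enumerate(text.strip().split("\n\n"), start=1):
        title, _, body = block.partition("\n")
        sec = ET.SubElement(doc, "section", id=f"s{i}")
        ET.SubElement(sec, "title").text = title.strip()
        ET.SubElement(sec, "para").text = body.strip()
    return doc


def extract_toc(doc):
    """Step 3: the ToC as a list of (anchor id, title) pairs."""
    return [(s.get("id"), s.findtext("title")) for s in doc.findall("section")]


def paginate(doc, per_page=1):
    """Step 4: split the sections into pages of `per_page` sections."""
    sections = doc.findall("section")
    return [sections[i:i + per_page] for i in range(0, len(sections), per_page)]


def locate(pages):
    """Step 4 (ToC update): map each anchor id to its page number."""
    return {s.get("id"): n for n, page in enumerate(pages, start=1) for s in page}


def to_xhtml(page, toc, locations):
    """Step 5: render one page with the ToC re-integrated as links."""
    html = ET.Element("html")
    body = ET.SubElement(html, "body")
    nav = ET.SubElement(body, "ul")
    for anchor, title in toc:
        item = ET.SubElement(nav, "li")
        link = ET.SubElement(item, "a", href=f"page{locations[anchor]}.html#{anchor}")
        link.text = title
    for sec in page:
        ET.SubElement(body, "h1", id=sec.get("id")).text = sec.findtext("title")
        ET.SubElement(body, "p").text = sec.findtext("para")
    return ET.tostring(html, encoding="unicode")


def run(text, per_page=1):
    """Steps 1 to 5 chained: non-XML text in, list of XHTML pages out."""
    doc = tag(text)
    toc = extract_toc(doc)
    pages = paginate(doc, per_page)
    locations = locate(pages)
    return [to_xhtml(page, toc, locations) for page in pages]
```

Note that the ToC feeds both the pagination bookkeeping and the final
rendering, which is the same fan-out the diagram shows.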



Open issues:

1. Pagination "garbage": the main output of the XSLT pagination
    step is left unused, since all pages are created through
    xsl:result-document instructions
2. How to specify arbitrary outputs (e.g., pagination): regexps?
3. Integration of non-XML sources (<generate /> in Cocoon)
4. Referral of secondary input documents (document() function
    in XSLT vs. XInclude pre-processing)
5. Application-specific tasks (e.g., tagging)





Scenario: Pipeline Reuse and Composition
----------------------------------------

Given a pipeline, how can it be reused without specifying it
over and over? With this feature, pipelines can be seen as
high-level composition blocks: a caller only sees a pipeline's
input and output, not its internal steps. This opacity can
support service-oriented pipelines.
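
As a rough illustration of the idea (not a proposal for any syntax),
the Python sketch below treats a pipeline as a value that can itself be
used as a step of another pipeline. The pipeline() helper and the three
string-based steps are hypothetical stand-ins for real XML processors.

```python
from functools import reduce


def pipeline(*steps):
    """Bundle steps into one callable. The result has the same shape as
    a single step, so it can be nested inside another pipeline."""
    def run(document):
        return reduce(lambda doc, step: step(doc), steps, document)
    return run


# Hypothetical steps; strings stand in for XML documents.
def strip_ws(doc):
    return doc.strip()


def wrap_para(doc):
    return f"<p>{doc}</p>"


def wrap_body(doc):
    return f"<body>{doc}</body>"


# Defined once, then reused by name as an opaque block.
normalise = pipeline(strip_ws, wrap_para)
publish = pipeline(normalise, wrap_body)

result = publish("  hello  ")
```

Here publish does not know how normalise is built, which is the kind of
opacity described above.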


Open issues:

1. Pipeline referencing:
     - different files referenced explicitly?
     - XInclude pre-processing?
2. Expected behaviour between pipelines (i.e., input-output
    interaction)



Use Case: Non-Linear Pipelines
------------------------------

          initial.xml
              |
              V
ss1.xsl -> (XSLT) ---> b.xml
              |           |
              V           |
            c.xml         |
              |           |
              V           |
ss2.xsl -> (XSLT)         |
              |           |
              V           |
            d.xml         |
              |           |
              V           |
ss3.xsl -> (XSLT) <------
              |
              V
          final.xml
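
One possible reading of the diagram, sketched in Python: each step
declares the documents it reads and writes, and the execution order is
derived from those declarations instead of being written as a linear
list. The lambdas are placeholders for ss1.xsl, ss2.xsl and ss3.xsl,
and the declaration format is an assumption made for this example. It
relies on graphlib, which needs Python 3.9 or later.

```python
from graphlib import TopologicalSorter

# Each step: (function, input document names, output document names).
# The functions are placeholders that only record what they consumed.
STEPS = {
    "ss1": (lambda initial: (f"c[{initial}]", f"b[{initial}]"),
            ["initial.xml"], ["c.xml", "b.xml"]),
    "ss2": (lambda c: (f"d[{c}]",),
            ["c.xml"], ["d.xml"]),
    "ss3": (lambda d, b: (f"final[{d}+{b}]",),
            ["d.xml", "b.xml"], ["final.xml"]),
}


def run(steps, documents):
    """Run the steps in dependency order and return all documents."""
    docs = dict(documents)
    # Which step produces each document.
    producer = {out: name for name, (_, _, outs) in steps.items() for out in outs}
    # For each step, the set of steps that must run before it.
    graph = {
        name: {producer[i] for i in ins if i in producer}
        for name, (_, ins, _) in steps.items()
    }
    order = list(TopologicalSorter(graph).static_order())
    for name in order:
        func, ins, outs = steps[name]
        results = func(*(docs[i] for i in ins))
        docs.update(zip(outs, results))
    return docs, order


docs, order = run(STEPS, {"initial.xml": "x"})
```

In this reading b.xml simply stays available until ss3 asks for it, so
the non-linearity is implicit in the input and output declarations.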


Open issues:

1. How to specify non-linearity?
     - Pipeline composition?
     - Implicit processor interaction?

-- 
Rui Lopes <rlopes@di.fc.ul.pt>                          Work: +351 21 750 05 32
Junior Researcher                                       Fax:  +351 21 750 05 33
Faculdade de Ciências - Universidade de Lisboa
HCIM/LaSIGE - Department of Informatics
Campo Grande, Bloco C6 - Sala 6.3.29
Lisboa, Portugal

Received on Wednesday, 21 December 2005 10:14:33 UTC