Changes for Circular and Re-entrant Imports from Alex Milowski on 2008-01-24 (public-xml-processing-model-wg@w3.org from January 2008)

From: Alex Milowski <alex@milowski.org>
Date: Wed, 23 Jan 2008 16:01:02 -0800
To: "XProc WG" <public-xml-processing-model-wg@w3.org>
Message-ID: <28d56ece0801231601t1fda0640m8ca09092ea3b183b@mail.gmail.com>
Here are my proposed changes:

In section "5.10 p:import", after the text starting with "If the
value is not recognized...", add:

   "Attempts to retrieve the library identified by the URI value may be
   redirected at the parser level (for example, in an entity resolver)
   or below (at the protocol level, for example, via an HTTP Location:
   header). In the absence of additional information outside the scope
   of this specification within the resource, the base URI of the library
   is always the URI of the actual resource returned. In other words, it
   is the URI of the resource retrieved after all redirection has occurred.

   As imports are processed, a processor may encounter new p:import
   elements whose library URI is the same as one it has already
   processed in some other context.  A circular import chain or re-entrant
   import is not an error and implementations must take the necessary
   steps to avoid infinite loops and/or incorrect notification of duplicate
   step definitions.  An example of such steps is given in Appendix G.

   A library is considered the same library if the URI of the resource
   retrieved is the same.  If a pipeline or library author uses two different
   URI values that resolve to the same resource, those values must be
   considered to identify the same imported library."
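
As a rough illustration of the identity rule above (not part of the
proposed spec text), the following Python sketch compares libraries by
the URI reached after all redirection. The redirect table, the example
URIs, and the function names are hypothetical stand-ins for whatever an
entity resolver or HTTP layer would actually do.

```python
# Hypothetical redirect table standing in for an entity resolver or HTTP
# redirects; none of these URIs come from the specification.
REDIRECTS = {
    "http://example.org/lib/latest.xpl": "http://example.org/lib/v2.xpl",
    "http://mirror.example.org/v2.xpl": "http://example.org/lib/v2.xpl",
}


def final_uri(uri, redirects=REDIRECTS, max_hops=10):
    """Follow redirects until a URI with no further redirect is reached."""
    hops = 0
    while uri in redirects:
        hops += 1
        if hops > max_hops:
            raise ValueError("too many redirects for " + uri)
        uri = redirects[uri]
    return uri


def same_library(uri_a, uri_b):
    """Two import URIs name the same library if they retrieve the same resource."""
    return final_uri(uri_a) == final_uri(uri_b)
```

Here two different import URIs that both end up at the same resource
compare as the same library, which is the behaviour the proposed text
asks for.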


Appendix G: Handling Circular and Re-entrant Library Imports

An implementation should be able to detect the following situations:

1. Circular Imports:  A imports B, B imports A.

2. Re-entrant Imports: A imports B and C, and C imports B.

To accomplish this, an implementation can use the following strategy:

1. For a pipeline or library, process all the p:import elements and
    record the URI of each resource returned.  If the same resource URI
    is encountered more than once, do not load and process that resource
    after the first time.

2. For each resource, determine the exported step types that are defined
    within the resource itself, and not by an import, using the
    following rules:

    *  The pipeline name if the resource is a pipeline
    *  Any p:declare-step element within a library.

3. Associate with every resource URI a list of the step types declared
   within it and a list of the "top level" imported resource URI values
   (i.e. the base URI of the resolved resource for each p:import element
   in the library or pipeline).

   This is effectively two maps:

       D: {uri} -> {set of step declarations}
       I: {uri} -> {set of uris}

4. For any resource, the set of new types is given by:

    N: ({uri}, {set of uris}) -> ({set of step declarations}, {set of uris})

    N(U,P) :=
       let R := D(U)
       for each k in I(U)
            if k not in P
                P := union of (P, {k})
                let r := N(k,P)
                R := union of (R, r[0])
                P := r[1]
       return (R,P)

    If the step declarations returned by N(U,P) contain duplicates, then
    the same step name is declared in different resources (which is an
    error).

5. For any resource, the set of types is:

    S: {uri} -> {set of step declarations}

    S(U) := N(U,{})[0]

    If S(U) contains any duplicates, then the same step name is declared
    in different resources (which is an error).

An implementation can use the maps N and S as necessary to resolve imports.
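
The strategy above can be sketched in Python roughly as follows. This is
only one possible reading, under stated assumptions: the RESOURCES table
is a hypothetical in-memory stand-in for fetching and parsing real
documents, the keys are taken to be final (post-redirect) URIs, and each
declaration is tagged with the URI of the resource that declares it so
that "duplicates" can be detected in a set. The helper names load_all and
duplicate_names are not part of the proposal.

```python
# Hypothetical in-memory "web": URI -> (step types declared locally, imported URIs).
# It covers both situations: A and B import each other (circular), and
# A imports B and C while C also imports B (re-entrant).
RESOURCES = {
    "A": ({"a:step"}, ["B", "C"]),
    "B": ({"b:step"}, ["A"]),
    "C": ({"c:step"}, ["B"]),
}


def load_all(root, resources=RESOURCES):
    """Steps 1-3: load each resource once and build the maps D and I."""
    D, I = {}, {}
    pending = [root]
    while pending:
        uri = pending.pop()
        if uri in D:            # already loaded: do not process it again
            continue
        declared, imports = resources[uri]
        # Assumption: tag each declaration with its resource so that the same
        # name declared in two different resources stays distinguishable.
        D[uri] = {(name, uri) for name in declared}
        I[uri] = list(imports)
        pending.extend(imports)
    return D, I


def N(U, P, D, I):
    """Step 4: declarations reachable from U, skipping URIs already in P."""
    R = set(D[U])
    for k in I[U]:
        if k not in P:
            P = P | {k}
            r = N(k, P, D, I)
            R = R | r[0]
            P = r[1]
    return R, P


def S(U, D, I):
    """Step 5: the set of step declarations visible from U."""
    return N(U, frozenset(), D, I)[0]


def duplicate_names(decls):
    """Step names declared in more than one resource (an error)."""
    owners = {}
    for name, uri in decls:
        owners.setdefault(name, set()).add(uri)
    return {name for name, uris in owners.items() if len(uris) > 1}
```

With the sample table, `D, I = load_all("A")` loads each of A, B, and C
exactly once, `S("A", D, I)` terminates despite the cycle between A and
B, and `duplicate_names(S("A", D, I))` is empty because no step name is
declared in two different resources.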

-- 
--Alex Milowski
"The excellence of grammar as a guide is proportional to the paucity of the
inflexions, i.e. to the degree of analysis effected by the language
considered."

Bertrand Russell in a footnote of Principles of Mathematics
Received on Thursday, 24 January 2008 00:01:18 UTC