- From: Henry S. Thompson <ht@inf.ed.ac.uk>
- Date: Thu, 15 Nov 2007 14:33:46 +0000
- To: "Alex Milowski" <alex@milowski.org>
- Cc: "XProc WG" <public-xml-processing-model-wg@w3.org>
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Alex Milowski writes: > [about circular imports] We discussed this at the f2f, but that discussion doesn't seem to have made it into the minutes. I'll try to reconstruct the salient points. 1) Not only circularity, but re-entrancy, is relevant. That is, not just A imports B imports . . . A, but also A imports B and C, B and C both import D. 2) Although multiple passes are required, the number is finite, i.e. each pass is for a distinct kind of processing, and each pass is linear in the total number of documents involved. Details below. > I propose we add the following to handle this: > > * In section 5.10 p:import, we need to define the resolved > location based on the 'href' attribute. The 'href' attribute must > be resolved against the base URI of the p:import element. The > resulting URI is the "resolved location" of the import. Hmm, not quite -- I think the "resolved location" is whatever your URI library comes back with as the location, so that redirects and conneg are allowed for. > * When loading a pipeline, a processor must keep track of the > resolved location of imports (e.g. a simple stack of URI values) and > detect circular imports by literal comparison of URI values as > strings. If two URI values differ by encoding, the processor must > treat them as different imports. A stack isn't enough, that won't catch the re-entrant case. > * When a circular import is detected, the processor must provide the > same step declarations from the original, earlier import. > > What this means is that in the case of a circular import, a > processor must determine the step declarations provided by a > pipeline or library before completely loading the importing > construct. The good news is a it suffices to provide references to > the correct step declaration (e.g. one from the earlier import) and > then, once all parts are loaded successfully, the complete step can > be determined. So I think we agreed that this has to be expanded and changed a bit, along the following lines: 1) Do a 'minimal load' of every pipeline and pipeline library document, by processing all p:imports, recording their resolved locations, and declining to load anything twice; 2) Determine the type names *defined for export* by each top-level scope == top-level pipeline or pipeline library == entry in the resolved location table and every, as follows (see [1]): a) The pipeline name itself, if the scope is a pipeline; b) The step types and pipelines defined therein, if the scope is a library. 3) Associate with every pipeline and library the resolved location of all its imports, and from any of those which are libraries, the resolved locations of all _their_ top-level imports, recursively. That is, the transitive closure of p:import, _not_ descending into p:pipeline. 4) Determine the type names visible within each library, as follows: a) The step types and pipelines defined locally within the library; c) The types _defined for export_ by all the resolved locations identified for the library in (3) above. 5) Determine the type names visible (in addition to the builtins) in each pipeline, as follows (see [1]): a) The pipeline name itself; b) The step types defined locally within the pipeline; c) The types _defined for export_ by all the resolved locations identified for the pipeline in (3) above; d) The type names visible within its enclosing library, if it's in one. This does more work than strictly necessary, and is liable to optimisation in the absence of circularity, but it's simpler to state w/o optimisation. On balance, I think this is too detailed to fit well in the spec. itself. I think what we need in the spec., pbly in section 3.2, is [1] and something like this: Circular and re-entrant chains of p:import are not errors, and implementations MUST take the necessary steps, based on testing string equality of the actual URIs involved (that is, after absolutization and redirection, if any), to avoid infinite loops or spurious duplicate-definition errors. Appendix XXX provides high-level description of one algorithm which does this. and then put the above 5-step story in (non-normative) Appendix XXX. ht [1] http://lists.w3.org/Archives/Public/public-xml-processing-model-wg/2007Oct/0049.html - -- Henry S. Thompson, HCRC Language Technology Group, University of Edinburgh Half-time member of W3C Team 2 Buccleuch Place, Edinburgh EH8 9LW, SCOTLAND -- (44) 131 650-4440 Fax: (44) 131 650-4587, e-mail: ht@inf.ed.ac.uk URL: http://www.ltg.ed.ac.uk/~ht/ [mail really from me _always_ has this .sig -- mail without it is forged spam] -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.6 (GNU/Linux) iD8DBQFHPFjKkjnJixAXWBoRAojlAJ9RVzBVQTS/ZYT7aCTjQrgYCgJepQCfeGKP 9UOi3A2O395hTGylZIBewGc= =Bv5G -----END PGP SIGNATURE-----
Received on Thursday, 15 November 2007 14:33:58 UTC