Base URI processing

From: Jeni Tennison <jeni@jenitennison.com>
Date: Mon, 11 Jun 2007 15:17:32 +0100
Message-ID: <466D597C.5090701@jenitennison.com>
To: public-xml-processing-model-wg <public-xml-processing-model-wg@w3.org>

My current project requires a pipeline like this:

<p:pipeline>
   <p:input port="source" sequence="yes" />
   <p:for-each>
     <p:xslt1>
       <p:input port="stylesheet">
         <p:document href="WordML-to-XHTML.xsl" />
       </p:input>
     </p:xslt1>
     <p:xslt1>
       <p:input port="stylesheet">
         <p:document href="relative-CSS.xsl" />
       </p:input>
     </p:xslt1>
     <p:store>
       <p:option name="href" select="???" />
       <p:option name="method" value="xhtml" />
     </p:store>
   </p:for-each>
</p:pipeline>

What I want to do is use the base URI of each source document to 
determine the base URI of its result document; in this case, I want each 
result to be saved in the same directory as its source, with the same 
filename but with a '.htm' extension rather than a '.xml' extension.
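
As a rough illustration of that mapping, here is a minimal Python sketch. The function name result_uri and the example URIs are hypothetical, not part of any XProc draft; it only assumes that the source base URI is known as a string.

```python
import posixpath
from urllib.parse import urlsplit, urlunsplit


def result_uri(source_base_uri: str, new_ext: str = ".htm") -> str:
    """Keep the directory and filename, swap only the final extension.

    Hypothetical helper: only the path component is touched, so any
    scheme, host, query, or fragment is carried over unchanged.
    """
    parts = urlsplit(source_base_uri)
    # splitext() splits at the last dot of the last path segment.
    stem, _old_ext = posixpath.splitext(parts.path)
    return urlunsplit(parts._replace(path=stem + new_ext))


print(result_uri("http://example.org/docs/chapter1.xml"))
```

Working on the parsed path rather than on the raw string means that a '.xml' occurring earlier in the URI (in a directory name, say) is left alone.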

I think that we need to say what the base URI of the result of each step 
is. For some of them, we might want to provide an option that tells the 
step what the base URI of the results should be. For example, the 
<p:xslt1> step could have an 'output-base-uri' option, just like the 
<p:xslt2> step does.

At the XProc level, we need to say what the base URI of the document 
generated by a <p:viewport> is. Is it the same as the base URI of the 
source document?

None of this helps, though, when the pipeline has no way to find out 
what the base URI of a document is in the first place. I think we should 
have a step that annotates a document with suitable xml:base attributes:

<p:declare-step type="p:add-xml-base-attributes">
   <p:input port="source" />
   <p:output port="result" />
   <p:option name="all-elements" value="no" />
</p:declare-step>

The result of this step is a document that's identical to the source 
document, except that it adds xml:base attributes to elements. If 
'all-elements' is 'no' (the default), it only adds xml:base to the 
document element and to every element whose base URI is different from 
that of its parent. If 'all-elements' is 'yes', it adds xml:base to all 
the elements in the document.
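
To make the intended behaviour concrete, here is a minimal Python sketch of the annotation using xml.etree.ElementTree. It is an approximation under stated assumptions: the function name add_xml_base and its parameters are hypothetical, the document's own URI is passed in by the caller, and the only source of differing base URIs it can see is existing xml:base attributes (ElementTree does not expose base URI changes caused by external entities).

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

# ElementTree's expanded name for the xml:base attribute.
XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"


def add_xml_base(root, document_uri, all_elements=False):
    """Annotate the tree in place with absolute xml:base attributes.

    Hypothetical helper mirroring the proposed 'all-elements' option.
    """

    def visit(elem, parent_base, is_document_element):
        # An existing xml:base (possibly relative) is resolved against
        # the parent's base URI; with no attribute the base is inherited.
        base = urljoin(parent_base, elem.get(XML_BASE, ""))
        if all_elements or is_document_element or base != parent_base:
            elem.set(XML_BASE, base)
        for child in elem:
            visit(child, base, False)

    visit(root, document_uri, True)
    return root


doc = ET.fromstring('<doc><part xml:base="sub/part.xml"><p/></part><p/></doc>')
add_xml_base(doc, "http://example.org/a/doc.xml")
print(doc.get(XML_BASE))     # base URI of the document element
print(doc[0].get(XML_BASE))  # base URI of <part>, now absolute
```

With all_elements left at False, only <doc> and <part> carry xml:base afterwards; passing all_elements=True also annotates both <p> elements.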

With this step, I can do:

<p:pipeline>
   <p:input port="source" sequence="yes" />
   <p:for-each>
     <p:add-xml-base-attributes name="add-xml-base" />
     <p:xslt1>
       <p:input port="stylesheet">
         <p:document href="WordML-to-XHTML.xsl" />
       </p:input>
     </p:xslt1>
     <p:xslt1>
       <p:input port="stylesheet">
         <p:document href="relative-CSS.xsl" />
       </p:input>
     </p:xslt1>
     <p:store>
       <p:option name="href"
         select="concat(substring-before(/*/@xml:base, '.xml'), '.htm')">
         <p:pipe step="add-xml-base" port="result" />
       </p:option>
       <p:option name="method" value="xhtml" />
     </p:store>
   </p:for-each>
</p:pipeline>
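
For anyone checking what the select expression on the href option evaluates to, here is a small Python emulation. The function names substring_before and store_href are hypothetical; the only thing relied on is the XPath 1.0 rule that substring-before() cuts at the first occurrence of its second argument and returns the empty string when there is no match.

```python
def substring_before(value: str, marker: str) -> str:
    """Emulate XPath 1.0 substring-before()."""
    if not marker:
        return ""
    head, sep, _tail = value.partition(marker)
    # partition() splits at the FIRST occurrence; sep is '' if absent.
    return head if sep else ""


def store_href(xml_base: str) -> str:
    # Mirrors concat(substring-before(/*/@xml:base, '.xml'), '.htm').
    return substring_before(xml_base, ".xml") + ".htm"


print(store_href("file:///docs/chapter1.xml"))        # the intended case
print(store_href("file:///data.xml.d/chapter1.xml"))  # cut at the first '.xml'
print(store_href("file:///docs/chapter1.XML"))        # no match at all
```

The expression does what is wanted for the usual case, but a '.xml' earlier in the URI, or an extension in a different case, gives a different href, so it is best read as an illustration rather than a robust recipe.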


Another step that I remember Henry mentioning at the F2F many moons ago 
was a 'make-uris-absolute' step like this:

<p:declare-step type="p:make-uris-absolute">
   <p:input port="source" />
   <p:output port="result" />
   <p:option name="match" required="yes" />
</p:declare-step>

The result of this step is a document that's identical to the source 
document, except for the attributes and elements matched by the match 
pattern. Any attribute or element matched by the match pattern is 
interpreted as a URI, and is turned into an absolute URI based on the 
base URI of its parent element (if the matched node is an attribute) or 
the element itself (if the matched node is an element). If the matched 
node is an element, the string value of the element is used as the 
relative URI, and the only content of the copy of the element will be a 
single text node child with the value of the absolute URI.
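
Here is a minimal Python sketch of those semantics, again with xml.etree.ElementTree. It simplifies the proposal in one important way: instead of a match pattern, it takes plain lists of attribute names and element names (the hypothetical parameters attr_names and elem_names), since ElementTree has no XSLT pattern matching. The function name and the sample document are likewise made up, and base URIs are derived only from xml:base attributes plus a caller-supplied document URI.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"


def make_uris_absolute(root, document_uri, attr_names=(), elem_names=()):
    """Rewrite selected attributes and elements in place as absolute URIs."""

    def visit(elem, parent_base):
        base = urljoin(parent_base, elem.get(XML_BASE, ""))
        # A matched attribute is resolved against the base URI of the
        # element that carries it.
        for name in attr_names:
            if name in elem.attrib:
                elem.set(name, urljoin(base, elem.get(name)))
        if elem.tag in elem_names:
            # A matched element: its string value is the relative URI,
            # and its content becomes a single text node.
            relative = "".join(elem.itertext())
            del elem[:]
            elem.text = urljoin(base, relative)
        else:
            for child in elem:
                visit(child, base)

    visit(root, document_uri)
    return root


doc = ET.fromstring(
    '<doc xml:base="http://example.org/a/">'
    '<link href="b.html"/>'
    '<sec xml:base="s/"><uri>c.<b>xml</b></uri></sec>'
    '</doc>'
)
make_uris_absolute(doc, "file:///unused.xml", attr_names=("href",), elem_names=("uri",))
print(doc[0].get("href"))  # http://example.org/a/b.html
print(doc[1][0].text)      # http://example.org/a/s/c.xml
```

The sketch does not trim whitespace from the values it resolves; whether the step should do so is not something the description above settles.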

Cheers,

Jeni
-- 
Jeni Tennison
http://www.jenitennison.com
Received on Monday, 11 June 2007 14:17:45 GMT
