RE: Reading doctype-public and doctype-system from xsl:output

To validate against a DTD, you can always serialize the document to a temporary location with p:store and then read it back with the p:load step and dtd-validate="true". Not the most elegant or efficient solution, but it should work:

 

...

<p:xslt>...</p:xslt>

<p:store href="foo.xml" doctype-public="..." doctype-system="..."/>

<p:load href="foo.xml" dtd-validate="true"/>

...
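
A fuller sketch of the same idea, for illustration only. The file names transform.xsl and dtbook.xml and the DTBook identifiers are taken from the original question further down the thread; the step name "store" is made up. One detail that is easy to miss: p:load has no input ports, so nothing forces it to run after p:store. Taking the href from the c:result document that p:store emits on its "result" port creates that dependency.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                xmlns:c="http://www.w3.org/ns/xproc-step"
                version="1.0">
  <p:input port="source"/>
  <p:output port="result"/>

  <p:xslt>
    <p:input port="stylesheet">
      <p:document href="transform.xsl"/>
    </p:input>
    <p:input port="parameters">
      <p:empty/>
    </p:input>
  </p:xslt>

  <!-- Serialize with a DOCTYPE so the parser can locate the DTD on reload. -->
  <p:store name="store" href="dtbook.xml"
           doctype-public="-//NISO//DTD dtbook 2005-2//EN"
           doctype-system="http://www.daisy.org/z3986/2005/dtbook-2005-2.dtd"/>

  <!-- p:store reports the stored URI in a c:result element. Reading href
       from it makes p:load wait until the file has been written. -->
  <p:load dtd-validate="true">
    <p:with-option name="href" select="/c:result">
      <p:pipe step="store" port="result"/>
    </p:with-option>
  </p:load>
</p:declare-step>
```

If the reloaded document is not valid, p:load should fail with a dynamic error, which can be caught with p:try if the pipeline needs to carry on.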

 

Regards,

Vojtech

 

--

Vojtech Toman

Principal Software Engineer

EMC Corporation

toman_vojtech@emc.com

http://developer.emc.com/xmltech

 

From: xproc-dev-request@w3.org [mailto:xproc-dev-request@w3.org] On Behalf Of Romain Deltour
Sent: Tuesday, June 15, 2010 10:39 AM
To: xproc-dev@w3.org
Subject: Re: Reading doctype-public and doctype-system from xsl:output

 

Thanks Philip. This is a good solution to access the doctype information. Then to validate against a DTD I guess you would need either to call an external program using p:exec or to implement a custom p:validate-with-dtd step.
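
A rough sketch of the p:exec route, under several assumptions: the processor implements the optional p:exec step, xmllint is installed and on the PATH, it reads the document from standard input when given "-", and it can fetch the DTD over the network. The DTBook identifiers are the ones from the original question. The serialization options on p:exec add the DOCTYPE to what is sent to the command.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="1.0">
  <p:input port="source"/>
  <p:output port="result"/>

  <!-- xmllint exits with a non-zero status when validation fails.
       With failure-threshold="0", any exit status above 0 makes the step fail. -->
  <p:exec command="xmllint" args="--valid --noout -"
          source-is-xml="true" result-is-xml="false"
          failure-threshold="0"
          doctype-public="-//NISO//DTD dtbook 2005-2//EN"
          doctype-system="http://www.daisy.org/z3986/2005/dtbook-2005-2.dtd"/>
</p:declare-step>
```

Note that the result port then carries a c:result wrapper around the command's output, not the validated document itself.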

 

As for the 'stylesheet' port of p:xslt, as far as I understand it is not accessible because it is an input port (you cannot pipe it to other steps).

 

Romain.

On 15 June 2010 at 09:54, Philip Fennell wrote:





Romain wrote:

 

> Also, note that the xsl:output element in your XSLT defines *serialization* options,

> which means that the XML infoset flowing through your pipeline won't retain this information

 

This is very true, but, you can get access to the system and public parts of the Doctype declared for the serialization of the XSLT Transform by 'introducing' the Transform as a normal XML document and then you can reference any parts of it you like - see below.

 

However, one thing I've discovered with XProc (Calabash and Calumet implementations) is that the 'stylesheet' port of p:xslt is not accessible. In the example below I wanted to use:

 

<p:with-option name="attribute-value" select="/xsl:transform/xsl:output/@doctype-system">
  <p:pipe port="stylesheet" step="transform"/>
</p:with-option>

 

Which would have been clearer and more concise than the work-around I've had to use. Maybe in XProc 1.1 we could have access to such ports.

 

 

 

<?xml version="1.0" encoding="UTF-8"?>
<p:declare-step 
    xmlns:c="http://www.w3.org/ns/xproc-step"
    xmlns:p="http://www.w3.org/ns/xproc" 
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    exclude-inline-prefixes="c p xsl"
    name="test"
    version="1.0">
<p:input port="source">
  <p:inline>
    <doc>Hello world!</doc>
  </p:inline>
</p:input>
<p:output port="result"/>

<p:identity name="stylesheet">
  <p:input port="source">
    <p:inline>
      <xsl:transform 
          xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
          xmlns:xs="http://www.w3.org/2001/XMLSchema" 
          exclude-result-prefixes="xs"
          version="2.0">

        <xsl:output encoding="UTF-8" 
            doctype-public="-//NISO//DTD dtbook 2005-2//EN" 
            doctype-system="http://www.daisy.org/z3986/2005/dtbook-2005-2.dtd"
            indent="yes" media-type="application/xml" 
            method="xml"/>

        <xsl:template match="/*">
          <xsl:copy-of select="."/>
        </xsl:template>
      </xsl:transform>
    </p:inline>
  </p:input>
</p:identity>

<p:xslt name="transform">
  <p:input port="source">
    <p:pipe port="source" step="test"/>
  </p:input>
  <p:input port="stylesheet">
    <p:pipe port="result" step="stylesheet"/>
  </p:input>
  <p:input port="parameters">
    <p:empty/>
  </p:input>
</p:xslt>

  <p:wrap-sequence wrapper="job-bag"/>

  <p:add-attribute match="/*" attribute-name="public">
    <p:with-option name="attribute-value" select="/xsl:transform/xsl:output/@doctype-public">
      <p:pipe port="result" step="stylesheet"/>
    </p:with-option>
  </p:add-attribute>

  <p:add-attribute match="/*" attribute-name="system">
    <p:with-option name="attribute-value" select="/xsl:transform/xsl:output/@doctype-system">
      <p:pipe port="result" step="stylesheet"/>
    </p:with-option>
  </p:add-attribute>
</p:declare-step>

 

 

The result of this pipeline is:

 

<job-bag
    public="-//NISO//DTD dtbook 2005-2//EN"
    system="http://www.daisy.org/z3986/2005/dtbook-2005-2.dtd">
  <doc>Hello world!</doc>
</job-bag>
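
The same trick can feed the values straight into p:store when the stylesheet lives in an external file. This is only a sketch: transform.xsl and dtbook.xml are the file names from the original question, and the step names "main" and "stylesheet" are made up. The stylesheet is read once with p:load, then piped both to the 'stylesheet' port of p:xslt and to the p:with-option elements of p:store.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                name="main" version="1.0">
  <p:input port="source"/>

  <!-- Load the stylesheet as an ordinary document so it can be read twice. -->
  <p:load name="stylesheet" href="transform.xsl"/>

  <p:xslt>
    <p:input port="source">
      <p:pipe step="main" port="source"/>
    </p:input>
    <p:input port="stylesheet">
      <p:pipe step="stylesheet" port="result"/>
    </p:input>
    <p:input port="parameters">
      <p:empty/>
    </p:input>
  </p:xslt>

  <!-- /* matches either xsl:stylesheet or xsl:transform as the root element. -->
  <p:store href="dtbook.xml">
    <p:with-option name="doctype-public" select="/*/xsl:output/@doctype-public">
      <p:pipe step="stylesheet" port="result"/>
    </p:with-option>
    <p:with-option name="doctype-system" select="/*/xsl:output/@doctype-system">
      <p:pipe step="stylesheet" port="result"/>
    </p:with-option>
  </p:store>
</p:declare-step>
```

If the stylesheet has no xsl:output, or the attributes are missing, the options receive an empty string, so a real pipeline would probably want to check for that first.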

 

 

 

Regards

Philip Fennell
Consultant
Mark Logic Corporation

 

 

________________________________

From: xproc-dev-request@w3.org [xproc-dev-request@w3.org] On Behalf Of Romain Deltour [rdeltour@gmail.com]
Sent: 14 June 2010 21:57
To: xproc-dev@w3.org
Subject: Re: Reading doctype-public and doctype-system from xsl:output

Hi Jostein,

 

The p:validate-with-xml-schema step can only be used to validate against an XML Schema (see [1]). If you need to validate a document against a DTD you would have to use the p:load step (see [2]) with the @dtd-validate attribute. Note that it will fail if the underlying XML parser is not a validating parser.

 

Also, note that the xsl:output element in your XSLT defines *serialization* options, which means that the XML infoset flowing through your pipeline won't retain this information (it doesn't matter if the XSLT asks for the output to be serialized as UTF-8, or indented, etc).

 

Hope this helps,

Romain.

 

[1] http://www.w3.org/TR/xproc/#c.validate-with-xml-schema

[2] http://www.w3.org/TR/xproc/#c.load

 

On 14 June 2010 at 16:03, Jostein Austvik Jacobsen wrote:





Hi.

How can I use the DTD named in the doctype, where the doctype is defined in an xsl:output in the stylesheet of a p:xslt step, to validate and p:store the resulting XML? Here's some code to demonstrate my problem:

---- begin input.xml ----
<?xml version="1.0" encoding="windows-1252"?>
<!DOCTYPE document SYSTEM "http://www.idunn.no/dtd/document1.5_Idunn.dtd">
<document>
      <metaData>
            <title>Hello XProc!</title>
            <logicalTitle>ht-2009-4-1</logicalTitle>
            <description/>
            <language>no_NO</language>
            <commenting>off</commenting>
            <indexing>on</indexing>
            <subject/>
            <categoryPrimary/>
            <author>
                  <firstName/>
                  <lastName/>
            </author>
            <contentType>content_document_editorial</contentType>
            <colleague>false</colleague>
      </metaData>
      <contributors/>
      <contentSection>
            <para>Hello XProc!</para>
      </contentSection>
</document>
---- end input.xml ----


---- begin transform.xsl ----
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:mf="http://example.com/2009/mf" exclude-result-prefixes="xs mf" xmlns="http://www.daisy.org/z3986/2005/dtbook/">
    
    <!--
        Here I specify the doctype, and I would assume that it carried through to the XProc script, however that seems not to be the case. 
    -->
    <xsl:output method="xml" version="1.0" doctype-public="-//NISO//DTD dtbook 2005-2//EN" doctype-system="http://www.daisy.org/z3986/2005/dtbook-2005-2.dtd" encoding="utf-8"/>
    
    <xsl:template match="document">
        <dtbook xmlns="http://www.daisy.org/z3986/2005/dtbook/" version="2005-2" xml:lang="no">
            <head/>
            <book showin="blp">
                <xsl:apply-templates select="contentSection"/>
            </book>
        </dtbook>
    </xsl:template>
    <xsl:template match="contentSection">
        <bodymatter>
            <level1>
                <h1><xsl:value-of select="/document/metaData[1]/title[1]"></xsl:value-of></h1>
                <xsl:apply-templates/>
            </level1>
        </bodymatter>
    </xsl:template>
    <xsl:template match="para">
        <p><xsl:apply-templates/></p>
    </xsl:template>
</xsl:stylesheet>

---- end transform.xsl ----


---- begin pipe.xpl ----
<?xml version="1.0" encoding="UTF-8"?>
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" xmlns:c="http://www.w3.org/ns/xproc-step" version="1.0">
    
    <p:input port="source" primary="true">
        <p:document href="input.xml"/>        
    </p:input>
    
    <p:xslt version="1.0">
        <p:input port="parameters">
            <p:empty/>
        </p:input>
        <p:input port="stylesheet">
            <p:document href="transform.xsl"></p:document>
        </p:input>
    </p:xslt>
    
    <!--  How would I validate the XML here? I would prefer to use the DTD specified
        in the output from the XSLT transformation instead of explicitly setting
        a URL to it: -->
    <!-- <p:validate-with-xml-schema/> -->
    
    <p:store href="dtbook.xml">
        <!--
            If I don't specify the doctype as options here, then the resulting
            document doesn't get a <!DOCTYPE ...>. I would like to pass through
            the doctype from the XSLT-transformation here.        
        -->
        <p:with-option name="encoding" select="'UTF-8'"/>
        <p:with-option name="doctype-public" select="'-//NISO//DTD dtbook 2005-2//EN'"/>
        <p:with-option name="doctype-system" select="'http://www.daisy.org/z3986/2005/dtbook-2005-2.dtd'"/>
    </p:store>
    
</p:declare-step>
---- end pipe.xpl ----


---- begin dtbook.xml ----
<!DOCTYPE dtbook PUBLIC "-//NISO//DTD dtbook 2005-2//EN" "http://www.daisy.org/z3986/2005/dtbook-2005-2.dtd">
<dtbook xmlns="http://www.daisy.org/z3986/2005/dtbook/" version="2005-2" xml:lang="no">
  <head/>
  <book showin="blp">
    <bodymatter>
      <level1>
        <h1>Hello XProc!</h1>
        <p>Hello XProc!</p>
      </level1>
    </bodymatter>
  </book>
</dtbook>
---- end dtbook.xml ----


The pipe.xpl script takes input.xml as input, applies transform.xsl to it, and stores the result as dtbook.xml. The code as shown should be valid and run without errors, producing dtbook.xml (see the comments in the code). I'm using oXygen/Calabash.


Best regards
Jostein Austvik Jacobsen

 

 

 

Received on Tuesday, 15 June 2010 08:53:51 UTC