ACTION: Mohamed to propose serialization changes to support C14N.

From: Innovimax SARL <innovimax@gmail.com>
Date: Sat, 7 Jun 2008 11:28:25 +0200
Message-ID: <546c6c1c0806070228t6a7e148bh84dc8f833755efd@mail.gmail.com>
To: "XProc WG" <public-xml-processing-model-wg@w3.org>

Dear all,

Here is what the 8 May spec contains:

[[

5.6 p:serialization

The p:serialization element allows the user to request serialization
properties on a p:pipeline output.

<p:serialization
  port = NCName
  byte-order-mark? = boolean
  cdata-section-elements? = NMTOKENS
  doctype-public? = string
  doctype-system? = string
  encoding? = string
  escape-uri-attributes? = boolean
  include-content-type? = boolean
  indent? = boolean
  media-type? = string
  method? = QName
  normalization-form? = NFC|NFD|NFKC|NFKD|fully-normalized|none|xs:NMTOKEN
  omit-xml-declaration? = boolean
  standalone? = true|false|omit
  undeclare-prefixes? = boolean
  version? = string />

If the pipeline processor serializes the output on the specified port,
it must use the serialization options specified. If the processor is
not serializing (if, for example, the pipeline has been called from
another pipeline), then the p:serialization must be ignored. The
processor may reject statically a pipeline that requests serialization
options that it cannot provide.

The default value of any unspecified serialization option is
implementation-defined.

The semantics of the attributes on a p:serialization are described in
Section 7.3, "Serialization Options".

It is a static error (err:XS0039) if the port specified on the
p:serialization is not the name of an output port on the pipeline in
which it appears or if more than one p:serialization element is
applied to the same port.

]]

and

[[
7.3 Serialization Options

Several steps in this step library require serialization options to
control the serialization of XML. These options are used to control
serialization as in the [Serialization] specification.

The following options may be present on steps that perform serialization:

    * byte-order-mark - The value of this option must be a boolean.
    * cdata-section-elements - The value of this option must be a list
of QNames. They are interpreted as elements name.
    * doctype-public - The value of this option must be a string. The
public identifier of the doctype.
    * doctype-system - The value of this option must be an anyURI. The
system identifier of the doctype. It need not be absolute, and is not
resolved.
    * encoding - A character set name.
    * escape-uri-attributes - The value of this option must be a boolean.
    * include-content-type - The value of this option must be a boolean.
    * indent - The value of this option must be a boolean.
    * media-type - The value of this option must be a string. It
specifies the media type (MIME content type).
    * method - The value of this option must be a QName. It specifies
the serialization method.
    * normalization-form - The value of this option must be an
NMTOKEN, one of the enumerated values NFC, NFD, NFKC, NFKD,
fully-normalized, none or an implementation-defined value.
    * omit-xml-declaration - The value of this option must be a boolean.
    * standalone - The value of this option must be an NMTOKEN, one of
the enumerated values true, false, or omit.
    * undeclare-prefixes - The value of this option must be a boolean.
    * version - The value of this option must be a string.

In order to be consistent with the rest of this specification, boolean
values for the serialization parameters use "true" and "false" where
the serialization specification uses "yes" and "no". No change in
semantics is implied by this different spelling.

The method option controls the serialization method used by this
component with standard values of 'html', 'xml', 'xhtml', and 'text'
but only the 'xml' value is required to be supported. The
interpretation of the remaining options are as specified in
[Serialization].

Implementations may support other method values but their results are
implementation-defined. It is a dynamic error (err:XC0001) if the
requested method is not supported.

A minimally conforming implementation must support the xml output
method with the following option values:

    * The version must support the value 1.0.
    * The encoding must support the values UTF-8.
    * The omit-xml-declaration must be supported. If the value is not
specified or has the value no, an XML declaration must be produced.

All other option values may be ignored for the xml output method.

If a processor chooses to implement an option for serialization, it
must conform to the semantics defined in the [Serialization]
specification.
Note

The use-character-maps parameter in [Serialization] specification has
not been provided in the standard serialization options provided by
this specification.
]]




My proposal is as follows (closely following http://www.w3.org/TR/xml-c14n11/):

1) we add the option c14n - The value of this option must be an
NMTOKEN, one of the enumerated values 'false', 'with-comment',
'without-comment'

2) c14n can apply only when @version='1.0' (if @c14n!='false' and
@version!='1.0' at the same time, then @@BEHAVIOUR@@)

3) c14n activated implies that @encoding='UTF-8' (if @c14n!='false'
and @encoding!='UTF-8' at the same time, then @@BEHAVIOUR@@)

4) By 4.1 No XML Declaration (in the c14n11 spec), the canonical form
carries no XML declaration, so c14n implies that
@omit-xml-declaration='true' (if @c14n!='false' and
@omit-xml-declaration='false' at the same time, then @@BEHAVIOUR@@)

5) By 4.2 No Character Model Normalization, I'm unsure whether this
requires @normalization-form='NFC'; Henri, any thoughts on this one?



A) An activated c14n MAY generate extra dynamic errors (for example,
implementations of XML canonicalization MUST report an operation
failure on documents containing relative namespace URIs. XML
canonicalization MUST NOT be implemented with an XML parser that
converts relative URIs to absolute URIs). Caution: this behaviour is
slightly different from the one described in 7.1.2 p:add-xml-base.
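
As a rough illustration of the check in A), here is a minimal Python
sketch that scans a document's namespace declarations and reports any
namespace URI that has no scheme. The function name
relative_namespace_uris is hypothetical, and treating "no URI scheme" as
"relative" is a simplifying assumption of this sketch, not wording taken
from the c14n11 spec.

```python
import io
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit


def relative_namespace_uris(xml_text):
    """Return the namespace URIs declared in xml_text that have no scheme.

    Assumption: a namespace URI without a scheme (e.g. "rel/ns") is
    treated as relative. An empty URI (xmlns="") only undeclares the
    default namespace, so it is skipped.
    """
    relative = []
    source = io.BytesIO(xml_text.encode("utf-8"))
    # "start-ns" events yield (prefix, uri) pairs as declared in the
    # document; the parser does not resolve them against a base URI.
    for _event, (_prefix, uri) in ET.iterparse(source, events=("start-ns",)):
        if uri and not urlsplit(uri).scheme:
            relative.append(uri)
    return relative
```

A processor following A) could raise its dynamic error whenever this
list is non-empty, before attempting canonicalization.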



For @@BEHAVIOUR@@, we have two choices:
1) IGNORE MODE : @@BEHAVIOUR@@ := 'then the value of the second
attribute is ignored'
2) STATIC ERROR MODE : @@BEHAVIOUR@@ := 'then it is a static error'
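
To compare the two modes, here is a minimal Python sketch, not a
definitive implementation. It covers only the @version and @encoding
constraints from points 2) and 3). The names resolve_c14n_options,
StaticError, C14N_REQUIRED and the mode values "ignore" and "error" are
hypothetical, and option values are compared as exact strings.

```python
class StaticError(Exception):
    """Hypothetical error raised in STATIC ERROR MODE."""


# Values that an active c14n requires, per points 2) and 3).
C14N_REQUIRED = {"version": "1.0", "encoding": "UTF-8"}


def resolve_c14n_options(options, mode="ignore"):
    """Return a copy of options with c14n conflicts handled per mode.

    mode="ignore": a conflicting second attribute is ignored, i.e.
    replaced by the value c14n requires.
    mode="error": a conflicting second attribute raises StaticError.
    """
    resolved = dict(options)
    if resolved.get("c14n", "false") == "false":
        return resolved  # c14n inactive: nothing to reconcile
    for name, required in C14N_REQUIRED.items():
        value = resolved.get(name)
        if value is not None and value != required:
            if mode == "error":
                raise StaticError(f"{name}={value!r} conflicts with c14n")
            # IGNORE MODE falls through: the conflicting value is dropped.
        resolved[name] = required
    return resolved
```

For example, with {"c14n": "with-comment", "encoding": "ISO-8859-1"},
IGNORE MODE silently yields encoding UTF-8, whereas STATIC ERROR MODE
rejects the pipeline.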

I have mixed feelings about these (but maybe IGNORE MODE is less intrusive)

Mohamed
--
Innovimax SARL
Consulting, Training & XML Development
9, impasse des Orteaux
75020 Paris
Tel : +33 9 52 475787
Fax : +33 1 4356 1746
http://www.innovimax.fr
RCS Paris 488.018.631
SARL au capital de 10.000 €
Received on Saturday, 7 June 2008 09:29:02 GMT
