Re: General remarks on Annex A (very big mail, sorry for that) from Innovimax SARL on 2007-06-25 (public-xml-processing-model-wg@w3.org from June 2007)

From: Innovimax SARL <innovimax@gmail.com>
Date: Mon, 25 Jun 2007 21:06:34 +0200
To: "Norman Walsh" <ndw@nwalsh.com>
Cc: public-xml-processing-model-wg@w3.org
Message-ID: <546c6c1c0706251206m4b894d15xf35e62863a2f47d3@mail.gmail.com>
Comments inline

On 6/25/07, Norman Walsh <ndw@nwalsh.com> wrote:
> / Innovimax SARL <innovimax@gmail.com> was heard to say:
> | General remarks on the spec related to this part
>
> Thanks, Mohamed! Comments inline. (Including some questions for the
> WG, so read on, everyone :-)
>
> | The spec still speaks (and I think that is a good thing) of the
> | definition of a default input port and a default output port
> |
> | [[
> | Each step may have a default output port. [Definition: If a step has
> | exactly one output port, or if one of its output ports is explicitly
> | designated as the default, then that output port is the default output
> | port of the step.] If a step has more than one output port and none is
> | explicitly designated the default, then the default output port of
> | that step is undefined.
> | ]]
> |
> | So I'm proposing that we be clear for steps with multiple inputs or outputs
>
> What needs to be clarified?

For each step that has more than one input or more than one output,
the spec should say which one is the default, if any.
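
A minimal sketch of the quoted definition, in Python rather than XProc.
The function name and the `designated` parameter are hypothetical and
only stand in for "explicitly designated as the default"; nothing here
is taken from the spec beyond the rule itself.

```python
from typing import Optional, Sequence


def default_output_port(ports: Sequence[str],
                        designated: Optional[str] = None) -> Optional[str]:
    """Return the default output port of a step, or None if it is undefined.

    Assumption: `designated` models a port explicitly marked as the default.
    """
    if designated is not None:
        if designated not in ports:
            raise ValueError(f"{designated!r} is not an output port of this step")
        return designated
    if len(ports) == 1:
        # Exactly one output port: it is the default.
        return ports[0]
    # Zero ports, or several ports with none designated: undefined.
    return None
```

With two ports such as "matched" and "not-matched" and no designation,
the function returns None, which is the case the spec would have to
settle step by step.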


>
> | General remarks on the component
> |
> | Please remove sequence="no" everywhere since it is confusing
>
> Done.
>
> | A.1.1 Count
> |
> | Can we remove the [Proposed] status ?
>
> Done.
>
> | A.1.2 Delete
> |
> | I'm ok with match semantics here
>
> Yes, clearly either match or select work for delete. My only concern
> is that delete, insert, and rename all use *the same* option.

In that case, I propose using select for all three (even if select
doesn't make much sense for delete)


>
> | A.1.3 Equal
> |
> | I'm reluctant to see a direct reference to XPath 2.0. I would feel
> | better if it was just a copy/paste of what we need
>
> Yuck! That means we have to track errata, etc. Duplication of
> information is bad :-)

Hopefully it will only be for V1.
I don't want people to argue against XProc because they need to
implement XPath 2.0, and that would be the case if we referenced
XPath 2.0 normatively

>
> | I'm not sure it would be simple to compare two sequences of documents
> | (if someone has a clear idea on how to do that with current components...)
>
> We don't allow a sequence of documents on either of the p:equal step's
> ports. So, looking at the text of fn:deep-equal, it seems to me that
>
>   This function assesses whether two sequences are deep-equal to each
>   other. To be deep-equal, they must contain items that are pairwise
>   deep-equal; and for two items to be deep-equal, they must either be
>   atomic values that compare equal, or nodes of the same kind, with
>   the same name, whose children are deep-equal.
>
> works just fine. In our case, the two sequences that will be compared
> each contain exactly one document item. It follows that they are deep
> equal if and only if their children are deep-equal. Comparing a
> sequence of elements is a requirement for the function, so it all
> "just works".
>
> Am I overlooking something?

Could you give an example of such a comparison in XProc using the
*non-sequence* version of p:equal?

Say I have two sequences and I want to compare them.
I fear that such a thing will take more than 10 lines of code...
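
For reference, the pairwise comparison under discussion can be sketched
outside XProc in Python. This is only a rough approximation of
fn:deep-equal, not an implementation of it: it compares element names,
attributes, text and children, and the function names are hypothetical.
It says nothing about how many lines the XProc equivalent would need.

```python
import xml.etree.ElementTree as ET


def elements_equal(a: ET.Element, b: ET.Element) -> bool:
    """Approximate deep equality of two elements (names, attributes, text, children)."""
    if a.tag != b.tag or a.attrib != b.attrib:
        return False
    if (a.text or "") != (b.text or ""):
        return False
    if len(a) != len(b):
        return False
    # Children must be pairwise equal, including the text that follows each child.
    return all(
        elements_equal(x, y) and (x.tail or "") == (y.tail or "")
        for x, y in zip(a, b)
    )


def sequences_equal(left: list, right: list) -> bool:
    """Compare two sequences of serialized documents pairwise."""
    if len(left) != len(right):
        return False
    return all(
        elements_equal(ET.fromstring(l), ET.fromstring(r))
        for l, r in zip(left, right)
    )
```

The single-document case discussed above is just the special case where
both lists have length one.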

>
> | Do we want  <p:input port="source"/> to be the default ? or no default ?
>
> Our spec currently allows any (and all) input ports to be defaulted. I
> don't think we want to special case that for p:equal. (Or any of the
> other components.)

I think that this is a chicken-and-egg problem. It's fine to allow
me to default anything I want, but what will the default behavior be?

>
> | A.1.4 Error
> |
> | Here is the definition of the error content
> | E.2 err:error
> |
> | Each specific error is represented by an err:error element:
> |
> | <err:error
> |  name? = NCName
> |  type? = QName
> |  code? = QName
> |  href? = anyURI
> |  line? = integer
> |  column? = integer
> |  offset? = integer>
> |   (anyElement*)
> | </err:error>
> |
> | We should provide all the attributes to be in sync with
> |
> | Since we allow err:error to have any content, I propose allowing a
> | content input that supplies such content if necessary
> |
> | It is not clear what the rules are for such a limitation?
> |
> | How is the name computed ?
>
> I'm sorry, I don't understand your questions. All of the attributes are
> optional and you can put any content you want inside err:error. I guess
> that should really be (anyElement|text)*

I think err:error is fine. My problem is rather with p:error, which is
not able to create err:error elements that are as rich



>
> | A.1.5 Escape Markup
> |
> | Wasn't there a time where a match pattern was proposed ?
> | If not, I wouldn't mind since we should be able to do that through a
> | p:viewport step
>
> Alex?
>
> | A.1.7 Insert
> |
> | I still don't know why we limit ourselves to a match pattern here?
> | Maybe at this point the match pattern **allows** nested visiting? If
> | so, please make it clearer
> | This one matters to me, since we don't have recursion in XProc
>
> What do folks think?


>
> | A.1.8 Label Elements
> |
> | I think that it was discussed to allow a select attribute here to say
> | which elements are concerned. Please add it
>
> Ok.
>
> | This one also matters to me, since with viewport we cannot be sure
> | of getting an xml:id-valid document when the pieces are concatenated
> | back together
> |
> | Trough reading. Apart from p:label that should enforce xml:id rules
> | (NCName and unique), we don't have any component that is xml:id aware,
> | do we ?
>
> No, I don't think so. But we don't have any other steps that care about
> IDs either.
>
> | A.1.9 Load
> |
> | Please rename validate to dtd-validate (to avoid confusion)
>
> That seems sort of clumsy to me. Anyone else have an opinion?

Or maybe validate-dtd

Another point: if I don't have a <!DOCTYPE inside the document, I
won't be able to use a DTD. Am I missing something?

>
> | It is not clear whether an error will be thrown if external documents
> | are not accessible
>
> I believe a validating parser must fail if it can't load the external
> subset.
>
> | I think we should add xmlid-validate too
>
> Hmm. Opinions?
>
> | A.1.10 Namespace Rename
> |
> | It is clear that @from is a *list* of namespaces
>
> I think that's a mistake. I think it should be a single namespace.
>
> | But it is not clear for @to : is it also a list (à la translate) ? or
> | a single destination ?
>
> Which resolves that question
>
> | what if we have empty list in @from ?
>
> and allows an empty 'from' to mean the 'null' namespace.
>
> | what if we have empty string in @to ?
>
> I think that deletes the namespace.
>
> | what if we use http://www.w3.org/XML/1998/namespace in @from ? in @to ?
>
> that should be an error, as should the XMLNS namespace.
>
> | What kind of errors can such a component throw?
>
> I think it can only fail if the XML or XMLNS namespace is specified.
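
The answers above can be summarised as a small Python sketch. It is an
illustration only, under these assumptions: 'from' and 'to' are each a
single namespace name, an empty 'from' means the null namespace, an
empty 'to' removes the namespace, and the XML or XMLNS namespace is an
error. The function names are hypothetical, and the sketch only touches
element names; whether attribute names are affected is left open.

```python
import xml.etree.ElementTree as ET

XML_NS = "http://www.w3.org/XML/1998/namespace"
XMLNS_NS = "http://www.w3.org/2000/xmlns/"


def split_tag(tag: str):
    """Split an ElementTree tag in '{namespace}local' form into (namespace, local)."""
    if tag.startswith("{"):
        ns, local = tag[1:].split("}", 1)
        return ns, local
    return "", tag


def rename_namespace(root: ET.Element, from_ns: str, to_ns: str) -> ET.Element:
    """Move every element in `from_ns` into `to_ns`, modifying the tree in place."""
    for ns in (from_ns, to_ns):
        if ns in (XML_NS, XMLNS_NS):
            raise ValueError(f"namespace {ns!r} may not be renamed")
    for elem in root.iter():
        if not isinstance(elem.tag, str):
            continue  # skip comment / processing-instruction nodes
        ns, local = split_tag(elem.tag)
        if ns == from_ns:
            # Empty `to_ns` drops the namespace altogether.
            elem.tag = f"{{{to_ns}}}{local}" if to_ns else local
    return root
```

Under these rules the only failure mode is the ValueError raised for the
two reserved namespaces.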
>
> | A.1.11 Parameters
> |
> | No comment for the moment (I wait for stabilisation of parameter proposal)
>
> Heh.
>
> | A.1.12 Rename
> |
> | The first sentence contradicts the second
> |
> | 1st : "The rename step renames elements or attributes in a document"
> | 2nd : "Each element, attribute, or processing-instruction matched by
> | the match pattern"
> |
> | Same as match in Insert
> | Can we make available the current matched node to the evaluation of
> | @name option (as in String Replace) ?
>
> The value of the name attribute is a string in this case, not an
> expression, so I don't think that makes sense. We could make the value
> an expression, but what would the use case be?
>
> | A.1.13 Replace
> |
> | I have a strong use case, where I want to replace a PI with the
> | content of a document
> | Can we allow matching of PI, too ? (and even comment() ?)
>
> Yes, I think so.
>
> | A.1.14 Set Attributes
> |
> | Tricky case : what about
> |
> | <p:set-attributes match="*">
> |   <p:input port="attributes">
> |     <p:inline>
> |       <root xml:id="fixed" xmlns:toto="http://mynamespace" />
> |     </p:inline>
> |   </p:input>
> | </p:set-attributes>
>
> I think that produces an xml:id invalid document. I don't think the
> toto namespace comes into play.

So if I put a p:label step right after it, will that step fail?
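
A rough Python sketch of why the result is xml:id-invalid. It assumes
that match="*" selects every element and that the step copies the
attributes of the inline element onto each match; both are readings of
the example, not confirmed semantics. The helper names are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter

XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def set_attributes_on_all(root: ET.Element, attributes: dict) -> ET.Element:
    """Copy `attributes` onto every element of the tree (assumed match="*")."""
    for elem in root.iter():
        elem.attrib.update(attributes)
    return root


def duplicate_xml_ids(root: ET.Element) -> list:
    """Return the xml:id values that occur on more than one element."""
    counts = Counter(
        e.get(XML_ID) for e in root.iter() if e.get(XML_ID) is not None
    )
    return sorted(value for value, n in counts.items() if n > 1)
```

In ElementTree the xmlns:toto declaration does not appear in the
attribute dictionary at all, so only xml:id is copied, and every element
ends up with the same ID value.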

>
> | A.1.15 Split Sequence
> |
> | Do we want <p:output port="matched" sequence="yes"/> to be the default
> | or no default output ?
>
> Yes, I think the matched port should be the default output.
>
> | A.1.16 String Replace
> |
> | The first sentence is confusing
> | "The String Replace step matches a set of nodes in the document
> | provided on the source input port and replaces them with a new
> | generated string"
> |
> | please replace by something like
> | "The String Replace step matches a set of nodes in the document
> | provided on the source input port and replaces each matched node with a
> | new generated string"
>
> I think I improved that.
>
> | A.1.17 Store
> |
> | This sentence is not clear
> | "The output of this step is a document containing a single c:result
> | element whose href attribute contains the same value as the href
> | option."
> |
> | I'm still unclear about this one and hope to have a clear idea of
> | the serialisation stuff before last call
>
> We must have a clear understanding of the serialization stuff before
> last call.
>
> | A.1.18 Unescape Markup
> |
> | "The Unescape Markup step takes the text value of the document element
> | and parses the content as if it was and unicode character stream
> | containing XML"
> | s/and unicode/a unicode/
> | "This is the reverse of the serialize step." --> "This is the reverse
> | of the Escape Markup step"
>
> Fixed.
>
> | A.1.19 Unwrap
> | Please make clear that this match pattern works with nested matches
>
> Is that what we want?
>
> | What is the expected result of
> | <p:unwrap match="*/*" />
>
> Depends on how we answer the preceding question :-)
>
> I think I had in mind that in all cases where we use a match pattern,
> we don't recurse into the matched content. But I could be confused :-)

I think that point should be raised, to see whether we stick to:
match implies no recursion
select implies recursion

(I'm not sure that will hold up under analysis)
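
The two readings of <p:unwrap match="*/*" /> can be contrasted with a
small Python sketch. It is an illustration under strong assumptions:
element-only content (text inside or after an unwrapped element is
simply dropped), and the pattern */* modelled by a hypothetical
predicate "element that has a parent element". The `recurse` flag and
the function name are not from the spec.

```python
import xml.etree.ElementTree as ET


def unwrap(elem, matches, recurse, has_parent=False):
    """Return the list of elements that replace `elem` after unwrapping.

    matches(elem, has_parent) decides whether `elem` is matched.
    With recurse=False, the content of a matched element is left untouched.
    """
    if matches(elem, has_parent):
        children = list(elem)
        if not recurse:
            return children  # do not look for matches inside matched content
        replaced = []
        for child in children:
            replaced.extend(unwrap(child, matches, recurse, True))
        return replaced
    # Not matched: keep the element and process its children.
    new_children = []
    for child in list(elem):
        new_children.extend(unwrap(child, matches, recurse, True))
    elem[:] = new_children
    return [elem]
```

For <r><a><b><c/></b></a></r>, the non-recursive reading unwraps only
<a> and leaves <b><c/></b> under <r>, while the recursive reading
unwraps every non-root element and leaves <r> empty.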

>
> | A.1.20 Wrap
> |
> | "The is processed by" --> "The document is processed by"
> |
> | We should put the discussion back on telcon for allowing group-adjacent
>
> Right.
>
> | A.1.21 Wrap Sequence
> |
> | The text is very confusing
> | [[
> | The Wrap Sequence step converts the sequence of documents on the
> | source port into a single document and produces a single document on
> | the result port. The document produced has a new document element
> | whose name is specified via the wrapper option and whose children are
> | the documents in the order recieved on source port. Each document
> | received is converted into a sequence of children by taking the
> | children of the document info item.
> | ]]
> | Is something like this more in line with the proposed behavior ?
> | [[
> | The Wrap Sequence step converts the sequence of documents on the
> | source port into a single document which is produced on the result
> | port. The document produced has a new document element whose name is
> | specified via the wrapper option and whose children are the documents
> | in the order received on the source port.
> | ]]
>
> I think I improved the text.
>
> | Having said that
> |
> | What happens to PIs, spaces and comments outside the document node ?
> |
> | We should also consider having a group-by option on this one
>
> That seems like overkill to me.

Please don't forget that for-each will output a sequence, and if you
want to group its documents recursively, it will be easier to do that
at the wrap sequence level

Having said that, I am open to being persuaded to abandon that idea if
it turns out to be easy to implement with the current tools

>
> | A.1.22 XInclude
> |
> | I think we never solved the p:map proposal for this one ?
>
> Right. I'll put it back on the agenda, but I've given up.

We also never made clear the role of catalogs here

>
> | A.1.23 XSLT
> |
> | I think having a verb "transform" instead of a name
> | "source/result/alternate/insertion/replacement/attributes/matched/notmatched"
> | is very confusing
> |
> | I prefer to have a name and "stylesheet" would fit perfectly
>
> Yes! Done.

Oops, the text still references "transform"
[[input port named 'transform']]


>
> | A.2.1 HTTP Request
> |
> | Norm's proposal seems to lead to consensus, please use it instead
>
> Alex has done the lion's share of the work on this one, I'm going to
> leave it to him to edit.


Great !
>
> | A.2.2 Relax NG Validate
> |
> | Do we want <p:input port="source"/> to be the default or no default ?
> | Are there really no options?
>
> I think we do need an option for the DTD compatibility annotations.
>
> | A.2.3 XML Schema Validate
> |
> | Please detail the expected behavior of the options (I'm unclear
> | about assert-valid and the allowed values for mode)
>
> XSD schema experts, help, please.
>
> | A.2.4 XSLT 2.0
> |
> | I prefer to have a name (instead of a verb) and "stylesheet" would fit perfectly
>
> Yes!

Oops, the text still references "transform"
[[input port named 'transform']]

>
> | Do we want <p:output port="result"/> to be the default output or no
> | default output ?
>
> I think we want the result output to be the default.
>
> | A.2.6 XQuery 1.0
> |
> | Are we sure we won't have any trouble with such a transformation of the query ?
>
> I dunno.
>

-- 
Innovimax SARL
Consulting, Training & XML Development
9, impasse des Orteaux
75020 Paris
Tel : +33 8 72 475787
Fax : +33 1 4356 1746
http://www.innovimax.fr
RCS Paris 488.018.631
SARL au capital de 10.000 €
Received on Monday, 25 June 2007 19:06:37 UTC