XProc Minutes 6 Sep 2007 from Norman Walsh on 2007-09-06 (public-xml-processing-model-wg@w3.org from September 2007)

From: Norman Walsh <ndw@nwalsh.com>
Date: Thu, 06 Sep 2007 12:36:53 -0400
To: public-xml-processing-model-wg@w3.org
Message-ID: <m28x7jn42i.fsf@nwalsh.com>
See http://www.w3.org/XML/XProc/2007/09/06-minutes

W3C[1]

                                   - DRAFT -

                            XML Processing Model WG

Meeting 83, 6 Sep 2007

   Agenda[2]

   See also: IRC log[3]

Attendees

   Present
           Norm, Mohamed, Paul, Henry, Alessandro, Rui, Richard, Michael,
           Andrew, Alex, Murray

   Regrets

   Chair
           Norm

   Scribe
           Norm

Contents

     * Topics
         1. Accept this agenda?
         2. Accept minutes from the previous meeting?
         3. Next meeting: telcon 13 September 2007
         4. Comments on the new draft
         5. Namespace fixup?
         6. Rename p:equal to p:compare?
         7. Semantics of p:label-elements
     * Summary of Action Items

     ----------------------------------------------------------------------

  Accept this agenda?

   -> http://www.w3.org/XML/XProc/2007/09/06-agenda

   Accepted.

  Accept minutes from the previous meeting?

   -> http://www.w3.org/XML/XProc/2007/08/30-minutes

   Accepted.

  Next meeting: telcon 13 September 2007

   No regrets.

  Comments on the new draft

   -> http://www.w3.org/XML/XProc/docs/langspec.html

   None heard.

  Namespace fixup?

   Norm: There's been a thread on this.
   ... The two positions seem to be: only when you serialize vs. on every
   step.

   Michael: Am I right to assume that the folks who want it only on
   serialization are proposing a namespace fix-up step?
   ... Should we add a p:namespace-fixup step?

   Henry: It's not a deeply technical issue, it's a complex one where people
   are trying to figure out what users are going to find most useful.

   Alex: My proposal wasn't that we have namespace fixup but that we don't do
   wrong things. Don't create the problems that require namespace fixup.
   ... I think the most extreme position is that we do namespace fixup all
   the time.
   ... A more moderate position is to say that the steps in our library don't
   mangle namespaces.
   ... Then there's the "steps can do anything they want" position.

   Henry: It would be possible to do option 1, but that would require a lot
   of analysis. My feeling is that that would be a substantial job of work.
   ... Every time I think I've understood a reasonable subset of what those
   things are, I've come up with more.

   Alex: I'm on the side of saying that we need to make our step library not
   cause these problems as much as possible.

   Henry: I think it's irritating but true that there's no published
   description of what namespace fixup means, so we don't have something we
   can refer to.
   ... Nonetheless, in various toolsets and libraries, there are
   serialization libraries that do some subset of necessary fixup.
   ... It's the fact that those are there, and that the analysis we'd need
   is not, that inclines me towards saying this gets enforced at the margins.
   ... That's why my bias is in that direction.
   ... I also think it will make the spec easier to read.
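
   Henry's remark about serializers that perform some subset of fixup can be
   illustrated with a minimal sketch. It assumes Python's standard
   xml.etree.ElementTree as a stand-in for such a library; the element names,
   the urn:example:a namespace and the unwrap() helper are hypothetical, and
   this is not XProc's own behavior.

```python
import xml.etree.ElementTree as ET

# Hypothetical input: the prefix "a" is declared on <wrapper> only.
SOURCE = '<doc><wrapper xmlns:a="urn:example:a"><a:item/></wrapper></doc>'


def unwrap(parent, target):
    """Replace `target` (a direct child of `parent`) with its own children."""
    index = list(parent).index(target)
    parent.remove(target)
    for offset, child in enumerate(list(target)):
        parent.insert(index + offset, child)


root = ET.fromstring(SOURCE)
unwrap(root, root.find("wrapper"))

# ElementTree stores expanded names ({uri}local) and forgets the original
# prefix, so its serializer invents a prefix and a declaration on output.
serialized = ET.tostring(root, encoding="unicode")
print(serialized)

# The result is namespace-well-formed and names the same element,
# but the author's "a" prefix is gone.
reparsed = ET.fromstring(serialized)
print(reparsed[0].tag)
```

   Here the repair happens only "at the margins", when the tree is written
   out, and it preserves namespace names rather than prefixes.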

   Alex: That means that if you insert an element, there's no requirement to
   copy the in-scope namespaces.
   ... I think we need to say something about that in at least some cases.
   ... Then half the battle is won.

   Henry: We don't say enough about what we mean when we refer to nodes, which
   we do. The Schema Rec says what infoset properties are required for
   elements; we haven't said that.
   ... We haven't been at all clear about whether the prefix property is one
   we care about.
   ... In-scope namespaces, namespace attributes, we could go down the list.

   Alex: I agree we need to say something about that. I think users want
   prefixes to be preserved.

   Henry: Maybe this is a compromise: without being specific, implementations
   should preserve as much information as possible and produce complete
   infosets but enforcement is only at the margins. Mention things like
   prefixes and namespace attributes, etc.
   ... The reason local names and namespace names go without saying is that
   they're what's necessary for XPath expressions to work.
   ... The question of what about the rest of the document, it seems to me,
   it would be much simpler and allow us to get to L/C, to allow a general
   health and safety warning at the beginning.

   Richard: One possibility would be to define the micro components in terms
   of trivial XSLT stylesheets because then XSLT has already defined what you
   have to do with prefixes and how serialization is supposed to work and
   which things are not allowed.

   Alex: Has XSLT really said that much?

   Richard: XSLT has said that applications don't have to use the prefixes.
   ... It says that reading what's serialized should produce the same data
   model except for some obvious cases, like extra namespace nodes.

   Alex: XSLT 2.0 is in a nice situation because they got rid of this
   problem. You must copy the namespace declarations now.
   ... It's only XSLT 1.0 that has this problem, and our other steps.

   Richard: I was thinking of XSLT 1.
   ... It does give the rule that you have to get the same thing back.

   Henry: But its model is quite impoverished compared to our own.

   Straw poll: Simple binary choice between saying that the spec should
   guarantee trivially serializable documents between steps or not?

   scribe: Or should we enforce that requirement only at serialization time.
   ... And that leaves open the question of how we do the former if we do it.
   We can just state it and leave it up to the implementation, or we can try
   to do all the analysis necessary.

   Alessandro: I'm curious because I can't picture what would be the
   difference between components doing the fixup or the serializer doing the
   fixup. Can't we just leave it all to the implementors?

   Richard: That will result in different implementations behaving
   differently.
   ... But maybe the only ones that will be different should be considered in
   error anyway.

   Henry: Is there anyone who doesn't think that we should guarantee our
   output is w/f XML?

   No. Whew. :-)

   Norm: Can you even *tell* if a step doesn't do fixup?

   Richard: Suppose that the pipeline generates a stylesheet, then the
   namespace bindings on those elements are going to be used. If you did
   fixup that put a namespace binding on one of those elements, then that
   could change the meaning of the XPath.

   Norm: Yeah, alright.

   Richard: But it seems to me that that's a bogus program anyway.
   ... Why was it doing that?

   Henry: What that points towards is something which says "it is
   implementation-dependent how much is detected by the processor with
   respect to that kind of issue but this is unlikely to cause significant
   interoperability problems unless you're doing something dodgy anyway"

   Murray: So I've been reading the email and listening and I'm not sure I
   even understand what XProc is about. Maybe a few simple questions will
   help.
   ... If I read in an XML document, there are NS bindings and uses.
   ... As I go through various steps, I may be adding and removing things.
   This could result in missing, new, or conflicting namespace bindings.

   Richard: Yes. But you don't have to be doing anything particularly bad to
   do this. Just add a wrapper around an element and that wrapper must have a
   namespace declaration for whatever prefix you use.
   ... And that might conflict with one you've already used.

   Straw poll: Should we put a health warning in the spec and ask for
   priority feedback, rather than trying to nail this ourselves now.

   Murray: The results should be not just well formed XML but faithful to the
   spirit of the author.

   Richard: The delete example is a good one.

   Norm: I think rename is the culprit here, not delete. Delete deletes the
   whole subtree.

   Michael: Unwrap rather than delete would give you the problem.

   Paul: My only concern with the health warning is that we're supposed to go
   to Last Call with no other issues.
   ... We need to make sure that we don't make it sound like an open issue.
   We need to say this is what we think the answer is and see if that
   satisfies people.

   Murray: In the GRDDL spec we put a health warning in about validation in
   our Last Call.

   <ht> Proposal: "Atomic steps which add, delete or change aspects of XML
   documents may introduce inconsistencies in the relationship between the
   namespace names of elements and attributes, namespace declarations and
   in-scope namespace bindings. The extent to which these inconsistencies are
   detected and repaired on a step by step basis is implementation-defined.
   Such inconsistencies *must* be repaired on serialization. . .

   <ht> (a process usually referred to as 'namespace fixup')

   Murray: Someone asked whether we expected the final serialization to be
   well-formed. I always thought that the output of every step would be well
   formed.

   Henry: That's what we're struggling with.
   ... I should have included 'prefixes' above.

   Richard: But we aren't specifying how they must be repaired.
   ... What serialization is produced to do the repair?

   <MoZ> removing the document can be a repair

   Richard: Let's try a concrete example.
   ... Suppose unwrapping removes an element with a declaration, what happens
   to the children?
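
   Richard's unwrap case can be sketched as follows, assuming Python's
   standard xml.dom.minidom as an example of a tree library whose serializer
   does no fixup. The document and the urn:example:a namespace are
   hypothetical; nothing here is taken from the XProc draft.

```python
import xml.dom.minidom as minidom
import xml.etree.ElementTree as ET

# Hypothetical input: the only declaration of prefix "a" sits on <wrapper>.
SOURCE = '<doc><wrapper xmlns:a="urn:example:a"><a:item/></wrapper></doc>'

doc = minidom.parseString(SOURCE)
wrapper = doc.getElementsByTagName("wrapper")[0]
parent = wrapper.parentNode

# Unwrap: move the children up, then drop the wrapper and, with it,
# the xmlns:a declaration attribute it carried.
for child in list(wrapper.childNodes):
    parent.insertBefore(child, wrapper)
parent.removeChild(wrapper)

# In memory the child still knows its namespace name...
item = parent.firstChild
in_memory_namespace = item.namespaceURI

# ...but the serializer writes the tree as-is, without repairing it.
serialized = doc.documentElement.toxml()

# A namespace-aware parser rejects the result (unbound prefix).
try:
    ET.fromstring(serialized)
    reparse_ok = True
except ET.ParseError:
    reparse_ok = False

print(in_memory_namespace, serialized, reparse_ok)
```

   The inconsistency is invisible between steps that share the in-memory tree
   and only surfaces once the document is serialized and read back.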

   Alex: I think we can point to the serialization spec which does have a
   nice description of this.
   ... There's something in there about reconstructed infosets.

   Richard: I believe that we have to have something that addresses this.
   ... If the element you removed in unwrap had text children, they really
   will lose that namespace.
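
   One reading of this point is that a prefix used inside text content (for
   example an XPath in a generated stylesheet) depends on a binding that
   name-based fixup cannot see. A minimal sketch under that assumption, using
   Python's standard xml.etree.ElementTree; the select element and the
   urn:example:a namespace are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical input: "a:item" is text content, not an element name.
SOURCE = (
    '<doc><wrapper xmlns:a="urn:example:a">'
    '<select>a:item</select>'
    '</wrapper></doc>'
)

root = ET.fromstring(SOURCE)
wrapper = root.find("wrapper")

# Unwrap <wrapper>, keeping its children.
root.remove(wrapper)
root.extend(list(wrapper))

# The serializer only declares namespaces that element and attribute
# names use, so nothing tells it that the text needs prefix "a".
serialized = ET.tostring(root, encoding="unicode")
print(serialized)

# The output is well-formed, but the binding for "a" is gone.
binding_survived = "urn:example:a" in serialized
print(binding_survived)
```

   The output parses without error, so this kind of loss cannot be detected
   by a well-formedness check alone.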

   <MoZ> without adding too much complexity to the problem, I want to add the
   concern about the fact that with p:string-replace, I can replace a string
   with characters that are not allowed in XML 1.0 but are in XML 1.1 (&#1;)
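
   The XML 1.0 versus XML 1.1 difference MoZ raises can be checked with a
   small sketch. The character ranges follow the Char productions of the two
   Recommendations; the function names are hypothetical, and the check
   ignores the XML 1.1 rule that restricted control characters may appear
   only as character references.

```python
import re

# XML 1.0 Char: #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD]
#               | [#x10000-#x10FFFF]
_XML10_CHARS = re.compile(
    r"[\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]*"
)

# XML 1.1 Char: [#x1-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
_XML11_CHARS = re.compile(
    r"[\x01-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]*"
)


def allowed_in_xml10(text):
    """True if every character of `text` matches the XML 1.0 Char production."""
    return _XML10_CHARS.fullmatch(text) is not None


def allowed_in_xml11(text):
    """True if every character of `text` matches the XML 1.1 Char production."""
    return _XML11_CHARS.fullmatch(text) is not None


replacement = "a\x01b"  # contains U+0001, the &#1; of the example above
print(allowed_in_xml10(replacement), allowed_in_xml11(replacement))
```

   A step such as p:string-replace could run a check like this on its
   replacement string, or the check could be left to serialization, which is
   the same trade-off as for namespaces.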

   Henry: I'm perfectly happy in this regard to point to the serialization
   spec for guidance.

   Murray: Here's my thought. We put minimal text in the specification and we
   edit the description of the serialization step so that it spells this out
   in a little more detail and we point to a separate document to detail all
   of this.

   Norm: That won't work on process grounds.

   <MoZ> Please not only namespace

   Henry: Straw poll: Ask the editor to add a health warning about namespaces
   with references to the serialization spec and leave it at that.

   Richard: The effect with respect to the email discussion is the question:
   is it ok to leave it until serialization?

   Henry: This health warning would encourage implementors to do their best
   step by step.

   Paul: I'm happy with that.

   Mohamed: I think it must also include warnings about XML 1.0 vs. XML 1.1.

   Alex: I don't like the health warning.

   Paul: I'd still like to go to last call, unless you think we still have an
   open issue.

   Michael: This sounds like an open issue to me.
   ... I can't support that resolution.

   Norm: With my chair's hat on, I cannot in good conscience claim there
   isn't an issue here.

  Rename p:equal to p:compare?

   Accepted.

  Semantics of p:label-elements

   Norm: It's been suggested that we should use sequential numbers and not
   check for duplicates.

   Henry: If I add an xml:id and a subsequent step already has it, so I think
   the duplicate detection is a complete red herring and gets in the way of
   using this for scoped identifiers.

   Norm: Any objection to removing duplicate detection?

   Accepted.

   Murray: Let's leave it implementation defined.

   Richard: I disagree strongly; the generate-id() in XSLT has that behavior
   and it's a constant source of irritation.
   ... It should be defined exactly what the IDs are.

   Henry: Regression tests have the same problem.

   Alex: Sequential numbering is the suggestion? I'm ok.

   Norm: Any objection to sequential numbering instead of
   implementation-defined?

   <MSM> what is the relevant total ordering here?

   Mohamed: Can we make it an option to make it random?

   Alex: We could add a radix?

   Murray: Why not just make it an option to support sequential numbering,
   but you can implement other schemes if you want.

   Norm: Alas, we're out of time, so I think we'll have to take this one to
   email as well.
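
   The sequential-numbering suggestion can be sketched as follows. This
   assumes document order as the total ordering (the question MSM raises is
   not settled above), labels every element, and overwrites any existing
   xml:id without a duplicate check. The label_elements() function and its
   prefix and start parameters are hypothetical, not the p:label-elements
   signature.

```python
import xml.etree.ElementTree as ET

XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def label_elements(root, prefix="id", start=1):
    """Set xml:id to prefix + counter on every element, in document order.

    Assumption: existing xml:id values are overwritten and no attempt is
    made to detect duplicates.
    """
    for number, element in enumerate(root.iter(), start):
        element.set(XML_ID, f"{prefix}{number}")
    return root


root = label_elements(ET.fromstring("<doc><a><b/></a><c/></doc>"))
labels = [(element.tag, element.get(XML_ID)) for element in root.iter()]
print(labels)
```

   Because the labels depend only on the input document and the two
   parameters, repeated runs give identical output, which is the property
   Richard and Henry ask for with regression tests in mind.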

   Adjourned.

Summary of Action Items

   [End of minutes]

     ----------------------------------------------------------------------

   [1] http://www.w3.org/
   [2] http://www.w3.org/XML/XProc/2007/09/06-agenda
   [3] http://www.w3.org/2007/09/06-xproc-irc
   [7] http://dev.w3.org/cvsweb/~checkout~/2002/scribe/scribedoc.htm
   [8] http://dev.w3.org/cvsweb/2002/scribe/

    Minutes formatted by David Booth's scribe.perl[7] version 1.128 (CVS
    log[8])
    $Date: 2007/09/06 16:34:34 $
Received on Thursday, 6 September 2007 16:37:05 UTC