- From: Henry S. Thompson <ht@inf.ed.ac.uk>
- Date: Thu, 06 Sep 2007 15:20:42 +0100
- To: Norman Walsh <ndw@nwalsh.com>
- Cc: public-xml-processing-model-wg@w3.org
Norman Walsh writes:
> HST wrote:
> | Yes, but it still won't necessarily serialise without work, and it's
> | possible that serialising will introduce failure to round-trip.
> | Suppose the matrix has an ns-attribute for the default namespace, but
> | the included bit consists entirely of no-namespace elts. The
> | serialised result will be borked. To detect this, you have to look at
> | every node in the inserted tree.
>
> I suppose the default namespace *is* a special case. But I don't think
> that's a problem.
>
> Here's a document:
>
> <rootelem xmlns="rootns">
> <div xmlns="xhtml">
> <target/>
> </div>
> </rootelem>
>
> Suppose I want to replace target with some subtree. When do I ever have
> to look at the subtree's descendants?
>
> To insert
>
> <x:otherroot xmlns:x="xxxns">
> <nons/>
> </x:otherroot>
>
> I simply make sure that if there's a default namespace where 'target'
> appears, I undeclare it. Everything else "just works". No?
Yes.
Now consider this case:
<p:rename match="my:foo" new-name="foo" xmlns:my="http://www.example.com/ns"/>
when the input is
<foo xmlns="http://www.example.com/ns">
  <baz>...</baz>
  <baz>...</baz>
  <baz>...</baz>
  <baz>...</baz>
  <baz>...</baz>
  <baz>...</baz>
  <baz>...</baz>
  <baz>...</baz>
</foo>
Fixup in this case will have to not only rename the element from
{http://www.example.com/ns}foo to {}foo, but also remove the xmlns
[namespace attribute] from the <foo> elt and push it down on to _all_
the <baz> elements.
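A rough sketch of that corner case in Python, in case it helps. It is
an illustration only, not anything the draft specifies: serialize() and
split_tag() are made-up helpers, the serializer uses only
default-namespace declarations (no prefixes) and ignores attributes,
escaping and tail text, and the input has three <baz> children rather
than eight.

```python
import xml.etree.ElementTree as ET

NS = "http://www.example.com/ns"
doc = '<foo xmlns="%s">%s</foo>' % (NS, "<baz>...</baz>" * 3)

def split_tag(tag):
    """Split an ElementTree tag ("{uri}local" or "local") into (uri, local)."""
    if tag.startswith("{"):
        uri, local = tag[1:].split("}", 1)
        return uri, local
    return "", tag

def serialize(elem, in_scope=""):
    """Hypothetical serializer: emit xmlns="..." only where the element's
    namespace differs from the default namespace currently in scope."""
    uri, local = split_tag(elem.tag)
    decl = "" if uri == in_scope else ' xmlns="%s"' % uri
    body = (elem.text or "") + "".join(serialize(c, uri) for c in elem)
    return "<%s%s>%s</%s>" % (local, decl, body, local)

root = ET.fromstring(doc)
root.tag = "foo"        # the rename: {NS}foo becomes no-namespace foo
out = serialize(root)   # fixup happens here, at serialization time
print(out)
```

The rename itself touches one node, but the declaration that used to
sit on <foo> now has to appear on every <baz>. The same comparison
emits xmlns="" when a no-namespace element lands under a default
namespace, which is the undeclare case from the earlier example.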
Namespace fixup is _full_ of these silly fiddly messy corner cases,
and I think we will not be thanked by implementors if we make them do
it at every step. I particularly _don't_ want to get into the
business of trying to specify in detail what checks and fixes each
step which _might_ mess things up must do. I think putting the
requirement on serialization, at the margins, is going to be much
simpler to state, understand and implement.
> |> Allowing un-fixed-up markup to flow between steps lets it get deeply
> |> buried in documents through operations that wouldn't normally cause
> |> fixup to be necessary.
> |
> | I don't understand.
>
> My point is just that there are a few steps that allow namespaces to
> get out of whack. If we don't mandate that namespaces are fixed up *on
> those steps*, then *every* step can produce documents that have broken
> namespaces. That just seems awful.
As you point out, the crucial bits will _never_ get screwed up. That
is, the [local name]s and [namespace name]s of elements and attributes
themselves. That means, right there, that we've covered the 99%
case. Getting the [namespace attribute], [in-scope namespaces] and
[prefix] properties right is on the one hand _much_ harder, and on the
other _much_ less important, until and unless you get to serialization.
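To make the distinction concrete, here is a small Python sketch. It
assumes ElementTree's data model, which keeps only the expanded name
of each element and discards prefixes and declarations on parsing; the
two sample documents and the expanded_names() helper are made up for
the illustration.

```python
import xml.etree.ElementTree as ET

# Same expanded names, different prefixes and namespace declarations.
prefixed = '<a:foo xmlns:a="urn:example"><a:bar/><plain/></a:foo>'
defaulted = '<foo xmlns="urn:example"><bar/><plain xmlns=""/></foo>'

def expanded_names(xml_text):
    """Return the "{namespace name}local name" of every element, in document order."""
    return [el.tag for el in ET.fromstring(xml_text).iter()]

names_a = expanded_names(prefixed)
names_b = expanded_names(defaulted)
print(names_a)
print(names_a == names_b)
```

A step working at this level cannot tell the two inputs apart, so the
prefix and declaration choices only have to be settled once, when the
tree is finally written out.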
> |> On a separate, but related, topic, I'm confused about how the SAX
> |> argument plays out. Why is it hard to do this fixup with SAX? When do
> |> you ever have to buffer more than one start element event?
> |
> | SAX filters just pass along what you give them. If we require NS fixup
> | between steps, everyone using a SAX substrate will have to put an NS
> | fixup filter between _every_ pair of steps, won't they?
>
> I don't think so. It just means that, *in steps where namespaces can
> get broken*, *the step* will have to make sure that it doesn't output
> broken elements. But it'll never have to buffer more than one start
> tag to do that, I think.
See above example.
ht
--
Henry S. Thompson, HCRC Language Technology Group, University of Edinburgh
Half-time member of W3C Team
2 Buccleuch Place, Edinburgh EH8 9LW, SCOTLAND -- (44) 131 650-4440
Fax: (44) 131 650-4587, e-mail: ht@inf.ed.ac.uk
URL: http://www.ltg.ed.ac.uk/~ht/
[mail really from me _always_ has this .sig -- mail without it is forged spam]
Received on Thursday, 6 September 2007 14:21:16 UTC