- From: <keshlam@us.ibm.com>
- Date: Wed, 14 Jun 2000 14:22:42 -0400
- To: Dan Connolly <connolly@w3.org>
- cc: xml-uri@w3.org
Pure brainstorming on how the Namespace decision might impact the DOM:

DOM Level 2 currently assumes what amounts to the Literal interpretation -- the namespace name is just stored and retrieved as a string.

Forbid would not require any redesign, since absolute URIRefs can also be stored and compared literally. We _could_ add a syntax-check to make sure that attempts to use relative syntax were caught and rejected, or we could leave that to the caller, or we could leave that as a quality-of-implementation issue.

Absolutize opens a few cans of worms and seems to require some real redesign work. Note that this all has storage/computation implications which may impact the suitability of the DOM as a model for some tasks.

1) What are we absolutizing in terms of? DOM L2 doesn't guarantee that the base URI can be retrieved. If you have Entity Reference nodes, you can search upward until you find one; if you hit the top of the tree, you can then assume the base URI of the Document. But folks insisted that the DOM allow "flattening" of Entity References; if that's done, these wrapper nodes are discarded and there's no good place to hang the context-change information. This also runs into the annoying question of what the base URI of an entirely synthetic document is, and how to absolutize in that case.

2) Who's absolutizing? Arguably, since the absolute URI is "the real namespace identity" in this scenario, nobody should be asking the DOM to deal with anything else. Among other things, asking us to recheck a name that's been previously checked is a waste of cycles. On the other hand, if someone went to the trouble of specifying a relative name they might expect to be able to use it even though it _ISN'T_ the real name.

3) What's stored? For round-tripping purposes, a namespaced node would have to know both the absolute URI (because that would be the "real" namespace identity that lookups and attribute-conflict detection would want to use) and the "as typed" form (because when we serialize the DOM to XML syntax, we can _NOT_ lose the fact that it was relative, since that would nail down an interpretation that the document's author explicitly decided to leave fuzzy). Nodes probably do NOT want to carry more pointers than they have to, and we don't want to risk these strings getting out of sync (last thing you want is to let someone claim that http://foo/bar/baz has the serialized form "../somethingElse"!)... which seems to suggest that namespace declarations now become objects in their own right, carrying both strings. That means an additional layer of indirection each time you want to check the namespace, either in your own code or inside the DOM. And it seems to be more complexity than the Infoset is tracking....

4) Inconsistent serialization can result. What happens if you have a node whose as-written namespace is "..\foo", and DOM editing moves/copies it to a portion of the tree where a different base URI is in effect, followed by writing this out as XML syntax? As far as I can tell, the answer is that we output the relative form and simply accept the fact that when the document is read back in the node will be in a different namespace -- just as if the whole document had been relocated in the meantime. That's ugly but apparently inherent in relative namespaces.

Undefined opens a somewhat different can of worms: It isn't clear that a definition could be installed later without breaking significant numbers of documents/code when an interpretation does become agreed upon... and we've just seen how much pushback there can be over a behavioral change.

______________________________________
Joe Kesselman / IBM Research
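
To make points 3 and 4 concrete, here is a minimal Python sketch of the "declaration as an object carrying both strings" idea. It is not DOM API: `NamespaceDecl` and `declare` are hypothetical names, the example.org URIs are made up, the relative name is written with a forward slash, and Python's `urljoin` merely stands in for whatever absolutization rule might eventually be agreed upon.

```python
from dataclasses import dataclass
from urllib.parse import urljoin


@dataclass(frozen=True)
class NamespaceDecl:
    """Hypothetical pairing of the two strings discussed in point 3."""
    as_written: str  # what the author typed; kept so serialization can round-trip
    absolute: str    # resolved identity; what lookups and conflict checks compare


def declare(as_written: str, base_uri: str) -> NamespaceDecl:
    # Both strings are set together at construction and the object is frozen,
    # so they cannot drift out of sync afterwards.
    return NamespaceDecl(as_written, urljoin(base_uri, as_written))


# A node declared under one base URI (assumed values, for illustration only).
original = declare("../foo", "http://example.org/a/b/doc.xml")

# The same as-written name, re-read after the node has been moved to a part
# of the tree where a different base URI is in effect (point 4).
moved = declare(original.as_written, "http://example.org/x/y/z/other.xml")

print(original.absolute)
print(moved.absolute)
print(original.absolute == moved.absolute)
```

The serialized form is identical in both cases, yet the two declarations compare unequal on their absolute form, which is the relocation effect described in point 4. The extra object is also the additional layer of indirection mentioned in point 3.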
Received on Wednesday, 14 June 2000 14:23:27 UTC