Re: Status of Bugzilla Bug 10, Round-tripping various information in properties

On Dec 22, 2005, at 10:44 AM, Julian Reschke wrote:

> Wilfredo's mail didn't arrive here (yet).

   Sorry, I sent it from my apple.com account, and I haven't figured  
out how to subscribe to a W3C list without delivery (so it can post  
without moderation).

> - there's no W3C spec that would license an XML processor to throw  
> away prefixes

   I showed you an example of Python code that parses XML and spits
it right back out, and the namespace prefixes were rewritten.  You said
that wasn't a bug in PyXML.
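
   For anyone who wants to see the effect, here is a minimal sketch.
It uses the standard-library xml.etree.ElementTree as a stand-in (an
assumption on my part; it is not PyXML and not the code from the
earlier exchange).  The urn:example:props namespace and the
round_trip helper are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical property value using the author's chosen prefix "x".
SOURCE = '<x:author xmlns:x="urn:example:props">wsv</x:author>'


def round_trip(xml_text):
    """Parse xml_text and serialize it straight back to a string."""
    return ET.tostring(ET.fromstring(xml_text), encoding="unicode")


result = round_trip(SOURCE)
print(result)

# The expanded name (namespace URI + local name) survives the trip...
assert ET.fromstring(result).tag == "{urn:example:props}author"
# ...but the original prefix declaration does not.
assert 'xmlns:x=' not in result
```

   The element still has the same namespace name and local name after
the round trip; only the prefix label differs.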

   If PyXML is not a compliant XML library because it rewrites the
prefixes, then I agree completely: PyXML is broken and should be
fixed, and we should move on.  But if that's not true, I don't think
a server writer should have to switch to another library or do some
hackery in order to preserve them.

   Geoff asserts that namespace preserving parsers are easy to come  
by.  That may be true for Geoff, and maybe he has a lot of  
flexibility in parser choices because he has no other such  
constraints.  I don't think that's true for everyone.

   Furthermore, in the case where a namespace is declared outside of
the property container, some rewriting is required in order to
maintain the association with the target URI.  So now we have
language in 4.4 saying that changing the property container's prefix,
unlike those of its contents, is not significant, and a suggestion in
4.4.1 that we should rewrite that tag to put all of the
namespaces in there (seems like that should be explicitly stated in
4.4 as a SHOULD as well, for consistency, unless we think tossing the
namespace declarations out is OK, which is the opposite of what I'd
expect).

   Aside from the mild weirdness of "it's significant, except for
over here"... now it's not good enough just to use a parser that can
preserve prefixes; I also need it to let me specify that it must not
edit them, "except in this one element, where I need you to do
the rewriting that you might otherwise have done elsewhere."  What DOM
methods exist for that sort of thing?
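
   To make the extra work concrete, here is a rough sketch of the
rewriting on the container element.  It assumes xml.dom.minidom from
the Python standard library as the DOM; the helper names
(in_scope_declarations, hoist_declarations) and the sample document
are hypothetical, and this is one possible approach, not anything the
draft prescribes.

```python
from xml.dom import minidom

# Namespace URI that DOM uses for xmlns declaration attributes.
XMLNS_NS = "http://www.w3.org/2000/xmlns/"

# Hypothetical response: both prefixes are declared outside D:prop.
DOC = (
    '<D:multistatus xmlns:D="DAV:" xmlns:x="urn:example:props">'
    '<D:prop><x:author>wsv</x:author></D:prop>'
    '</D:multistatus>'
)


def in_scope_declarations(node):
    """Collect xmlns declarations from node upward; nearest one wins."""
    found = {}
    while node is not None and node.nodeType == node.ELEMENT_NODE:
        attrs = node.attributes
        for i in range(attrs.length):
            attr = attrs.item(i)
            if attr.name == "xmlns" or attr.name.startswith("xmlns:"):
                found.setdefault(attr.name, attr.value)
        node = node.parentNode
    return found


def hoist_declarations(element):
    """Copy inherited declarations onto element so it stands alone."""
    for name, value in in_scope_declarations(element.parentNode).items():
        if not element.hasAttribute(name):
            element.setAttributeNS(XMLNS_NS, name, value)


dom = minidom.parseString(DOC)
prop = dom.getElementsByTagNameNS("DAV:", "prop")[0]

before = prop.toxml()      # prefixes are used but not declared here
hoist_declarations(prop)
after = prop.toxml()       # declarations now live on the container

print(before)
print(after)
```

   Without the hoisting step, the serialized container uses prefixes
that are declared nowhere in it, so the association with the
namespace URIs is lost once it is stored on its own.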

> - there are multiple W3C specs that use prefixes in attribute  
> values (or even text content), including XSLT and XML Schema

   And I'm all for servers that care about enabling the use of these
specs bending over backwards to accommodate them, but I don't think
that other servers need to care.

   Additionally:

     http://www.w3.org/TR/REC-xml-names/

   Section 3: "Note that the prefix functions only as a placeholder  
for a namespace name." (Note "only" is emphasized in the text...)

   It's not unreasonable to think that having parsed that data, when
rendering it back out, the placeholder might be relabeled without
negative consequences.  Regardless of that assumption, that sentence
makes it pretty clear that using the prefix for any other purpose is
inconsistent with REC-xml-names.
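
   The "placeholder" reading can be illustrated with a small sketch.
It assumes Python's standard xml.etree.ElementTree, which exposes
each element under its expanded name; the namespace URI and element
name are invented for the example.

```python
import xml.etree.ElementTree as ET

# Two serializations that differ only in the prefix chosen.
with_x = ET.fromstring('<x:author xmlns:x="urn:example:props">wsv</x:author>')
with_y = ET.fromstring('<y:author xmlns:y="urn:example:props">wsv</y:author>')

# A namespace-aware consumer sees the same expanded name and content.
same_name = with_x.tag == with_y.tag
same_text = with_x.text == with_y.text
print(with_x.tag, same_name, same_text)
```

   Under that reading the two documents carry the same information,
which is why relabeling the prefix looks harmless to a
namespace-aware processor.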

> A property value serialized as #PCDATA (thus as escaped XML) is  
> something else than a property value serialized as XML. If you  
> control the format, such as when you define the property in a spec,  
> you sure have the freedom to say it's text, instead of XML. But  
> this requires that senders and recipients agree on that. But in  
> general, a client doesn't have that choice.

   If you define the property, senders and receivers always have to  
agree to honor your definition.  That's not unique here.  To me  
that's an argument for saying that such definitions SHOULD use #PCDATA.

   Geoff, I'm not following your argument that some clients might not
de-serialize the data.  If you have XML there, you have structured
data.  Either the client understands the property and handles it
properly or it doesn't.

> Furthermore, putting escaped XML into property values *will* have  
> negative effects on generic clients (which will display angle  
> brackets to the user) and some protocol extensions (such as DASL).

   What's a generic client going to show the user for structured XML  
data that it doesn't understand?  Generic client tends to refer to a  
tool for l33t haxx0rz.

   Perhaps I shouldn't get all bent out of shape over a SHOULD.

	-wsv

Received on Friday, 23 December 2005 02:50:55 UTC