
Re: removing/adding xmlns:xxx attributes

From: Wendell Piez <wapiez@mulberrytech.com>
Date: Tue, 04 May 2010 14:35:19 -0400
To: XProc-dev@w3.org
Message-ID: <20100504143557.GA12305@mail11e.verio-web.com>
Hi,

I can't speak to XProc and the tools being used here, but I can help 
clear up the business about the namespaces.

The thing to understand is that these aren't really attributes, and 
generally can't be added or removed in the same way attributes can.

Instead, they are namespace declarations, and namespace-conformant 
tools such as XProc processors are apt to add them as needed, when 
needed -- even where a DTD does not allow them.
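
To make the distinction concrete, here is a small sketch using
Python's standard xml.etree.ElementTree, which is namespace-aware.
The sample document and the namespace URI are made up for
illustration; only the element names follow the thread.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; the namespace URI is invented for this example.
DOC = (
    '<book xmlns:law="http://example.com/ns/law" id="b1">'
    '<div1><law:extract>text</law:extract></div1>'
    '</book>'
)

root = ET.fromstring(DOC)

# Only the real attribute is reported; xmlns:law is not among them.
print(root.attrib)    # {'id': 'b1'}

# The declaration was consumed to resolve the prefixed element name.
extract = root.find('div1')[0]
print(extract.tag)    # {http://example.com/ns/law}extract
```

A namespace-aware parser uses the declaration to resolve names and
never hands it to you as an attribute, which is why attribute-oriented
steps have nothing to match.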

See, people aren't the only ones to see the syntax and think "oh, 
these are attributes". DTDs existed for years before namespaces were 
invented, and pretending the declarations are attributes is the only 
way to get DTDs to allow them.

But the proper way to deal with this is to deal with them as 
namespaces, not as attributes.

There are a couple of possibilities here. First, you could perhaps 
contrive the splitting script so that it would remove namespace nodes 
from elements where, according to the DTD, the namespace declaration 
"attribute" is not allowed. (Note that you actually have to remove 
them: they are already there, having been inherited from their 
ancestor; Calabash is not adding them.) In that case, you'd have a 
chance of getting DTD validation to work.
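
The point about inheritance can be checked with a short sketch. It
assumes Python's standard xml.etree.ElementTree and an invented
sample document; the helper name in_scope_prefixes is hypothetical.
It records which prefixes are in scope on each element as the parser
walks the document.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical sample; the namespace URI is invented for this example.
DOC = (
    '<book xmlns:law="http://example.com/ns/law">'
    '<div1><law:extract>text</law:extract></div1>'
    '</book>'
)

def in_scope_prefixes(xml_text):
    """Map each element tag to the sorted prefixes in scope on it."""
    scope = []   # stack of (prefix, uri) declarations currently open
    result = {}
    events = ('start-ns', 'end-ns', 'start')
    source = io.BytesIO(xml_text.encode('utf-8'))
    for event, payload in ET.iterparse(source, events):
        if event == 'start-ns':
            scope.append(payload)   # declaration opens before its element
        elif event == 'end-ns':
            scope.pop()             # declaration closes after its element
        else:
            result[payload.tag] = sorted(prefix for prefix, _ in scope)
    return result

scopes = in_scope_prefixes(DOC)
print(scopes['div1'])   # ['law']
```

The div1 element carries no declaration of its own, yet the law prefix
is in scope on it, so a splitter that copies div1 faithfully takes the
binding along.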

Second, if you removed the namespace declaration pseudo-attribute 
from the declaration for the original root element (in the DTD), the 
namespace wouldn't be in scope on ancestors of law:extract where it 
isn't actually needed, and the splitter would not copy it. (You'd 
still need it to be in scope on the law:extract element itself, 
however, so leave that declaration in place.)

Another alternative is not to validate to a DTD at this point. For 
example, if you had an RNG or XSD variant of the DTD, expressing the 
same constraints, this would presumably work better, since both these 
are namespace-aware technologies.

Yet another possibility would be to run your instances through a 
filter before validating. An XSLT 2.0 transformation could remove the 
namespaces except where they are needed:

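<xsl:stylesheet version="2.0"
     xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
   <!-- Copy every node, dropping namespace nodes the result does not use -->
   <xsl:template match="node()">
      <xsl:copy copy-namespaces="no">
         <xsl:copy-of select="@*"/>
         <xsl:apply-templates/>
      </xsl:copy>
   </xsl:template>
</xsl:stylesheet>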
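
If no XSLT 2.0 processor is at hand, roughly the same cleanup can be
sketched with Python's standard xml.dom.minidom, which (unlike most
namespace-aware APIs) does expose the declarations as attribute
nodes. This is a minimal sketch, not an equivalent of the XSLT: the
function name prune_namespaces and the sample document are
hypothetical, and it assumes only prefixed declarations matter,
ignoring default namespaces and prefixes used inside attribute values
or text.

```python
from xml.dom import minidom

XMLNS_NS = 'http://www.w3.org/2000/xmlns/'

# Hypothetical sample; the namespace URI is invented for this example.
DOC = (
    '<book xmlns:law="http://example.com/ns/law">'
    '<div1><law:extract>text</law:extract></div1>'
    '</book>'
)

def prune_namespaces(element, in_scope=None, declared=None):
    """Move prefixed declarations down to the elements that use them."""
    in_scope = dict(in_scope or {})   # bindings visible in the source
    declared = dict(declared or {})   # bindings already written in the output
    # Collect and remove every prefixed declaration on this element.
    for attr in list(element.attributes.values()):
        if attr.namespaceURI == XMLNS_NS and attr.prefix == 'xmlns':
            in_scope[attr.localName] = attr.value
            element.removeAttributeNode(attr)
    # Prefixes used by this element's own name and its attribute names.
    needed = {element.prefix}
    needed.update(attr.prefix for attr in element.attributes.values())
    needed -= {None, 'xml'}
    # Re-declare only what is needed and not already bound by an ancestor.
    for prefix in sorted(needed):
        if declared.get(prefix) != in_scope[prefix]:
            element.setAttributeNS(XMLNS_NS, 'xmlns:' + prefix,
                                   in_scope[prefix])
            declared[prefix] = in_scope[prefix]
    for child in element.childNodes:
        if child.nodeType == child.ELEMENT_NODE:
            prune_namespaces(child, in_scope, declared)

dom = minidom.parseString(DOC)
prune_namespaces(dom.documentElement)
print(dom.documentElement.toxml())
```

After the call, the declaration sits on law:extract rather than on the
root, so a div1 chunk cut from the result no longer carries it.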

Maybe XProc needs (or already has) a namespace cleanup filter?

Cheers,
Wendell

At 12:54 PM 5/4/2010, you wrote:
>Hi Tom,
>
>Namespace cleanup is hard, especially when DTDs are involved.
>
>My technical understanding of namespaces isn't the best, but I think
>the root of the problem is that DTDs don't support namespaces
>properly.  (For way too much detail, see [1].)  In an ideal world, DTD
>validation wouldn't treat xmlns attributes as "real" attributes, and
>you wouldn't have a problem.
>
>That said, it seems like it should be valid to specifically add (with
>p:add-attribute) or match/delete (with p:delete) namespace declaration
>attributes.  From the namespace spec [2]:
>
>"The prefix xmlns is used only to declare namespace bindings and is by
>definition bound to the namespace name http://www.w3.org/2000/xmlns/.
>It MUST NOT be declared. Other prefixes MUST NOT be bound to this
>namespace name, and it MUST NOT be declared as the default namespace.
>Element names MUST NOT have the prefix xmlns."
>
>So if xmlns is implicitly bound to http://www.w3.org/2000/xmlns/, I
>would think that you could use the xmlns prefix to match the
>attributes.  In that case, it would be a bug in the processor that you
>can't.  But I'll leave it to more knowledgeable folks to tell me that
>I'm wrong :)
>
>[1] http://www.rpbourret.com/xml/NamespacesFAQ.htm#dtd_6
>[2] http://www.w3.org/TR/REC-xml-names/
>
>-James
>
>
>
>On Tue, May 4, 2010 at 11:09 AM, HILLMAN, Tomos <tomos.hillman@oup.com> wrote:
> > Hi List,
> >
> > I'm wondering if anyone has a solution to the following problem:
> >
> > I'm writing a chunking script that takes one large document and
> > filters out particular 'chunks' - say //div1.  The document needs
> > to be valid against a declared DTD.
> >
> > The root element of the original document has an xmlns:law
> > attribute declaration for <law:extract> elements; these may also
> > occur multiple times within each 'chunk'.
> >
> > Unfortunately, although the xmlns:law attribute is allowed both
> > on the original root element and the law:extract element, it is not
> > allowed on the filtered elements by the DTD: however, this is where
> > my xproc processor (calabash under oxygen) adds them.
> >
> > How should I go about removing these attributes from the div1
> > elements and add them to the law:extract elements?  Trying to treat
> > the elements as simple attributes gives errors like 'xmlns
> > namespace not declared'...
> >
> > Help!
> > Tom


======================================================================
Wendell Piez                            mailto:wapiez@mulberrytech.com
Mulberry Technologies, Inc.                http://www.mulberrytech.com
17 West Jefferson Street                    Direct Phone: 301/315-9635
Suite 207                                          Phone: 301/315-9631
Rockville, MD  20850                                 Fax: 301/315-8285
----------------------------------------------------------------------
   Mulberry Technologies: A Consultancy Specializing in SGML and XML
======================================================================
Received on Tuesday, 4 May 2010 18:36:25 GMT
