Re: XInclude in XML Calabash

Norman Walsh <ndw@nwalsh.com> writes:

> So.
>
> I’m working through the XInclude 1.0/1.1 test suite and I’ve discovered that
> my implementation of XInclude is … buggy.
>
> Correct behavior for XInclude is that nested XInclude elements are expanded
> before evaluating fragids. So you can say
>
>   <xi:include href="book.xml" xpointer="id(chapter1)"/>
>
> and it works even if the xml:id ‘chapter1’ is “behind” an xi:include
> element in book.xml.
>
> Fair enough.

Yes, and our old stand-off markup approach to managing overlapping
markup for annotated linguistic data depends on this.
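
To make the ordering concrete, here is a minimal sketch in Python (plain
xml.etree, nothing to do with the Calabash or Saxon code). It assumes
in-memory documents in a hypothetical DOCS table, treats the xpointer
value as a bare xml:id name, and does no loop detection or fallback
handling. The point is only that nested includes in the target are
expanded before the fragid is looked up.

```python
import xml.etree.ElementTree as ET

XI_INCLUDE = "{http://www.w3.org/2001/XInclude}include"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Hypothetical in-memory documents standing in for files on disk.
DOCS = {
    "book.xml": (
        '<book xmlns:xi="http://www.w3.org/2001/XInclude">'
        '<title>Book</title><xi:include href="ch1.xml"/></book>'
    ),
    "ch1.xml": '<chapter xml:id="chapter1"><title>One</title></chapter>',
}


def load(href):
    """Parse one of the hypothetical documents into a fresh tree."""
    return ET.fromstring(DOCS[href])


def find_by_id(root, wanted):
    """Return the first element whose xml:id equals `wanted`, else None."""
    for element in root.iter():
        if element.get(XML_ID) == wanted:
            return element
    return None


def expand(root):
    """Replace every xi:include below `root` with the content it names."""
    for parent in list(root.iter()):
        for index, child in enumerate(list(parent)):
            if child.tag != XI_INCLUDE:
                continue
            # Expand nested includes in the target first ...
            included = expand(load(child.get("href")))
            # ... and only then evaluate the fragment identifier.
            fragid = child.get("xpointer")
            if fragid is not None:
                included = find_by_id(included, fragid)
                if included is None:
                    raise LookupError(f"no element with xml:id {fragid!r}")
            included.tail = child.tail
            parent[index] = included
    return root


outer = ET.fromstring(
    '<doc xmlns:xi="http://www.w3.org/2001/XInclude">'
    '<xi:include href="book.xml" xpointer="chapter1"/></doc>'
)
expanded = expand(outer)
print(ET.tostring(expanded, encoding="unicode"))
```

Looking for 'chapter1' in book.xml as parsed finds nothing, because the
chapter is still behind the inner xi:include; after expansion the lookup
succeeds.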

> Unfortunately, fixing that bug has a consequence. Attribute types
> defined by DTD validation are lost in the expanded document.

Why?  [attribute type] is an infoset property, so it should be preserved
in the transcluded bit, shouldn't it?

> What this means in practice is that if you’re using DTD validation to
> assign ID types *and* using fragids to point to those IDs, it doesn’t
> work. (It does work if you name your ID attributes xml:id, which I
> hope you all do.)

I don't understand yet...

> I don’t know how to fix this. It’s clear from the Saxonica API docs
> that NodeInfo.getSchemaType() doesn’t return types declared with DTD
> fragments.

Is there no getAttributeType()?

> In theory, I could use NodeInfo.getTypeAnnotation() to find
> out, but the values I get back from that API don’t have the
> IS_DTD_TYPE bit set even when the type comes from the DTD.

Do you mean that getTypeAnnotation() _should_ work?

> (All of this despite the fact that in the parsed document, before I
> copy it, the XPath id() function does work.)

So we need to ask Michael what id() is exploiting there?

> I suppose I should try to construct a test case and report the bug but
> that’s not going to be useful today. And even if I could get the DTD
> type, it’s entirely unclear that I could construct a new tree with
> that type, so I’m not sure it’d help.

But the DTD type should travel with the transcluded bit, shouldn't it?

> On the whole, I think the best choice is to fix the “nested includes”
> bug and just accept that DTD-based ID attribute types won’t work.
> But how painful is that going to be for users, I wonder?

In principle, quite serious.  I don't know how much use that approach to
overlap is getting these days in practice.
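
For what the user would see, a rough analogy in Python (again plain
xml.etree, which never records DTD attribute types, so this is an
assumption-laden stand-in for the copied tree, not Saxon's behaviour).
The two documents and the helper name are made up for illustration.

```python
import xml.etree.ElementTree as ET

XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Hypothetical document: 'name' is typed as ID only by the internal DTD subset.
DTD_TYPED = """<!DOCTYPE book [
  <!ATTLIST chapter name ID #REQUIRED>
]>
<book><chapter name="chapter1"/></book>"""

# Same content, but the ID is carried by xml:id, recognisable by name alone.
XML_ID_TYPED = '<book><chapter xml:id="chapter1"/></book>'


def ids_known_without_dtd(root):
    """Collect the IDs a type-blind tree can still recognise (xml:id only)."""
    return {el.get(XML_ID) for el in root.iter() if el.get(XML_ID) is not None}


print(ids_known_without_dtd(ET.fromstring(DTD_TYPED)))     # set()
print(ids_known_without_dtd(ET.fromstring(XML_ID_TYPED)))  # {'chapter1'}
```

Once the type information is gone, 'name' is just another attribute, so
a fragid pointing at 'chapter1' has nothing to match in the first
document and still resolves in the second.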

ht
-- 
       Henry S. Thompson, School of Informatics, University of Edinburgh
      10 Crichton Street, Edinburgh EH8 9AB, SCOTLAND -- (44) 131 650-4440
                Fax: (44) 131 650-4587, e-mail: ht@inf.ed.ac.uk
                       URL: http://www.ltg.ed.ac.uk/~ht/
 [mail from me _always_ has a .sig like this -- mail without it is forged spam]

Received on Sunday, 17 April 2016 16:25:06 UTC