- From: Norman Walsh <ndw@nwalsh.com>
- Date: Sun, 17 Apr 2016 14:31:09 -0500
- To: XProc Dev <xproc-dev@w3.org>
- Message-ID: <87ega4118i.fsf@nwalsh.com>
"Henry S. Thompson" <ht@inf.ed.ac.uk> writes:
> Norman Walsh <ndw@nwalsh.com> writes:
>> Correct behavior for XInclude is that nested XInclude elements are expanded
>> before evaluating fragids. So you can say
>>
>> <xi:include href="book.xml" xpointer="id(chapter1)"/>
>>
>> and it works even if the xml:id ‘chapter1’ is “behind” an xi:include
>> element in book.xml.
>
> Yes, and our old stand-off markup approach to managing overlapping
> markup for annotated linguistic data depends on this.

Indeed. That’s clearly the expected behavior and my implementation
just got it wrong.

>> Unfortunately, fixing that bug has a consequence. Attribute types
>> defined by DTD validation are lost in the expanded document.
>
> Why? [attribute type] is an infoset property, should be preserved in
> the transcluded bit, shouldn't it?

The answer to why is simply that the Saxonica APIs don’t make it
easy.

>> I don’t know how to fix this. It’s clear from the Saxonica API docs
>> that NodeInfo.getSchemaType() doesn’t return types declared with DTD
>> fragments.
>
> Is there no getAttributeType()?

AFAICT, the closest thing to getAttributeType() is getSchemaType().

>> In theory, I could use NodeInfo.getTypeAnnotation() to find
>> out, but the values I get back from that API don’t have the
>> IS_DTD_TYPE bit set even when the type comes from the DTD.
>
> Do you mean, so getTypeAnnotation() _should_ work?

Well…the JavaDoc says:

    /**
     * Get the type annotation of this node, if any. The type
     * annotation is represented as an integer; this is the
     * fingerprint of the name of the type, as defined in the name
     * pool. Anonymous types are given a system-defined name. The
     * value of the type annotation can be used to retrieve the actual
     * schema type definition using the method {@link
     * Configuration#getSchemaType}.
     *
     * The bit IS_DTD_TYPE (1<<30) will be set in the case of an attribute
     * node if the type annotation is one of ID, IDREF, or IDREFS and this
     * is derived from DTD rather than schema validation.
     *
     * @return the type annotation of the node, under the mask
     * NamePool.FP_MASK, and optionally the bit setting
     * IS_DTD_TYPE in the case of a DTD-derived ID or IDREF/S
     * type (which is treated as untypedAtomic for the
     * purposes of obtaining the typed value).
     *
     * For elements and attributes, this is the type annotation as
     * defined in XDM. For document nodes, it should be one of
     * XS_UNTYPED if the document has not been validated, or
     * XS_ANY_TYPE if validation has taken place (that is, if any
     * node in the document has an annotation other than Untyped
     * or UntypedAtomic).
     *
     * @since 8.4. Refinement for document nodes introduced in 9.2
     */

I’m not sure how useful it actually is to know that an attribute was
of type ID or IDREF(S) but not which one. In any event, when I poked
at this with the debugger, the IS_DTD_TYPE bit was not set.
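
For readers following along, the check described in the JavaDoc comes down to plain bit arithmetic on the integer returned by getTypeAnnotation(). The sketch below does not call Saxon at all. It defines local constants: IS_DTD_TYPE is taken from the JavaDoc text above (1<<30), while the FP_MASK value and the sample fingerprint are assumptions made up for illustration, not values read from NamePool.

```java
public class TypeAnnotationBits {
    // From the quoted JavaDoc: bit 30 marks a DTD-derived ID, IDREF, or IDREFS.
    static final int IS_DTD_TYPE = 1 << 30;

    // Assumption: a 20-bit mask standing in for NamePool.FP_MASK.
    static final int FP_MASK = 0xfffff;

    // True when the DTD flag bit is present in the annotation.
    static boolean isDtdDerived(int typeAnnotation) {
        return (typeAnnotation & IS_DTD_TYPE) != 0;
    }

    // Strips flag bits, leaving only the type-name fingerprint.
    static int fingerprint(int typeAnnotation) {
        return typeAnnotation & FP_MASK;
    }

    public static void main(String[] args) {
        int fp = 560;                       // hypothetical fingerprint value
        int flagged = fp | IS_DTD_TYPE;     // what a DTD-derived ID would look like

        System.out.println(isDtdDerived(flagged));
        System.out.println(isDtdDerived(fp));
        System.out.println(fingerprint(flagged));
    }
}
```

Note that the flag says only that the type is one of ID, IDREF, or IDREFS from a DTD; the fingerprint is still needed to tell which.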
>> (All of this despite the fact that in the parsed document, before I
>> copy it, the XPath id() function does work.)
>
> So we need to ask Michael what that's exploiting?

I suppose, since you’re making me feel guilty for trying to just
ignore it :-)

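
To illustrate what "id() works in the parsed document" depends on, here is a minimal sketch using the JDK's built-in DOM parser rather than Saxon (a deliberate substitution, so it says nothing about Saxon's internals). It parses the same content twice, once with an internal DTD subset that declares the attribute as type ID and once without, and checks whether an ID lookup succeeds. The class name, helper names, and sample documents are all hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class DtdIdDemo {
    // The internal subset declares chapter/@id as type ID.
    static final String WITH_DTD =
        "<!DOCTYPE book [\n"
      + "  <!ATTLIST chapter id ID #IMPLIED>\n"
      + "]>\n"
      + "<book><chapter id=\"chapter1\"/></book>";

    // Same content, but nothing tells the parser that @id is an ID.
    static final String WITHOUT_DTD =
        "<book><chapter id=\"chapter1\"/></book>";

    // Parses a string with the default (non-validating) JDK parser.
    static Document parse(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
    }

    // True when the document can resolve "chapter1" as an ID.
    static boolean hasIdLookup(String xml) {
        return parse(xml).getElementById("chapter1") != null;
    }

    public static void main(String[] args) {
        System.out.println("with DTD:    " + hasIdLookup(WITH_DTD));
        System.out.println("without DTD: " + hasIdLookup(WITHOUT_DTD));
    }
}
```

The lookup depends entirely on the attribute type recorded at parse time, which is why a copy that drops that type loses ID resolution even though the attribute values are unchanged.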
>> I suppose I should try to construct a test case and report the bug but
>> that’s not going to be useful today. And even if I could get the DTD
>> type, it’s entirely unclear that I could construct a new tree with
>> that type, so I’m not sure it’d help.
>
> But the DTD type should travel with the transcluded bit, shouldn't it?

Yes it should. Whether it *can* or not in the Saxon APIs is a
different question. A question compounded by the fact that my code
uses the 9.6 APIs and not (yet) the 9.7 APIs.

>> On the whole, I think the best choice is to fix the “nested includes”
>> bug and just accept that DTD-based ID attribute types won’t work.
>> But how painful is that going to be for users, I wonder?
>
> In principle, quite serious. I don't know how much use that approach to
> overlap is getting these days in practice.

Right. So I’ll start with a simple message to the Saxonica list to see
if they believe that what I need to do is possible in principle with
the APIs I have available.

Be seeing you,
norm
--
Norman Walsh
Lead Engineer
MarkLogic Corporation
Phone: +1 512 761 6676
www.marklogic.com
Received on Sunday, 17 April 2016 19:31:38 UTC