Latest version of the Infoset

John,

Joseph pointed me to the XML-DSig WG comments, and it seems
that you're not aware of the latest changes in the
Infoset.

From
http://lists.w3.org/Archives/Public/w3c-ietf-xmldsig/2001JanMar/att-0149/00-part

> Information Set provides for Entity and CDATA start and end mark information
> items.

This is no longer true. Entity and CDATA start and end mark information items
were removed from the latest version of the Infoset:
http://www.w3.org/TR/2001/WD-xml-infoset-20010316

> Yes.  Suppose I have an external entity of the following form:
> <?PI1?> <e> <?PI2?> </e> <?PI3?>
> such that substitution into a document of the form
> <doc> &extEnt; <?PI4?> </doc>
> becomes
> <doc><?PI1?> <e> <?PI2?> </e> <?PI3?> <?PI4?> </doc>
> I want to retain the original meaning of element 'e' as well as PI1, PI2 and PI3. 
> One way to do this is to add an xml:base attribute that preserves the base URI
> of the external entity in the replacement text.  Problem is, since xml:base is
> an attribute, it will cover PI2 by adding it to 'e', but it will not cover PI1
> and PI3, which therefore experience a change in meaning when substituted into
> the document if they use the base URI *and* if the document has a different
> base URI than the external entity (which is quite reasonable).
> If I knew the start and end of the external entity substitution text, as well
> as its base URI, I could do something like the following:
> <doc><c14n:baseURI xmlns:c14n="..." xml:base="refer to extEnt"><?PI1?> <e> <?PI2?> </e> <?PI3?></c14n:baseURI> <?PI4?> </doc>
> The resulting XML may not be directly usable since the addition of such an element
> has made 'e' the grandchild of doc, where it used to be the child, so embedded
> self-referential XPaths may fail.  However, this technique is useful for signatures
> since it generates an XML string that is 1) easily seen to mean the same thing as
> the source document because it is XML, and 2) is different based on whether a PI
> like PI3 or PI4 are in the external entity or in the main document.

The Infoset no longer preserves the base URI of PIs when you serialize the
document:
[[[
2.4. Processing Instruction Information Items
[...]
[base URI] The base URI of the PI, as computed by the method of XML Base [XML
Base]. If the PI appears directly in the document entity, and the URI of the
document entity is not known, and there is no xml:base declaration in effect,
then the value of this property is unknown. Note that if an infoset is
serialized as an XML document, it will not be possible to preserve the base URI
of any PI that originally appeared at the top level of an external entity,
since there is no syntax for PIs corresponding to the xml:base attribute on
elements.
]]]

-- XML Information Set
http://www.w3.org/TR/2001/WD-xml-infoset-20010316/#infoitem.pi
Fri, 16 Mar 2001 22:42:59 GMT

In your case, one solution would be to encapsulate each PI in a c14n:baseURI
element:
<doc><c14n:baseURI xmlns:c14n="..." xml:base="refer to extEnt"><?PI1?></c14n:baseURI> <e> <?PI2?> </e> <c14n:baseURI xmlns:c14n="..." xml:base="refer to extEnt"><?PI3?></c14n:baseURI> <?PI4?> </doc>
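
As a rough sketch of that wrapping step (nothing the Infoset or the C14N drafts define), here is how it might look in Python with xml.dom.minidom. The function name wrap_pis, the namespace URI urn:example:c14n and the base URI http://example.org/extEnt.xml are placeholders invented for illustration. The set of PI targets that came from the external entity has to be supplied by the caller, since that information is no longer available after substitution.

```python
from xml.dom import minidom


def wrap_pis(parent, targets, base_uri, wrapper_ns):
    """Wrap each PI child of `parent` whose target is in `targets`
    in its own c14n:baseURI element carrying xml:base.

    `wrapper_ns` is a stand-in for the namespace elided as "..." above.
    """
    doc = parent.ownerDocument
    # Iterate over a copy, because the child list is modified in the loop.
    for node in list(parent.childNodes):
        if (node.nodeType == node.PROCESSING_INSTRUCTION_NODE
                and node.target in targets):
            wrapper = doc.createElementNS(wrapper_ns, "c14n:baseURI")
            wrapper.setAttribute("xmlns:c14n", wrapper_ns)
            wrapper.setAttribute("xml:base", base_uri)
            # Put the wrapper where the PI was, then move the PI inside it.
            parent.replaceChild(wrapper, node)
            wrapper.appendChild(node)
    return parent


# The document after entity substitution, as in the example above.
source = "<doc><?PI1?> <e> <?PI2?> </e> <?PI3?> <?PI4?> </doc>"
dom = minidom.parseString(source)

# Hypothetical values: PI1 and PI3 are the top-level PIs of extEnt.
wrap_pis(dom.documentElement, {"PI1", "PI3"},
         "http://example.org/extEnt.xml", "urn:example:c14n")

result = dom.documentElement.toxml()
print(result)
```

With this approach 'e' stays a child of doc, PI2 is left alone, and PI4 is distinguishable from PI3 because only the latter sits inside a wrapper.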

Regards,
Philippe

Received on Tuesday, 27 March 2001 19:20:27 UTC