
Re: Entity Expansion

From: David A. Lee <dlee@calldei.com>
Date: Sat, 27 Dec 2008 10:25:28 -0500
Message-ID: <9191557073CF4C818C1CEDB67F6C83EB@calldei.com>
To: "Norman Walsh" <ndw@nwalsh.com>, "XProc Dev" <xproc-dev@w3.org>

NW Quote:
> If it's not explicit then an otherwise conforming implementation could
> fail this test. Or, for example, adding a p:identity earlier in the
> pipeline would cause the test to fail, which on the surface at least
> seems non-obvious.

How would that cause the test to fail?

It would fail because the expected output is:
---- output
<doc xml:base='http://tests.xproc.org/tests/doc/xml-base-test.xml'>
<p>This has one base URI: <uri>xml-base-test.xml</uri></p>

<?pi in base-test?>

<div xml:base='http://tests.xproc.org/tests/doc/xml-base-chap.xml'>

<p>This has a different base URI: <uri>xml-base-chap.xml</uri>.</p>

<?pi in base-chap?>

</div>

<p>This has the original base URI.</p>

</doc>

Without entity expansion it would be:


<!DOCTYPE doc [

<!ENTITY subdoc SYSTEM "xml-base-chap.xml">
]>

<doc xml:base='http://tests.xproc.org/tests/doc/xml-base-test.xml'>

<p>This has one base URI: <uri>xml-base-test.xml</uri></p>

<?pi in base-test?>

&subdoc;

<p>This has the original base URI.</p>

</doc>

If you compare these they will not be identical, because the base URI is
missing for "&subdoc;" (or for its resulting expansion).

-------- NW Quote:

Most specs that operate on infosets these days expect entities to be
expanded. I'm not sure there's any practical way to "fix" that.


I agree that entity expansion is the default behaviour in most APIs ...
although I'm not sure that's what you're saying.
If you're talking about "specs" ... well, you (and friends) wrote the XProc
spec, so you can "fix" it any way you want :)

If you're talking about "implementation" it's tougher. Most XML parser
implementations I know of make entity expansion optional;
however, I've only found one so far (Saxon's tree models) that gives you access
to the base URI of expanded entities after they are expanded. I admit I
haven't looked "far and wide", but I did find this lacking in, say, StAX.
So to implement add-base-uri correctly in the presence of expanded
external entities, one has to be very choosy about which API/library one is
using, and/or handle expansion of external entities oneself.
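
To make the "handle expansion of external entities yourself" option a bit
more concrete, here is a minimal sketch in Python using the standard
xml.parsers.expat module. It is a stand-in for illustration only, not Saxon
or StAX. The function name element_base_uris and the in-memory "documents"
mapping (used instead of fetching URIs) are assumptions made up for this
example. The idea is simply to keep a stack of base URIs and push a new one
whenever the parser asks for an external entity to be resolved.

```python
import xml.parsers.expat
from urllib.parse import urljoin


def element_base_uris(main_uri, documents):
    """Return (element name, base URI) pairs in document order.

    External entities are expanded by hand so that the URI each element
    came from is still known afterwards. `documents` is a hypothetical
    mapping of absolute URI -> XML text, standing in for real fetching.
    """
    seen = []
    base_stack = [main_uri]      # base URI of the entity being parsed
    root = xml.parsers.expat.ParserCreate()
    root.SetBase(main_uri)
    parser_stack = [root]        # parser currently delivering events

    def start_element(name, attrs):
        # Record the element together with the base URI in effect.
        seen.append((name, base_stack[-1]))

    def external_entity(context, base, system_id, public_id):
        # Resolve the system identifier against the referring entity.
        entity_uri = urljoin(base, system_id)
        sub = parser_stack[-1].ExternalEntityParserCreate(context)
        sub.SetBase(entity_uri)
        base_stack.append(entity_uri)
        parser_stack.append(sub)
        try:
            sub.Parse(documents[entity_uri], True)
        finally:
            parser_stack.pop()
            base_stack.pop()
        return 1                 # non-zero tells expat the entity was handled

    # Handlers set on the root parser are copied to the sub-parsers.
    root.StartElementHandler = start_element
    root.ExternalEntityRefHandler = external_entity
    root.Parse(documents[main_uri], True)
    return seen
```

With the two test documents above loaded into "documents", the div that comes
from &subdoc; (and the p inside it) would be reported with the
xml-base-chap.xml URI, while everything else keeps the xml-base-test.xml URI.
An API that only hands back the already expanded event stream gives you no
place to do this bookkeeping.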

Fortunately I'm using Saxon, so I have access to this (well, I haven't tested
it yet, but the method is there), but it is one issue I'm considering when
thinking about re-implementing pipelines as StAX events ... if you expand
entities you lose the base URI ... so this particular case of the add-base-uri
step is probably only correctly (easily) implementable if it's the first step
in a pipeline.

After entity expansion and then "serialization" of some sort (or using an
API that loses track of where the expanded entities came from), it can't add
the xml:base attributes to anything except the root node anymore.

David A. Lee
Received on Saturday, 27 December 2008 15:26:15 UTC
