Re: Entity Expansion from Norman Walsh on 2008-12-27 (xproc-dev@w3.org from December 2008)

From: Norman Walsh <ndw@nwalsh.com>
Date: Sat, 27 Dec 2008 10:42:28 -0500
To: XProc Dev <xproc-dev@w3.org>
Message-ID: <m2iqp5wrob.fsf@nwalsh.com>

"David A. Lee" <dlee@calldei.com> writes:

> NW Quote:
>> If it's not explicit then an otherwise conforming implementation could
>> fail this test. Or, for example, adding a p:identity earlier in the
>> pipeline would cause the test to fail, which on the surface at least
>> seems non-obvious.
>
> How would that cause the test to fail?
> -----------
>
> It would fail because the output expected is:
[...]
> Without entity expansion it would be
[...]
> ---------------
>
> If you compare these they will not turn out to be identical due to the
> lack of the base URI in the "&subdoc;" or its resultant expansion.

I understand how not expanding entities would cause it to fail, but I
don't see how "adding a p:identity earlier in the pipeline" would cause
entities not to be expanded.

> -------- NW Quote:
>
> Most specs that operate on infosets these days expect entities to be
> expanded. I'm not sure there's any practical way to "fix" that.
>
> -------
>
> I agree that entity expansion is the default behaviour in most APIs
> ... although not sure that's what you're saying.
> If you're talking about "specs" ... well you (and friends) wrote the
> XProc spec so you can "fix" it any way you want :)

Well. Yes and no.

We could require that implementors support the ability to pass
documents that have unexpanded entities between steps, but doing so
would place a very large burden on implementors because, as you say,
many APIs don't support that behavior. We have, in general, tried to
avoid making decisions that would make implementation dramatically
more difficult.

And even if we did, and even if implementors went along, you couldn't
pass those documents to steps that require entities to be expanded,
like schema validation, XSLT, and XQuery, to name the first three that
come to mind.

That's why I said "practical" way to fix it.

> Fortunately I'm using Saxon so I have access to this (well, I haven't
> tested it yet but the method is there), but it was one issue I'm
> considering when thinking about re-implementing pipelines as StAX
> events ... if you expand entities you lose the baseURI ... so this
> particular case of the add-base-uri step is probably only correctly
> (easily) implementable if it's the first step in a pipeline.

I think that's a bug you'd have to work around in the StAX APIs. One
of my early implementation efforts was with StAX and I think I wound
up adding new event types to deal with sequences and base URIs.
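
To make the "new event types" idea concrete, here is a minimal sketch in
Python rather than Java/StAX. It is not the design from that early
implementation; the EnterBase and LeaveBase events, the other class names,
and the example.com URIs are all hypothetical. The assumption is that the
producer emits an extra event whenever the base URI changes (for instance
on entering or leaving an expanded entity), so a consumer can recover each
element's base URI from the stream alone.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List, Tuple, Union


@dataclass
class StartElement:
    name: str


@dataclass
class EndElement:
    name: str


@dataclass
class EnterBase:
    """Hypothetical extra event: content that follows has this base URI."""
    uri: str


@dataclass
class LeaveBase:
    """Hypothetical extra event: restore the previous base URI."""


Event = Union[StartElement, EndElement, EnterBase, LeaveBase]


def element_base_uris(events: Iterable[Event]) -> Iterator[Tuple[str, str]]:
    """Yield (element name, base URI) for every start-element event."""
    stack: List[str] = []
    for ev in events:
        if isinstance(ev, EnterBase):
            stack.append(ev.uri)
        elif isinstance(ev, LeaveBase):
            stack.pop()
        elif isinstance(ev, StartElement):
            # An empty string stands for "no base URI known".
            yield ev.name, (stack[-1] if stack else "")


# A document whose <para> came from an expanded external entity.
events = [
    EnterBase("http://example.com/main.xml"),
    StartElement("doc"),
    EnterBase("http://example.com/parts/sub.xml"),
    StartElement("para"),
    EndElement("para"),
    LeaveBase(),
    EndElement("doc"),
    LeaveBase(),
]

bases = list(element_base_uris(events))
```

Because the base URI travels in the stream itself, it survives entity
expansion without requiring unexpanded entity references to be passed
between steps.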

> After entity expansion and then "serialization" of some sort (or using
> an API that loses track of where the expanded entities came from) it
> can't add the xml:base attributes anymore to anything except the root
> node.

Yes. An implementation that serializes between steps is going to have
to add xml:base attributes to keep track of base URIs (or serialize
intermediate results in some totally non-standard,
implementation-dependent way).
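
As a rough illustration of the xml:base approach, the sketch below uses
Python's xml.etree.ElementTree. It assumes the implementation has tracked
base URIs out of band in a dictionary keyed by element; that dictionary
(base_of), the add_xml_base helper, and the example.com URIs are
hypothetical, and only absolute URIs are handled. Before serializing, it
writes an xml:base attribute on each element whose base URI differs from
the one it would otherwise inherit.

```python
import xml.etree.ElementTree as ET

# Clark-notation name of the xml:base attribute.
XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"


def add_xml_base(elem, base_of, inherited=None):
    """Record base URIs as xml:base attributes, only where they change."""
    base = base_of.get(elem, inherited)
    if base is not None and base != inherited:
        elem.set(XML_BASE, base)
    for child in elem:
        add_xml_base(child, base_of, base)


doc = ET.Element("doc")
para = ET.SubElement(doc, "para")  # imagine this came from an expanded entity
note = ET.SubElement(doc, "note")  # same base URI as its parent

# Hypothetical out-of-band bookkeeping kept by the implementation.
base_of = {
    doc: "http://example.com/main.xml",
    para: "http://example.com/parts/sub.xml",
}

add_xml_base(doc, base_of)
serialized = ET.tostring(doc, encoding="unicode")

# After a serialize/parse round trip the base URIs are still recoverable.
reparsed = ET.fromstring(serialized)
```

Only para gets its own attribute below the root; note inherits the
document's base URI, so nothing needs to be written for it.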

                                        Be seeing you,
                                          norm

-- 
Norman Walsh <ndw@nwalsh.com> | The art of living is more like
http://nwalsh.com/            | wrestling than dancing.--Marcus Aurelius
Received on Saturday, 27 December 2008 15:43:10 UTC