Re: Entity Expansion from David A. Lee on 2008-12-27 (xproc-dev@w3.org from December 2008)

From: David A. Lee <dlee@calldei.com>
Date: Sat, 27 Dec 2008 11:40:47 -0500
To: "Norman Walsh" <ndw@nwalsh.com>, "XProc Dev" <xproc-dev@w3.org>
Message-ID: <C313CBE4DC02483DB6A8BFDDE4C114CD@calldei.com>

Thanks!
Quote NW:
---
Ah. I think you're mistaken there. The base URI infoset property
records the "URI used to retrieve the entity" per section 5.1 of
RFC 2396.
----

This was my misunderstanding.  Thanks.

Which leads to another question, to which I don't expect an answer (but
would love to have one)...
What technology exists which preserves all required infoset properties,
and could be used as the implementation of the "pipe" in XProc?
I've thought about StAX, which I believe Calabash uses (some of?), but I've
discovered, and you've commented, that it's not good enough ... (it loses
the base URIs ...)
I believe an *early intent* of the spec was that text serialization *could* 
be used by an implementation ... but I've yet to figure out a text 
serialization that preserves the base URIs without adding extra attributes
(which is in violation of the spec).   I suppose magic attributes could be 
added to the "internal serialized format" that are then stripped out by the 
processor between steps.
I did some research on "Binary XML", but I don't think the spec is quite
mature enough yet (I could be wrong) ...   Clearly a totally in-memory
structure (such as DOM or Saxon trees) could be made to work.  And there
are various proprietary things (I think some of Oracle's streaming APIs
might work ... but I haven't really dug into those).
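
A minimal sketch of the "magic attribute" idea mentioned above, in Python
with the standard xml.etree.ElementTree module.  The namespace URI, the
attribute name and the two function names are hypothetical, invented for
illustration only; nothing here comes from the XProc spec, Calabash or
xmlsh.  It records only the base URI of the document element; elements
whose base URI differs (for example, content pulled in from an external
entity) would need the same treatment individually.

```python
import xml.etree.ElementTree as ET

# Hypothetical private namespace and attribute name; not defined by XProc.
MAGIC_NS = "urn:example:pipe-internal"
MAGIC_BASE = "{%s}base-uri" % MAGIC_NS


def write_to_pipe(root, base_uri):
    """Serialize a tree, recording its base URI in a magic attribute."""
    root.set(MAGIC_BASE, base_uri)
    try:
        return ET.tostring(root, encoding="unicode")
    finally:
        # Leave the caller's tree as it was before serialization.
        del root.attrib[MAGIC_BASE]


def read_from_pipe(text):
    """Parse the internal format and strip the magic attribute again."""
    root = ET.fromstring(text)
    base_uri = root.attrib.pop(MAGIC_BASE, None)
    return root, base_uri
```

The next step would call read_from_pipe() and see a tree with no extra
attribute, with the base URI carried alongside it as a separate value.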

It probably comes down to "roll your own".  Other suggestions are very much 
welcome !
Right now in xmlsh I use text serialization as a stream, but I've never
really liked that and intend to change it (although I have to preserve it
as an option for other reasons: xmlsh streams can also be pure text, not
XML).


Finally, to your last point:

Quote NW:
---------------
If your implementation doesn't expand entities, I think you could
argue that you pass that test if the results you give are consistent
with unexpanded entities.
--------------

This would lead me to think that perhaps a more complex (yuck) test format
is necessary: one that allows for implementation-allowed variances.
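
As a rough sketch of what "allowing variances" could amount to, a test
might list several acceptable results and pass if the actual output
matches any one of them after canonicalization.  This is Python using the
standard library's C14N 2.0 support (Python 3.8 or later); the function
names are hypothetical and this is not the format used by the XProc test
suite.

```python
import xml.etree.ElementTree as ET


def canonical(xml_text):
    """Canonicalize XML text (C14N 2.0) so trivial differences do not matter."""
    return ET.canonicalize(xml_text)


def passes(actual, allowed_results):
    """Return True if the actual output matches any allowed expected result."""
    got = canonical(actual)
    return any(got == canonical(expected) for expected in allowed_results)
```

A test for entity expansion could then carry one expected result for
processors that expand entities and another for those that do not.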

I'm not going to push hard for this one case, but I suspect more will arise
where test cases are coded with implicit assumptions that in fact are not
the only allowed result.  Ultimately this will have to be addressed, as
there needs to be an "objective" determination of what "passes a test".
I could provide my own test suite with my own outputs that I claim are
"passes", but who knows ... who decides if my interpretation of "pass" is
actually correct?

-David
-----------------------------------------------------------
David A. Lee
dlee@calldei.com
http://www.calldei.com
http://www.xmlsh.org

Received on Saturday, 27 December 2008 16:41:37 UTC