Re: [xml-dev] Serialization of XDM - Use cases / Proposal from David A. Lee on 2009-09-29 (xproc-dev@w3.org from September 2009)

From: David A. Lee <dlee@calldei.com>
Date: Tue, 29 Sep 2009 08:16:45 -0400
To: Michael Kay <mike@saxonica.com>
CC: "'Philippe Poulard'" <philippe.poulard@sophia.inria.fr>, "'Kurt Cagle'" <kurt.cagle@gmail.com>, rjelliffe@allette.com.au, xml-dev@lists.xml.org, "'XProc Dev'" <xproc-dev@w3.org>
Message-ID: <4AC1FAAD.808@calldei.com>
I'm still working on this, never fear!
Michael's suggestion is so compelling that I had to do some more
research.   I still need to do some more :(
I'm going to start writing this up so I have a strawman to poke at
(and hopefully not set fire to entirely).

A couple of suggestions/questions to get me going.
What is being proposed (by Michael, and I'm going along for the ride) is
to take a very small subset of XSLT and then augment it
to provide the core elements which are used to wrap the XDM result.
As a result it *won't actually be XSLT*, but a new schema conceptually
copied from the bits we want to reuse.

The first question I have in mind is how we parse this.  This one
example of Michael's has me a little confused:

    <xsl:sequence select="xs:positiveInteger('5')"/>

This is the proposal for how to represent a typed atomic value.  It
is pretty obscure to my novice eyes.  Reading it, I wouldn't guess
offhand that it means "Atomic value, type xs:positiveInteger, text value
'5'".

But if that's how you construct typed atomic values in XSLT I could live
with it.  Which brings me back to the aforementioned problem:
this isn't actually XSLT, it's a schema borrowed from XSLT,
particularly if we have to not just take a subset but also augment it, as per
    "There might be a need to define some additional attributes specific 
to the serialization format, e.g. to represent IDness."



So how do we parse it?
Certainly a dedicated XML parser could do the trick, but then it would have
to be pretty sophisticated to be able to parse the above syntax.
Alternatively, maybe this schema can be transformed (via XSLT? XQuery?
pure Java?) into an actual fully fledged XSLT file, and then *that* is run.
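
For what it's worth, if the select attribute really is restricted to "a
constructor function with a string literal argument", the dedicated-parser
route may be less work than it first looks. Below is a rough Python sketch
under that assumption. The function name parse_constructor is made up for
illustration, the QName pattern is only an approximation of the real XML
name rules, and nothing here checks that the prefix is bound or that the
value is valid for the type.

```python
import re

# Approximate QName: prefix:local (not the full XML NCName grammar).
QNAME = r"([A-Za-z_][\w.-]*):([A-Za-z_][\w.-]*)"
# XPath 2.0 string literals: a doubled quote stands for one quote character.
SQ = r"'((?:[^']|'')*)'"
DQ = r'"((?:[^"]|"")*)"'
PATTERN = re.compile(rf"\s*{QNAME}\s*\(\s*(?:{SQ}|{DQ})\s*\)\s*")


def parse_constructor(select):
    """Split e.g. xs:positiveInteger('5') into (prefix, local name, value).

    Hypothetical helper: accepts only a single constructor call with one
    string literal argument and rejects every other expression.
    """
    m = PATTERN.fullmatch(select)
    if m is None:
        raise ValueError(f"not a simple constructor call: {select!r}")
    prefix, local, sq, dq = m.groups()
    if sq is not None:
        value = sq.replace("''", "'")   # undo the doubled-apostrophe escape
    else:
        value = dq.replace('""', '"')   # undo the doubled-quote escape
    return prefix, local, value
```

Anything outside that one shape (arithmetic, path expressions, nested
calls) is simply rejected, which is the point of restricting the subset.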

That then leads me to the final question.  Suppose we transform this
serialized "almost an XSLT" format into "real XSLT" format, then
run a real XSLT 2.0 processor on it.  How do we get the resulting values out?

Please bear with me as I'm very much a novice at XSLT ... maybe the
answer is "obvious".
XSLT 2.0 claims that the result of an XSLT transformation can be a 'set
of result trees'.
That's an XDM sequence (???).
So far so good, I think.

But how, in reality, do I get these out of a real XSLT processor?  I'm
looking at my favorite XSLT processor, Saxon, and
I can't see any methods to get more than 1 tree out.  Unlike the XQuery
methods, which can produce Xdm objects (including sequences),
the XSLT classes all have only 1 method, "transform", which presumes a
single Destination object.
So is there a way to re-animate such a format using XSLT?
If so, what would it be?
If not, then I suggest that a syntax which is XSLT-based has lesser
value if you can't actually use XSLT on it.  It has value from the
'shared understanding' perspective, and hence maybe some of the
suggestions make sense if they are easily parsed, but the one quoted at
the top seems unnecessarily complicated to parse if you can't use XSLT to
do it.
As opposed to something like, say,

    <xdm:atomic type="xs:positiveInteger" value="5"/>

Which doesn't require parsing an arbitrary expression inside a select= 
attribute.
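
To show what I mean by easily parsed, here is a rough Python sketch that
reads that attribute-based form with nothing but a stock XML parser. The
namespace URI urn:example:xdm, the xdm:sequence wrapper element and the
function name read_atomics are all made up for illustration; none of them
has been proposed by anyone. The type is left as the raw lexical QName and
no validation of the value against the type is attempted.

```python
import xml.etree.ElementTree as ET

XDM_NS = "urn:example:xdm"  # hypothetical namespace URI, for illustration only


def read_atomics(xml_text):
    """Return (type, value) pairs for every xdm:atomic element, in document order."""
    root = ET.fromstring(xml_text)
    items = []
    # iter() also visits the root itself, so a bare <xdm:atomic/> document works.
    for el in root.iter(f"{{{XDM_NS}}}atomic"):
        items.append((el.get("type"), el.get("value")))
    return items


# Hypothetical wrapper element holding a two-item sequence.
doc = """<xdm:sequence xmlns:xdm="urn:example:xdm"
                       xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xdm:atomic type="xs:positiveInteger" value="5"/>
  <xdm:atomic type="xs:string" value="hello"/>
</xdm:sequence>"""

atomics = read_atomics(doc)
```

No expression grammar is involved at all: the type and the value arrive as
two separate attribute strings.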






David A. Lee
dlee@calldei.com  
http://www.calldei.com
http://www.xmlsh.org
812-482-5224



Michael Kay wrote:
> Some people have been suggesting using a subset of XQuery syntax, 
> others have been saying it would be better to use XML syntax.
>  
> It occurs to me one might achieve both objectives at the same time by 
> using a subset of XSLT syntax. That is, we could define the syntax to 
> be a named xsl:template instruction containing a sequence constructor 
> in which only the following are permitted:
>  
> (a) An <xsl:sequence> instruction whose select attribute contains a 
> constructor function with a string literal argument, for example 
> <xsl:sequence select="xs:positiveInteger('5')"/>
>  
> (b) An empty <xsl:attribute>, <xsl:value-of>, <xsl:comment>, 
> <xsl:processing-instruction> or <xsl:namespace> instruction whose 
> content is constrained to use no non-literal expressions or AVTs.
>  
> (c) An <xsl:element> or <xsl:document> instruction whose content is 
> constrained to hold only <xsl:element>, <xsl:attribute>, 
> <xsl:value-of>, <xsl:comment>, <xsl:processing-instruction> or 
> <xsl:namespace> instructions that themselves follow the same rules.
>  
> There might be a need to define some additional attributes specific to 
> the serialization format, e.g. to represent IDness.
>  
>
> Regards,
>
> Michael Kay
> http://www.saxonica.com/
> http://twitter.com/michaelhkay
>
>
>     ------------------------------------------------------------------------
>     *From:* Michael Kay [mailto:mike@saxonica.com]
>     *Sent:* 21 September 2009 14:45
>     *To:* 'David A. Lee'
>     *Cc:* 'Philippe Poulard'; 'Kurt Cagle'; rjelliffe@allette.com.au;
>     xml-dev@lists.xml.org; 'XProc Dev'
>     *Subject:* RE: [xml-dev] Serialization of XDM - Use cases / Proposal
>
>
>      
>
>         Ouch.  If this cant be done in xquery syntax then my goal of
>         de-serializing an XML representation using a XQuery example
>         implementation is out the door.
>
>         Here's my best shot ...
>
>
>         attribute
>            { fn:QName( "U" , "P:N" ) }
>            { my:IdType( "S" ) }    (: wont work will it :( :)
>
>
>         Ok I admit I'm totally stumped.  *IS* there a way to
>         re-animate this example using XQuery (or XSLT?)  ?
>         I have a feeling that my goal of providing a reference
>         implementation in XQuery will be impossible.  Not even sure
>         how to get element type information re-animated.
>
>
>          
>         I think that in XSLT, the following comes close:
>          
>         <xsl:attribute name="P:N" namespace="U" type="my:IdType"
>         select="'S'"/>
>          
>         provided that the recipient has a schema (the correct schema)
>         for the global attribute declaration my:IdType. There are
>         problems if the type is anonymous (you might have to construct
>         a variant of the original schema in which all types have
>         names). As for the isID property, it is ALMOST redundant in
>         XDM: it can in nearly all cases be inferred from the type
>         annotation. The exception is where IDness was established as a
>         result of DTD validation rather than schema validation. In
>         that case, yes, I think you're going to have difficulty
>         reconstituting the original sequence using tools written in
>         XSLT or XQuery. (Actually, it hadn't occurred to me this was
>         one of your goals.)
>          
>         XQuery 1.0 (unlike XSLT 2.0) doesn't allow validation against
>         a type name, and doesn't allow validation of individual
>         attributes. 
>          
>         Other limitations of using XSLT/XQuery
>          
>         (a) neither language gives you any way of creating unparsed
>         entities
>          
>         (b) XQuery 1.0 gives you no way of creating arbitrary
>         namespace nodes
>          
>
>         Regards,
>
>         Michael Kay
>         http://www.saxonica.com/
>         http://twitter.com/michaelhkay
>
>          
>
Received on Tuesday, 29 September 2009 12:17:45 UTC