RE: How are correct, unambiguous results possible with implementation-defined XML pre-processing?

From: Harry Halpin <hhalpin@ibiblio.org>
Date: Fri, 25 May 2007 20:14:12 -0400 (EDT)
To: "Booth, David (HP Software - Boston)" <dbooth@hp.com>
Cc: Murray Maloney <murray@muzmo.com>, public-grddl-wg@w3.org
Message-ID: <Pine.LNX.4.64.0705251936260.10977@tribal.metalab.unc.edu>

As chair, I am going to make a note here about the relationship of our WG 
to others relating to this issue.
On Fri, 25 May 2007, Booth, David (HP Software - Boston) wrote:

  >>
>> Secondly, we expect that early transformations will be
>> written using XSLT 1 & 2.
>> So, we cannot require transformations to perform XInclude or
>> validation.
>
> But the spec could provide a way for a GRDDL transformation to specify
> what pre-processing should occur prior to invoking the XSLT script.

The problem is that the notion of preprocessing is underdefined for XML
parsers in general. Can someone point me to a document that specifies
exactly what finite number of steps must be taken to preprocess an XML
document so that one can apply XPath to get a node? (Here questions come
up about how one gets from bytes on the wire to a data model.) It seems
that, as the XML spec stack has grown, there is no normative way to
determine this, and so the question is much more complex than just the
interaction of XIncludes (for example, what about DTD or Schema
validation, and their interaction with XIncludes?). Hence our reliance on
the XML Processing Model WG; we have also, before Last Call, asked the
XQuery and XSL WGs for advice.
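
To make the ambiguity concrete, here is a minimal sketch in Python
(standard library only). It is not taken from the GRDDL spec or its test
suite; the document, the fragment, and the in-memory loader are all
hypothetical. The same path expression selects different nodes depending
on whether XInclude expansion ran before the query:

```python
import copy
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

# Hypothetical source document containing an XInclude reference.
SOURCE = """<doc xmlns:xi="http://www.w3.org/2001/XInclude">
  <title>Local title</title>
  <xi:include href="fragment.xml"/>
</doc>"""

# Hypothetical content of the included resource.
FRAGMENT = "<title>Included title</title>"


def loader(href, parse, encoding=None):
    """In-memory stand-in for fetching the included resource (assumption)."""
    if href == "fragment.xml" and parse == "xml":
        return ET.fromstring(FRAGMENT)
    raise OSError("unknown resource: " + href)


def titles(root):
    """Return the text of every <title> child of the root element."""
    return [t.text for t in root.findall("title")]


# Pipeline A: parse only, no XInclude processing.
raw = ET.fromstring(SOURCE)

# Pipeline B: parse, then expand XIncludes in place on a copy.
expanded = copy.deepcopy(raw)
ElementInclude.include(expanded, loader=loader)

print(titles(raw))       # one title: the include element is left untouched
print(titles(expanded))  # two titles: the fragment has been spliced in
```

A transformation that sees only the second tree has no way to tell that
an include was ever there.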

So, while I heavily sympathize with David's concern, it seems that the
problem of defining preprocessing for an XML parser in general belongs
in the domain of the TAG or the XML Processing Model WG, not the GRDDL WG
per se.

Instead of remaining silent on the issue, Murray wrote a warning bringing
this up and encouraging authors of GRDDL transformations to take it into
account.

In other words, any alternative (and again, exact text would be great)
would require defining exactly what one means when one says "The GRDDL
Agent should not perform any preprocessing". To me this statement is also
underdefined, as one has to get to an XPath node somehow, and those steps
are underdefined and in practice can vary.
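
As a small illustration of why "no preprocessing" is hard to pin down,
the sketch below (Python standard library; the document is a made-up
example) shows that a plain parse, with no XInclude or validation step
requested, has already expanded an entity declared in the internal DTD
subset before any node can be selected:

```python
import xml.etree.ElementTree as ET

# Hypothetical document with an entity declared in its internal DTD subset.
SOURCE = """<!DOCTYPE doc [
  <!ENTITY who "World">
]>
<doc>Hello &who;</doc>"""

# A plain parse: no validation and no XInclude step is requested here.
root = ET.fromstring(SOURCE)

# The entity reference is already replaced in the resulting tree.
text = root.text
print(text)
```

So even the most minimal route from bytes to a node involves some
processing of the DTD.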

>> step in an XProc transformation could be 'delete all xincludes'.
>> So, you can be quite explicit about the policy that you want
>> to implement in
>> an XProc XML Pipeline transformation.
>>
>> However, if the expansion has already happened -- because,
>> for example, local
>> policy requires expansion of all xincludes as documents go
>> through a local proxy, then you are out of luck.
>
> Right.  So regarding the following advice in sec 6:
> http://www.w3.org/TR/grddl/#txforms
> [[
> Therefore, it is suggested that GRDDL transformations be written so that
> they perform all expected pre-processing, including processing of
> related DTDs, Schemas and namespaces.
> ]]
> it sounds like you would agree with my conclusion that this advice is
> untenable in this case, because it is not possible to write a transform
> that reliably prevents xi:include from being processed.

I am again going to point out that this is a problem with XML not having
its preprocessing (or processing model in general) defined, and so this
problem is not unique to GRDDL. However, XInclude is a special case of
"preprocessing, including processing of related DTDs, Schemas, and
namespaces." And if one explicitly forbids XInclude expansion, is one also
forbidding DTD or Schema validation, or other forms of preprocessing? The
question is tricky, and the GRDDL WG has so far taken a conservative
route, but one that is indeed coherent.


...>
> Thanks,
> David Booth
>
>
>

-- 
 				--harry

 	Harry Halpin
 	Informatics, University of Edinburgh
         http://www.ibiblio.org/hhalpin
Received on Saturday, 26 May 2007 00:14:17 UTC