RE: issue-dbooth-3: Ambiguity in an XML document's intended GRDDL results from Booth, David (HP Software - Boston) on 2007-05-30 (public-grddl-comments@w3.org from April to June 2007)

From: Booth, David (HP Software - Boston) <dbooth@hp.com>
Date: Wed, 30 May 2007 02:24:54 -0400
To: "Harry Halpin" <hhalpin@ibiblio.org>
Cc: <public-grddl-comments@w3.org>, "Jeremy Carroll" <jjc@hpl.hp.com>, "McBride, Brian" <brian.mcbride@hp.com>
Message-ID: <EBBD956B8A9002479B0C9CE9FE14A6C202B5AEB3@tayexc19.americas.cpqcorp.net>
Harry,

Thanks for your time in addressing this.  Detailed answers below.

> From: Harry Halpin [mailto:hhalpin@ibiblio.org] 
> 
> I sympathize with the general line of comments, but do not 
> see how GRDDL 
> can remain WebArch compliant and not specify its own XML 
> processing model.

Okay, then I'll suggest some ways it can be done.  First, a key point is
that the GRDDL spec cannot mandate a particular XML processing model
(except an empty XML processing model), because the XML processing model
needs to vary depending on the given XML instance document.  Therefore,
the desired XML processing model *must* somehow be indicated by the XML
instance document, either via a GRDDL annotation or some other
annotation.  Off the top of my head, here are three WebArch compliant
ways it can be done.

1. Each GRDDL transformation could be required to specify an additional
URI that indicates the required XML processing sequence for that
document.   For example, http://example.org/procseq/noop might mean that
*no* processing beyond parsing should occur.
http://example.org/procseq/xv might mean that schema validation should
be performed, followed by XInclude expansion.  Anyone could mint their
own URI and specify whatever processing sequence (preferably in an
English prose or other document that can be found by dereferencing that
URI).  

Pro: This approach is completely extensible: any XML processing sequence
could be specified.

Con: Without any other spec like XProc specifying how the desired XML
processing sequence should be expressed, the semantics of each such URI
would have to be built in to the GRDDL processor that wishes to use that
document type.  But since some processing sequences are likely to be
usable for a number of document types, URIs for these XML processing
sequences may become well known and commonly implemented by GRDDL
processors.  Furthermore, once XProc is finished, the desired processing
sequence could be written in the XProc language, and thus it could
become machine processable.
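
To illustrate option 1, here is a rough sketch of how a GRDDL processor
might carry built-in knowledge of a few processing-sequence URIs.  This is
only an illustration under assumptions: the two URIs are the example.org
ones above, while the step names, the registry and the function names are
hypothetical and are not taken from any GRDDL or XProc draft.

```python
# Hypothetical sketch of option 1: a processor with built-in semantics for
# a few processing-sequence URIs. Step names and registry are assumptions.

NOOP = "http://example.org/procseq/noop"
XV = "http://example.org/procseq/xv"

# Each known URI maps to the ordered XML processing steps it stands for.
KNOWN_SEQUENCES = {
    NOOP: [],                             # parsing only, nothing further
    XV: ["schema-validate", "xinclude"],  # validate, then expand XInclude
}


class UnknownProcessingSequence(Exception):
    """Raised when the processor has no built-in semantics for a URI."""


def steps_for(procseq_uri):
    """Return the ordered processing steps for a processing-sequence URI."""
    try:
        return list(KNOWN_SEQUENCES[procseq_uri])
    except KeyError:
        raise UnknownProcessingSequence(procseq_uri)


def preprocess(tree, procseq_uri, step_impls):
    """Apply each named step, in order, to an already parsed tree.

    step_impls maps a step name to a function that takes a tree and
    returns the processed tree.
    """
    for name in steps_for(procseq_uri):
        tree = step_impls[name](tree)
    return tree
```

A processor that meets a URI it does not know raises an error rather than
guessing at the preprocessing, which is the whole point of naming the
sequence explicitly.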

2. A GRDDL transformation could be required to explicitly perform any
XML preprocessing it needs.  I.e., the initial Node tree it would
receive would be the node tree resulting from bare parsing only -- no
schema validation, no XInclude processing, etc.  

Pro: When XProc is released, GRDDL transformations could very naturally
be defined in the XProc language.  As a final step, an XProc sequence
would presumably invoke an XSLT or other script.

Con: This may be tedious to do until XProc is completed.  However, it is
possible that idioms could be developed to ease the pain.

3. As a forward reference, the GRDDL spec could explicitly state a
dependency on XProc.  If an XML instance document indicates an XProc XML
processing sequence that is applicable to its root node, then that must
be used prior to applying the GRDDL transformations.  If the XML
instance document does not indicate any XProc XML processing sequence,
then as a fallback the GRDDL spec might require either mechanism #1 or
#2 described above to take effect.

Pro: 

> 
>   On Tue, 29 May 2007, Booth, David (HP Software - Boston) wrote:
> 
> [snip]
> >
> >
> > Definition: By "XML instance document" I am referring to a concrete
> > "representation" in the TAG WebArch sense -- not an "information
> > resource".
> 
> Have you seen our test case document [1]? Again, many of 
> these issues are 
> dealt with explicitly in the test case document.

Yes, I have seen the test case document, and yes, some of these issues
are nicely illustrated in test cases.  But this does not solve the
problem at all.  Documenting a bug does not fix it. 

> 
> In particular, see the following section
> 
> > POINT 1: For any XML instance document, to the extent possible, the
> > GRDDL spec should make it clear exactly what are the intended GRDDL
> > results for that XML instance document.   Two 
> > implementations faithfully
> > implementing the GRDDL spec should come to the same 
> > conclusions about
> > what those intended GRDDL results should be, i.e., there 
> > should be no ambiguity.
> >
> > I do not think the GRDDL specification should be considered finished
> > until the spec makes this clear, given that:
> > - GRDDL is the cornerstone for bridging the worlds of XML and RDF.
> > - A key purpose in expressing semantics in RDF is to make them
> > *unambiguous*.
> > - GRDDL is on track to become a W3C Recommendation.
> > - GRDDL may have quite a long life.  Both XML and RDF have 
> > been around
> > for several years with little change, and show no signs of being
> > replaced.  I see no reason why GRDDL should not have a 
> > similar lifespan.
> 
> I agree. But XML has remained around with preprocessing 
> indeterminacy for 
> quite a long time and has been useful, and XSLT is Turing 
> complete and not 
> deterministic, yet has also proven to be useful and have a long life.

Yes, but they are not expressing semantics in RDF.  If you are arguing
that ambiguity in the intended GRDDL results is okay, I vehemently
disagree.

> 
> > POINT 2: At present, it is not clear what is the view of the Working
> > Group (WG) toward ambiguity in an XML document's intended 
> > GRDDL results,
> > i.e., whether the WG believes:
> >
> >  a. it is a problem, but we do not know a solution;
> >  b. it is a problem now, but we expect the problem to go away
> >     when the XProc or some other spec is completed; or
> >  c. the WG does not consider it a problem.
> >
> > I would vehemently object to position c, for the reasons 
> > above.  In the
> > case of position a, I believe there *are* ways to reduce or 
> > eliminate
> > such unintended ambiguity, and I will be happy to suggest 
> > ways to do so.
> > In the case of position b, I think it is important that the WG make
> > clear exactly *how* XProc or some other spec is intended to make the
> > problem go away, and indicate that in the spec.  At 
> > present, the spec
> > explicitly allows the intended results to be implementation defined,
> > which IMO is unacceptable for a spec of this kind.
> 
> The spec is not ambiguous, and neither are the test cases. 
> However, they are not 
> deterministic across implementations in precisely the cases 
> you describe.

There are a few spots where the spec is a little unclear, but I mostly
agree.  I think the WG has done an admirable job of making the spec
itself quite unambiguous, and I very much like the way the normative
rules are called out and written formally.  However, the non-determinism
that the spec explicitly permits is the big problem.   I can see that
many xhtml-oriented applications would be fine with the intended GRDDL
results for an instance document being ambiguous, but for pure XML
applications where GRDDL is used to expose the entire semantics of the
XML instance document -- essentially treating the XML as a custom
serialization of RDF -- ambiguity in the intended GRDDL results is
absolutely not okay.  

For comparison, can you imagine anyone proposing a serialization of RDF
that permits the intended RDF graph for a particular instance document to
be implementation-defined depending on parsing choices?  Of course not.
They'd be laughed out of the room.  But that is *exactly* what we get if
the GRDDL spec permits the intended GRDDL results to be
non-deterministic. 

> I also see you have not responded to my previous email 
> regarding the lack of determinism built into XML [2].

Yes, I just hadn't caught up to it yet.  :)

> > 
> > POINT 3: The spec needs to define a notion of "complete 
> > GRDDL results"
> > for a given XML instance document.  It is good that the 
> > specification
> > describes how partial GRDDL results can be determined, 
> > because partial
> > results may be adequate for many applications.  But the 
> > spec also needs
> > to clearly define what  constitutes the *complete* GRDDL results
> > indicated by a given XML instance document, i.e., all and only the
> > intended GRDDL results for all GRDDL transformations 
> > indicated by that XML instance document.
> >
> > This is particularly important in supporting applications 
> > in which GRDDL
> > is used to express the *entire* semantics of an XML 
> > instance document,
> > such as a messaging application as described in issue-dbooth-9a,
> >  http://lists.w3.org/Archives/Public/public-grddl-comments/2007AprJun/0069.html
> > i.e., where custom XML document types are created or 
> > treated as custom
> > serializations of RDF, as described in
> > http://dbooth.org/2007/rdf-and-soa/rdf-and-soa-paper.htm .
> > One must be able to say with clarity: "For this XML 
> > instance document,
> > the complete GRDDL results are intended to be precisely the 
> > following RDF triples -- no more and no less."
> 
> Given the fact that GRDDL is a client-side process that may rely upon 
> accessing namespace or profile documents, it seems that if 
> the author of 
> an XML document wants to exchange exact and complete RDF 
> representations 
> of the same resource, should they not simply use content negotiation 
> to serve a representation as RDF to begin with?

First, a custom XML serialization of RDF *is* RDF just as much as N3 is
RDF.  Second, there may not be any content negotiation involved.  See the
pipelining example in issue-dbooth-9a:
http://lists.w3.org/Archives/Public/public-grddl-comments/2007AprJun/0069.html
Finally (and more to your question), one reason the information may be
in a custom XML format is that the producer may be a legacy application
that only knows how to produce that format.  Another reason may be that
some consumers want to process the information in the XML format.  See
my paper on RDF and SOA for further explanation of this:
http://dbooth.org/2007/rdf-and-soa/rdf-and-soa-paper.htm

> 
> > (Note that the spec currently defines GRDDL results in relation to
> > information resources rather than XML instance documents (i.e.,
> > representations), and this is needed for namespace and 
> > profile URIs, but
> > it is not sufficient.  GRDDL results *also* need to be 
> > defined in terms
> > of XML instance documents (i.e., representations), because 
> > as pointed out in issue-dbooth-9a,
> > 
> > http://lists.w3.org/Archives/Public/public-grddl-comments/2007AprJun/0069.html ,
> > it *always* makes sense to talk about the GRDDL 
> > results of an
> > XML instance document, but it does *not* always make sense 
> > to talk about
> > the GRDDL results of an information resource.)
> 
> Again, see the test-cases [3]. It does make sense to talk 
> about the GRDDL 
> results of an information resource, as it may just be the 
> merge of GRDDL 
> results done for each representation the information resource serves.

No, as pointed out in issue-dbooth-9a
http://lists.w3.org/Archives/Public/public-grddl-comments/2007AprJun/0069.html
it only makes sense to talk about the GRDDL results of *some*
information resources -- not all.  In particular, it only makes sense
when they are static information resources: 
[[
For example, suppose an information resource, ir,  produces a different
representation each time it is queried -- the current weather in Oaxaca,
for example -- and I have two XSLT scripts that I use to glean RDF from
them: one extracts the temperature (getTemperature.xsl) and the other
extracts the humidity (getHumidity.xsl).  The final RDF should be the
combined result of applying getTemperature.xsl and getHumidity.xsl to
the *same* representation.  But the spec does not define merged GRDDL
results for a particular representation, it only defines them for an
information resource as a whole, which could have a jumble of
temperatures and humidities from different days.
]]

However, it *always* makes sense to talk about the GRDDL results of a
representation.  
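
To make the weather example concrete, here is a small sketch.  Everything
in it is a stand-in: the readings are made up, dereference() simulates an
information resource that changes on every request, and the two Python
functions merely play the role of getTemperature.xsl and getHumidity.xsl.

```python
import itertools

# Hypothetical dynamic information resource: every dereference yields a
# new representation (modelled here as a dict of made-up readings).
_readings = itertools.cycle([
    {"day": "Mon", "temperature": 31, "humidity": 40},
    {"day": "Tue", "temperature": 24, "humidity": 85},
])


def dereference():
    """Return the next representation served by the resource."""
    return next(_readings)


# Stand-ins for getTemperature.xsl and getHumidity.xsl.
def get_temperature(rep):
    return {("oaxaca", "temperature", rep["temperature"])}


def get_humidity(rep):
    return {("oaxaca", "humidity", rep["humidity"])}


def results_for_representation(rep):
    """Merge the output of both transformations for ONE representation."""
    return get_temperature(rep) | get_humidity(rep)


# Defined per representation: both transformations see the same reading.
same_rep = results_for_representation(dereference())

# Defined per information resource: each transformation may be applied to
# a different representation, so the merged graph mixes readings.
mixed = get_temperature(dereference()) | get_humidity(dereference())
```

Here same_rep pairs Monday's temperature with Monday's humidity, while
mixed pairs Tuesday's temperature with Monday's humidity, which is the
jumble described above.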

> 
> > Tellingly, I notice that the WG has routinely been using an implicit
> > concept of the complete GRDDL results (though not using 
> > this term) when
> > discussing and comparing test results, for example when two 
> > testers talk
> > about whether they got "the same" results for a particular 
> > test case.
> 
> Except in the test cases for multiple representations and multiple 
> infosets, which have been explicitly described and discussed 
> by the WG.
> The spec is not ambiguous about what is acceptable, and 
> neither are the testcases. 

Yes, the spec is quite clear.  

> The spec simply says _multiple results_ may be 
> acceptable and are compatible with WebArch. 

Except in the case of namespace and profile documents, I am not
concerned about GRDDL results non-determinism that is caused by content
negotiation, because that is the document publisher's responsibility.  I
am mainly concerned about non-determinism that is due to
implementation-defined parsing behavior. 

> This may be unfortunate for some usecases, in 
> which case these usecases should not rely on the Web.

There is no need to throw the baby out with the bath.  If GRDDL results
are defined in terms of *representations* instead of only being defined
in terms of *information resources*, then one could sensibly talk with
clarity about the GRDDL results of a particular representation.  It
*always* makes sense to talk about the GRDDL results of a particular
representation; it only makes sense to talk about the GRDDL results of
an information resource if it is a *static* information resource.

> 
>  I cannot honestly see how, given the indeterminacy of the XML 
> core specs regarding preprocessing and WebArch  content 
> negotiation (and 
> furthermore, that XSLT is Turing-complete and so  authors 
> could perversely 
> include random number generation [4], and so may  other programming 
> languages used by GRDDL transforms) 

I'm not concerned about that kind of indeterminacy, because that kind is
intentional.  I am concerned about unintended indeterminacy in the GRDDL
results.

> how we can mandate  all GRDDL transforms must be 
> complete without making GRDDL incompatible  with WebArch by 
> banning the 
> use of URIs and without GRDDL making decisions  that are in 
> the domain of the W3C XML Activity.

GRDDL transforms should not be required to produce "complete GRDDL
results".  Rather, the spec should define the notion of "complete GRDDL
results" so that it is clear what the complete GRDDL results should be
*if* they can be determined.  Briefly, the complete GRDDL results of an
XML instance document can be determined if:

 - the GRDDL results of that XML instance document do not depend on any
namespaces or profiles; or

 - for each namespace or profile URI, that URI is dereferenceable to a
representation that has complete GRDDL results.

Furthermore, even if complete GRDDL results are defined for a particular
XML instance document, a particular GRDDL processor may only be able to
compute partial GRDDL results, due to security or network access
limitations.
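
The two conditions above amount to a recursive check, which can be
sketched as follows.  This is only a sketch under assumptions: the
"depends_on" field, the web dictionary standing in for dereferencing, and
the way cycles are skipped are all hypothetical modelling choices, not
anything the spec defines.

```python
# Hypothetical model: each representation lists the namespace and profile
# URIs that its GRDDL results depend on. The web dict maps a URI to the
# representation obtained by dereferencing it; a URI absent from the dict
# is treated as not dereferenceable.

def complete_results_determinable(rep, web, _seen=None):
    """Return True if the complete GRDDL results of rep can be determined."""
    seen = set() if _seen is None else _seen
    for uri in rep["depends_on"]:
        if uri in seen:
            # Already visited (assumption: do not loop on cyclic references).
            continue
        seen.add(uri)
        target = web.get(uri)
        if target is None:
            # Not dereferenceable, so at best partial results are possible.
            return False
        if not complete_results_determinable(target, web, seen):
            return False
    return True


WEB = {
    "http://example.org/ns/a": {"depends_on": ["http://example.org/profile/b"]},
    "http://example.org/profile/b": {"depends_on": []},
}

standalone = {"depends_on": []}
ok_doc = {"depends_on": ["http://example.org/ns/a"]}
broken_doc = {"depends_on": ["http://example.org/ns/missing"]}
```

With this data the check succeeds for standalone and ok_doc and fails for
broken_doc, for which a processor could still report partial results.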

> 
> > Furthermore, the algorithm given in sec 7 of 
> > the GRDDL spec http://www.w3.org/2004/01/rdxh/spec#sec_agt
> > describes most of the process needed to determine the complete GRDDL
> > results for a particular XML instance document, but:
> > - it does not define a conformance term for people to use;
> 
> The WG decided to only use conformance terms as regards 
> security. What 
> precise conformance term, with what precise definition, do 
> you want added?

"Complete GRDDL results", "partial GRDDL results" and "GRDDL processor",
all defined for an XML instance document rather than an information
resource.  I will be happy to help craft appropriate definitions, but it
is too late at night at the moment for me to start doing so, and I think
it is first important for the WG to understand the general idea, before
getting into the details of how to do it.  I've tried to give some
starting hints, but I realize these hints have been incomplete so far,
so I'll try to fill in as needed.

> 
> > - it is defined in terms of a URI as a starting point, 
> > which introduces
> > much more ambiguity than being defined in terms of an XML instance
> > document as the starting point;
> 
> If we do not define a URI as a starting point, what would 
> you have us
> use? 

The starting point should be a "representation" in the Web arch sense,
which is how I defined the term "XML instance document" in this
discussion.   

> It seems to be Webarch requires us to use URIs with 
> schemes such as 
> http and to cope with the possibility of conneg. 

The starting point only needs to be a URI in the case of a namespace or
profile URI, where GRDDL results need to be determined for it.  And that
case needs to be treated specially because a GRDDL processor needs to be
able to know that if it finds a representation for that URI dereference,
and that representation specifies a GRDDL transformation, then the GRDDL
results of *that* representation can be considered complete without
having to worry about the possible existence of some other
as-yet-undiscovered representation that may specify other GRDDL results.
This is why the additional sentence for the Faithful Renditions section
is needed.

> There is, however, 
> nothing preventing a client from retrieving a  particular 
> representation 
> and using the "file" scheme. However, to prevent GRDDL from 
> using http URIs would break WebArch.
> 
> > - it is intended for describing partial GRDDL results; and
> > - more needs to be nailed down to define the notion of 
> > complete GRDDL
> > results.
> 
> Does the text describing "maximal" results not satisfy you? 

No, because: (a) it is defined for information resources rather than
representations; and (b) it currently permits parsing indeterminacy.

> [1]. If so, 
> can you clarify exactly how one can both use URIs and be 
> Webarch enabled with 
> content negotiation and have "complete" GRDDL results? As 
> usual, text that 
> you believe can be added or test-cases are appreciated.

Yes.  I've tried to sketch it out a bit, but I'll be happy to explain
more as needed.  

> 
> > Namespace and profile information URIs make it much more 
> > difficult to
> > define the notion of complete GRDDL results, because there is no
> > guarantee that the GRDDL processor is able to retrieve the correct
> > namespace or profile representation that specifies all of 
> > the intended
> > grddl:namespaceTransformations or 
> > grddl:profileTransformations that the
> > author intended should be applied. However, this difficulty can be
> > overcome by adding something to the Faithful Renditions 
> > section to the effect that:
> >
> >  "By specifying a GRDDL namespace transformation or profile
> >  transformation in a representation of a namespace or profile
> >  information resource, the creator of that namespace or
> >  profile states that every other representation of that same
> >  information resource that also specifies a GRDDL namespace
> >  transformation or profile transformation is functionally
> >  equivalent."
> 
> Again, with conneg and XML indeterminacy this cannot be guaranteed.

The XML indeterminacy needs to be eliminated, period.  But aside from
that, I agree that with conneg the above assurance cannot be guaranteed,
just as it cannot be guaranteed that GRDDL results will in fact
represent actual semantics of the input document.  That's why the above
sentence needs to be added to the Faithful Renditions section.  The
point of the Faithful Renditions section is to explicitly license users
to make the assumptions that the Faithful Renditions section details.

> 
> > If desired, I can describe in more detail how this can be done.
> 
> If you can specify exactly what XML preprocessing entails, 
> please do, and 
> respond in detail to my message[2].
> 
> > This approach will work when namespace and profile documents have
> > representations available that define GRDDL 
> > transformations.  But many
> > XML instance documents will need to make use of namespaces 
> > or profile
> > documents that will not have such representations 
> > available, and since
> > the dependency for defining complete GRDDL results is 
> > recursive through
> > all namespace and profile documents, it seems likely that 
> > in many cases
> > this approach will be infeasible.  Therefore, the GRDDL 
> > spec should also
> > define a short-cut mechanism to allow an XML instance document to
> > specify, for example, a grddl:completeTransformation attribute whose
> > presence would indicate that namespace and profile 
> > documents do *not*
> > need to be processed in order to determine the complete 
> > GRDDL results.
> 
> Yet one can never guarantee the namespace doc or profile doc will be 
> there. 

Correct.  If any is not there, and it is needed, then the complete GRDDL
results cannot be determined, but *partial* GRDDL results could be
determined.

> It seems like if certain transforms are not wanted by 
> the author, they should not be specified. 

I am assuming that there may be other (non-GRDDL) reasons for specifying
namespace or profile URIs.

> The only way one could have complete GRDDL 
> results in this manner would be to guarantee the presence of 
> the complete 
> namespace and profile docs, which cannot be done. 

The complete GRDDL results would only be de

> How can you specify the 
> completeProfileTransformation will be accessible?

One cannot.  If it is not accessible, then the GRDDL processor cannot
determine the complete GRDDL results, but may be able to determine
partial GRDDL results.  

Oh, couple of key points I neglected to mention earlier: 

 - Partial GRDDL results are a subset of complete GRDDL results, but not
necessarily a proper subset.  

 - A GRDDL processor needs to be able to know whether or not it has
computed the complete GRDDL results.  I.e., it needs to be able to know
whether or not the results it has computed are certified as complete.

> 
> 
> > To cover xhtml document types that cannot contain
> > grddl:completeTransformation annotations directly, this 
> > approach *could*
> > also be extended by defining a grddl:completeProfileTransformation
> > property whose presence would have a similar effect of 
> > saying: "there is
> > no need to look at any other profile documents".  However 
> > it may be less
> > important to know the complete GRDDL results for xhtml 
> > documents than it
> > is for XML documents in general, so such an attribute may not be
> > necessary.
> >
> > POINT 4: The Faithful Rendition section is excellent for 
> > making clear
> > how the semantics of GRDDL results should be interpreted.  
> > However, I
> > will note that its intent is somewhat unclear, as it could 
> > mean either or both of:
> >
> > - The RDF results of a GRDDL transformation reflect 
> > real-life semantics
> > of the input XML instance document, however these semantics may be a
> > subset of the full semantics of that document.  (In 
> > essence, they are
> > whatever subset of the full semantics the GRDDL 
> > transformation author
> > has chosen to expose via GRDDL.)
> >
> > - GRDDL results for a given XML instance document may be ambiguous
> > (implementation defined), and it is the GRDDL 
> > transformation author's
> > responsibility to anticipate this ambiguity and ensure that 
> > the results
> > reflect real-life semantics of the input XML instance 
> > document anyway.
> 
> I believe it means both, and I cannot see how one can not include the 
> second interpretation without restricting the client, since 
> complete GRDDL 
> results may violate their local policy, and without making 
> unreasonable 
> assumptions about the accessibility of namespace or profile docs and 
> banning use of conneg, and so, many URIs.

If local policy prevents access to any documents that are needed to
determine the complete GRDDL results, then that GRDDL processor will
only be able to determine partial GRDDL results.

> 
> > I like the first interpretation, and I consider that as a 
> > feature of the
> > spec.  I do not like the second  -- and I view it as a bug 
> > in the spec
> > -- because it merely foists the ambiguity problem off to the GRDDL
> > transformation author, and as I point out below, AFAICT it 
> > is not even
> > *possible* for the GRDDL transformation author to always write
> > transformations that produce correct, unambiguous results.
> >
> > POINT 5: In discussing the Faithful Rendition assurance, Section 6
> > explicitly says: "Therefore, it is suggested that GRDDL 
> > transformations
> > be written so that they perform all expected pre-processing 
> > . . . .".
> > But if the GRDDL transformation requires a particular sequence of
> > pre-processing, or it requires there to be *no* pre-processing, then
> > AFAICT it is not possible for the transformation author to 
> > control this
> > if pre-processing is explicitly permitted to be arbitrarily 
> > chosen by
> > the implementation before the GRDDL transformation ever 
> > sees the input.
> >
> > For example, suppose my schema includes blocks of XML code 
> > from other
> > documents, and I define a <myns:quote> tag to prevent the embedded
> > chunks of XML from being interpreted, and suppose that one of those
> > embedded chunks uses xinclude:
> >
> > <myns:myDoc . . . >
> >   <myns:quote>
> >      <otherNs:whatever>
> >         <xi:include href="http://example.org/do-not-expand" />
> >      </otherNs:whatever>
> >   </myns:quote>
> > </myns:myDoc>
> >
> > When this document is GRDDL transformed, the entire chunk 
> > of XML inside
> > the <myns:quote> element is supposed to become the value of an RDF
> > property *verbatim*, without expanding the xi:include 
> > directive.  If the
> > XML parser is permitted to expand or not expand the 
> > xi:include directive
> > at its discretion, before the GRDDL transformation even 
> > sees it, then it
> > is not possible for the GRDDL transformation author to ensure that
> > correct results will be produced.
> 
> Again, then do not use XInclude in your source document if 
> this is your desire, 

I do not think it would be reasonable to limit the domain of GRDDL that
way.
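
To make the quoted <myns:quote> example concrete, here is a minimal sketch
using Python's xml.etree.ElementTree and xml.etree.ElementInclude.  The
namespace URIs chosen for myns and otherNs, the fake_loader function and
the included content are all hypothetical; the loader stands in for a real
network fetch.  The sketch only shows that the node tree handed to a
transformation differs depending on whether XInclude expansion ran first.

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

# The document from the example, with hypothetical namespace URIs.
DOC = """<myns:myDoc xmlns:myns="http://example.org/myns"
            xmlns:otherNs="http://example.org/otherNs"
            xmlns:xi="http://www.w3.org/2001/XInclude">
  <myns:quote>
    <otherNs:whatever>
      <xi:include href="http://example.org/do-not-expand"/>
    </otherNs:whatever>
  </myns:quote>
</myns:myDoc>"""


def fake_loader(href, parse, encoding=None):
    """Stand-in for fetching href; returns made-up included content."""
    return ET.fromstring("<included>fetched from %s</included>" % href)


def quoted_child_tags(root):
    """List the tags found inside otherNs:whatever under myns:quote."""
    whatever = root.find(".//{http://example.org/otherNs}whatever")
    return [child.tag for child in whatever]


# Processor A: bare parsing only.
bare = ET.fromstring(DOC)

# Processor B: expands XInclude before the transformation sees the tree.
expanded = ET.fromstring(DOC)
ElementInclude.include(expanded, loader=fake_loader)

bare_tags = quoted_child_tags(bare)
expanded_tags = quoted_child_tags(expanded)
```

In the bare tree the quoted chunk still contains the xi:include element; in
the expanded tree it has been replaced, so a transformation that copies the
chunk verbatim yields different property values under the two processors.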

> or host the RDF you desire via conneg or some other means.

That is not always possible.  See the pipelining example in
issue-dbooth-9a:


> 
> > Again, please let me know how I can be most helpful in 
> > resolving this
> > issue.
> 
> Again, by suggesting exact text and testcases. It seems to me 
> the best way 
> to address your concerns is to add a section of informative 
> text for the 
> Spec to the faithful infoset section or to the test-cases 
> that recommends 
> that in order for GRDDL authors to best guarantee a faithful 
> rendition within their ability:
> 
> 1) Minimize XML preprocessing by not having the source document use 
> XInclude or schema validation.

The GRDDL transformation author may not have control over the schema of
the source document.

> 2) Have only one representation of the information resource 
> given by the 
> URI be available, and so not use content negotiation.

I am only really concerned about this in the case of namespace and
profile documents (as I explained) and to the extent that it muddies the
definition of "GRDDL results", because they are defined for information
resources instead of representations.

> 3) Restrict GRDDL transformations to deterministic finite 
> state automata. 

I'm not concerned about intentional indeterminism.

> 4) If an author wishes to guarantee that a XML document is 
> reflected by 
> some particular RDF document, that the author not use GRDDL 
> but serve RDF 
> directly and specify that using rel="alternate" in XHTML to 
> link to a RDF 
> document in the representation or serve it via  content 
> negotiation in 
> terms of XML documents with URIs (Are there other ways for an 
> XML document to directly link to an RDF document?)

I'm mostly concerned about plain XML documents -- not xhtml.

> 
> Would this satisfy this comment? If not, please specify what 
> would satisfy 
> your comment, if possible without breaking WebArch by 
> disallowing conneg 
> and without forcing the GRDDL WG to develop its own XML 
> processing model.

No it would not.  I'll p

> 
> Again, by relying on a client-side processor some 
> indeterminacy must be 
> accepted by the server side authors. By relying on the Web 
> one also brings 
> indeterminacy into the equation.

Yes, but not all indeterminacy is the same.  I am not concerned about
cases where the GRDDL author or document publisher knowingly creates
indeterminacy, nor am I concerned about cases where the GRDDL processor
knowingly chooses to deviate from the GRDDL author's expressed intent
(due to security or other reasons).  I'm concerned about cases of
unintentional indeterminacy, where the GRDDL processor computes what it
legitimately believes to be the complete GRDDL results of a particular
XML instance document, but those results unintentionally differ from
what some other GRDDL processor computes.

> 
> I do think that if you want "XML preprocessing defined," 
> which you imply, 
> you should bring the issue up with the XML Activity, the XML 
> Processing 
> Model WG, and the TAG. Defining what "complete" XML preprocessing is 
> outside of the mandate of the GRDDL WG, and as a W3C WG we of 
> course must 
> attempt to abide by Web Arch and the current indeterminacy 
> in the XML 
> implementations and as created by the Web itself.
> 
> Guaranteed determinism is lost as soon as you rely on accessing namespace or 
> profile docs on the Web, XML parsers, conneg-enabled URI schemes and 
> Turing complete programming  languages. One can make 
> recommendations and 
> make this explicit, but I cannot see how one can change this. A GRDDL 
> client can at best try to apply all the available transformations it 
> understands and can access, and merge those results.

I hope I've begun to convince you differently, but I realize that my
answers have still been somewhat incomplete, so I'll continue to explain
more as needed.

> 
> [1] http://www.w3.org/TR/grddl-tests/
> [2] 
> http://lists.w3.org/Archives/Public/public-grddl-wg/2007May/0075.html
> [3] http://www.w3.org/TR/grddl-tests/#multiple-representations
> [4] 
> http://www.biglist.com/lists/xsl-list/archives/200105/msg00167.html

Thanks,

David Booth, Ph.D.
HP Software
+1 617 629 8881 office  |  dbooth@hp.com
http://www.hp.com/go/software
Received on Wednesday, 30 May 2007 06:27:22 UTC