Re: Constant Inputs during Iteration from Erik Bruchez on 2006-10-12 (public-xml-processing-model-wg@w3.org from October 2006)

From: Erik Bruchez <ebruchez@orbeon.com>
Date: Thu, 12 Oct 2006 15:43:17 +0100
To: public-xml-processing-model-wg@w3.org
Message-ID: <452E5485.3080501@orbeon.com>

 > I don't think we have to say when, though I'd like to say that a
 > URI, once dereferenced, it won't change. This is the same
 > restriction that XSLT places on input document()s.

This may make more sense for XSLT, a functional language (unless you
use certain extensions), than for XProc. After all, we say
"implementations must not assume that components are functional (that
is, that their outputs depend only on their explicit inputs and
parameters) or side-effect free." So I am not sure we have a very
strong requirement for enforcing invariance of external documents, at
least in something like p:for-each.

 > |> If your pipeline implementation determine that there is a static
 > |> reference to a document and pre-fetches that or some other kind of
 > |> optimization, then it will have the wrong version of the document.
 > |
 > | Yes. But note how a web browser performs such optimizations: it does
 > | so based on HTTP caching mechanisms.
 >
 > I don't think browser caching has anything to do with HTTP.

It has everything to do with HTTP, since browsers cache resources
loaded through HTTP according to semantics defined by HTTP. You know,
all those funny headers like Cache-Control, ETag, etc.? ;-)

http://www.w3.org/Protocols/rfc2616/rfc2616-sec13.html#sec13

But this was just an example of how an HTTP client actually deals with
HTTP caching. My point is mainly that HTTP has mechanisms able to tell
you whether a document has changed or not. If a server tells the XProc
implementation (when in the process of dereferencing a URI) that the
document expires in 2 days, the XProc implementation can cache the
document. If on the other hand the server tells the client that the
document expires immediately, it is not unreasonable for the client to
actually refetch the document next time it needs it (even if it's 10ms
later).
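
To make the expiry rule concrete, here is a minimal sketch in Python of the decision I have in mind. The function names (freshness_lifetime, must_refetch) are my own and purely illustrative, and this is a simplification of the RFC 2616 section 13 rules, not a full implementation: it only looks at Cache-Control (max-age, no-cache, no-store) and Expires, and it treats anything it cannot interpret as "expires immediately".

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def freshness_lifetime(headers, fetched_at):
    """Return how long a fetched response may be reused, as a timedelta.

    Simplified, illustrative reading of HTTP caching headers.
    """
    cache_control = headers.get("Cache-Control", "")
    directives = [d.strip().lower() for d in cache_control.split(",") if d.strip()]

    # Assumption: treat no-cache / no-store as "always go back to the server".
    if "no-store" in directives or "no-cache" in directives:
        return timedelta(0)

    # max-age takes precedence over Expires.
    for directive in directives:
        if directive.startswith("max-age="):
            try:
                seconds = int(directive[len("max-age="):])
            except ValueError:
                return timedelta(0)
            return timedelta(seconds=max(0, seconds))

    expires = headers.get("Expires")
    if expires:
        try:
            return max(timedelta(0), parsedate_to_datetime(expires) - fetched_at)
        except (TypeError, ValueError):
            # Unparseable date: behave as if already expired.
            return timedelta(0)

    # No caching information at all: be conservative and do not reuse.
    return timedelta(0)


def must_refetch(headers, fetched_at, now):
    """True if the cached copy can no longer be reused at time `now`."""
    return now - fetched_at >= freshness_lifetime(headers, fetched_at)


fetched = datetime(2006, 10, 12, 14, 43, 17, tzinfo=timezone.utc)
ten_ms_later = fetched + timedelta(milliseconds=10)

# Server says "good for 2 days": the implementation may reuse its copy.
print(must_refetch({"Cache-Control": "max-age=172800"}, fetched, ten_ms_later))
# Server says "expires immediately": refetch, even 10ms later.
print(must_refetch({"Cache-Control": "max-age=0"}, fetched, ten_ms_later))
```

A real client would also revalidate with ETag / If-None-Match rather than refetch blindly, but that is left out here.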

 > If we say that a URI must be cached locally (this is independent of
 > protocol and any protocol-based caching; I may be confused about
 > what sort of caching you were describing above), then it doesn't
 > matter when a URI is retrieved.

I was talking about HTTP caching. But something similar could apply to
"file:" (i.e. you can know when the file has changed).

 > If we don't say that it's cached, then I think we'll have to say
 > that it must not be cached. Having said that, in order to have any
 > sort of interoperability, I think we'll have to have a pretty
 > detailed story about execution order.

Another way would be to say that the document can be cached if you can
determine that it hasn't changed in the meantime. With HTTP, you determine
this through caching headers. With a local file, you determine this
with a "last modified date", etc.
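
For the local file case, a rough Python sketch of "cache unless it has changed" could look like the following. The FileCache class is hypothetical, just for illustration; it assumes that the modification time plus the file size is a good enough change indicator, which may not hold on file systems with coarse timestamps.

```python
import os


class FileCache:
    """Illustrative cache that rereads a file only when it appears changed."""

    def __init__(self):
        # path -> ((mtime_ns, size), content)
        self._entries = {}

    def read(self, path):
        st = os.stat(path)
        stamp = (st.st_mtime_ns, st.st_size)

        entry = self._entries.get(path)
        if entry is not None and entry[0] == stamp:
            # Same "last modified" stamp: reuse the cached bytes.
            return entry[1]

        # First access, or the file changed since we cached it: reread.
        with open(path, "rb") as f:
            content = f.read()
        self._entries[path] = (stamp, content)
        return content
```

An implementation doing this never needs to promise that a document is invariant for the whole pipeline run; it only promises not to reread when nothing has changed.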

So far I have presented a use case that speaks in favor of allowing
implementations to re-dereference URIs (the "eXist update" use case),
but I haven't heard a very solid explanation of why requiring
invariance is a must.

-Erik

--

Orbeon - XForms Everywhere:
http://www.orbeon.com/blog/

Received on Thursday, 12 October 2006 14:44:03 UTC