RE: Comments on GRDDL draft [OK?] (#issue-faithful-infoset, XProc)

Hi Dan.  Thanks again for your explanations.  More below.

> From: Dan Connolly [mailto:connolly@w3.org] 
>
> On Mon, 2007-04-30 at 03:28 -0400, Booth, David (HP Software - Boston)
> wrote:
> [...]
> > > > 3. Are GRDDL transformations deterministic or not?
> 
> A short answer to this question is: yes, they are; a
> transformation is a function:
> 
> "... each GRDDL transformation specifies a transformation property, a
> function from XPath document nodes to ... RDF graphs."
>  -- http://www.w3.org/2004/01/rdxh/spec#txforms

Okay, so if I'm understanding correctly:

 - The GRDDL spec talks about XML documents but doesn't specify 
what infoset elaboration should be used in getting from an XML 
document to an XPath root node, so the XPath root node (or the 
infoset that it represents) may be ambiguous;

 - a transformation property is a function, but since the *input* 
of that function is ambiguous, the output may also be ambiguous;

 - a transformation property *can* be written to produce 
unambiguous output in the face of ambiguous input;

 - it is the transformation property author's responsibility 
to ensure that the transformation property produces unambiguous 
output if desired.

Is that correct?
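
To make the first two points concrete, here is a minimal sketch in
Python (not GRDDL or XSLT).  Everything in it is an assumption made
for illustration: the sample document, the in-memory loader, the
transform() function and the shape of the "triples".  XInclude merely
stands in for whatever infoset elaboration an implementation might or
might not apply before handing the document node to the
transformation.

```python
import copy
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

# Hypothetical source document; the xi:include may or may not be
# expanded before the transformation sees it.
DOC = """<people xmlns:xi="http://www.w3.org/2001/XInclude">
  <person name="Alice"/>
  <xi:include href="more.xml"/>
</people>"""

# Hypothetical content of the included resource.
INCLUDED = '<person name="Bob"/>'


def loader(href, parse, encoding=None):
    # In-memory stand-in for fetching the included resource.
    assert href == "more.xml" and parse == "xml"
    return ET.fromstring(INCLUDED)


def transform(root):
    """A deterministic function from a document node to a set of triples."""
    return {("_:doc", "hasPerson", p.get("name")) for p in root.iter("person")}


# Elaboration A: no pre-processing at all.
raw = ET.fromstring(DOC)

# Elaboration B: the same document after XInclude processing.
expanded = copy.deepcopy(raw)
ElementInclude.include(expanded, loader=loader)

triples_raw = transform(raw)
triples_expanded = transform(expanded)

# Same function, same source document, different results.
print(triples_raw == triples_expanded)
```

The function itself is deterministic; the two results differ only
because the two elaborations hand it different document nodes.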

> . . .
> >   For example, for the
> > simple, non-namespace case, instead of defining the
> > grddl:transformation attribute, how about allowing the
> > author to choose between three attributes:
> > 
> >   - grddl:transformation, which might have standard
> >   XML pipeline infoset semantics;
> 
> As I noted earlier, we tried to find such a standard
> and came to the conclusion that the state of the
> art offers no standard. Did we miss something?
> 
> >   - grddl:unprocessedTransformation, which might have
> >   semantics of NO infoset preprocessing; and
> > 
> >   - grddl:ambiguousTransformation, which might have the
> >   ambiguous semantics of the current GRDDL draft.

Actually, what I meant was: the GRDDL WG could somewhat
arbitrarily define a *particular* XML pipeline that would 
hopefully be usable by many applications, and use
grddl:transformation to indicate that that pipeline must
be used.  Those needing a different pipeline could instead use
grddl:unprocessedTransformation or
grddl:ambiguousTransformation.  However, this would
create a dependency on XProc.  I also don't know whether
the pipelines required by different apps are too diverse
for this approach to be feasible, i.e., whether there is
any pipeline that would cover 80% of apps.  (Perhaps
this is what you meant when you said the WG came to the
conclusion that the state of the art offers no standard?)

I guess my overall question here is how the WG intends
the output ambiguity to be addressed.  The spec:

 - notes the ambiguity in the input infoset; and

 - suggests "that GRDDL transformations be written so that 
they perform all expected pre-processing", thus eliminating
output ambiguity.
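
A minimal Python sketch of that suggestion, under stated assumptions:
the sample document, the in-memory loader and robust_transform() are
all hypothetical, and XInclude stands in for "all expected
pre-processing".  The transformation performs the expansion itself,
so it yields the same result whether or not the implementation has
already expanded the input.

```python
import copy
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

# Hypothetical source document and included resource.
DOC = """<people xmlns:xi="http://www.w3.org/2001/XInclude">
  <person name="Alice"/>
  <xi:include href="more.xml"/>
</people>"""
INCLUDED = '<person name="Bob"/>'


def loader(href, parse, encoding=None):
    # In-memory stand-in for fetching the included resource.
    assert href == "more.xml" and parse == "xml"
    return ET.fromstring(INCLUDED)


def robust_transform(root):
    """Do the expected pre-processing first, then extract the triples."""
    work = copy.deepcopy(root)  # leave the caller's tree untouched
    # Expanding an already expanded tree finds no xi:include elements,
    # so this step is a no-op in that case.
    ElementInclude.include(work, loader=loader)
    return {("_:doc", "hasPerson", p.get("name")) for p in work.iter("person")}


# Input as an implementation that does no pre-processing would supply it.
raw = ET.fromstring(DOC)

# Input as an implementation that already ran XInclude would supply it.
pre_expanded = copy.deepcopy(raw)
ElementInclude.include(pre_expanded, loader=loader)

same = robust_transform(raw) == robust_transform(pre_expanded)
print(same)
```

This only works when the needed pre-processing is idempotent and
harmless to repeat; it does not help a document that requires the
absence of pre-processing.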

Why doesn't the spec just make the input infoset unambiguous 
by declaring that the input infoset does not have *any* 
pre-processing, instead of it being "implementation-defined"?  
After all, it seems reasonable to assume that:

 - the XML document author knows what pre-processing is needed; 
and

 - the GRDDL transformation author also knows what pre-processing 
is needed.

Furthermore, if it were unreasonable to assume that the input 
infoset had no pre-processing, then how could an XML document 
that *requires* the absence of pre-processing be reliably, 
correctly transformed by GRDDL?

The bottom line is that I think ambiguity is quite harmful, so 
I would like to understand the rationale that justifies it.

David Booth, Ph.D.
HP Software
+1 617 629 8881 office  |  dbooth@hp.com
http://www.hp.com/go/software 

Received on Wednesday, 2 May 2007 02:59:28 UTC