Faithful Infoset (was RE: The xi namespace)

At 07:32 AM 12/21/2006 +0000, McBride, Brian wrote:

> > I think that the GRDDL spec should become clear that the
> > source document infoset must reflect any declared DTD or
> > schema, and must exercise XInclude. Anything less abrogates
> > the Faithful Rendition promise and is a failure.
>
>What exactly is "the faithful rendition" promise, as you see it.

First of all, the faithful rendition promise is Dan's expectation for GRDDL.

Until he mentioned this expectation to me, I had not expected that GRDDL
could/should be used in cases where the document creator intends the
document in question to be a legally binding contract. I initially thought
that getting reliable GRDDL results would be like trying to win a Kewpie
doll -- "ya pays yer money; ya takes yer chances." Dan convinced me that it
was possible to guarantee a faithful rendition, and that's when the
"Faithful Rendition" subsection was born.

My understanding of the faithful rendition promise is that the author of a
grddl:transformation or grddl:namespaceTransformation asserts that the
GRDDL result will provide, in the form of a graph, a well-defined subset
of facts extracted from a source document.

To help ensure a faithful rendition, run the transformation on a Faithful
Infoset. Faithful Infoset is a term Dan chose to capture this issue in an
evocative phrase. It describes an infoset which is informed by DTD- or
schema-validation and in which <xi:include/> elements have been replaced
with their transcluded content.

An unfaithful infoset can lead to an unfaithful rendition. It is altogether
possible that the GRDDL results of both faithful and unfaithful infosets of
the same document will be identical. Nonetheless, there will be instances in
which the GRDDL result of an unfaithful infoset will yield an unfaithful
rendition. See the "XInclude or Not" email thread.
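
To make the XInclude case concrete, here is a minimal sketch in Python. It
is an illustration only, not anything taken from the GRDDL spec: the
document, the "terms.xml" resource, the in-memory loader and the
clause_ids() extraction are all hypothetical, and the extraction function
merely stands in for a real GRDDL transformation.

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

XI = "http://www.w3.org/2001/XInclude"

# Hypothetical in-memory stand-in for the web: URI -> XML text.
RESOURCES = {
    "terms.xml": "<clause id='c2'>Payment is due in 30 days.</clause>",
}

# Hypothetical source document that transcludes one of its clauses.
SOURCE = f"""<contract xmlns:xi="{XI}">
  <clause id="c1">Goods ship on signing.</clause>
  <xi:include href="terms.xml"/>
</contract>"""


def loader(href, parse, encoding=None):
    """Resolve an XInclude target from RESOURCES instead of the network."""
    text = RESOURCES[href]
    return ET.fromstring(text) if parse == "xml" else text


def clause_ids(root):
    """Stand-in for a transformation: list the clause ids it can see."""
    return [clause.get("id") for clause in root.iter("clause")]


# Infoset built from the bytes of the document alone.
raw = ET.fromstring(SOURCE)

# Infoset in which xi:include elements are replaced by their targets.
expanded = ET.fromstring(SOURCE)
ElementInclude.include(expanded, loader=loader)

print(clause_ids(raw))
print(clause_ids(expanded))
```

The first list contains only "c1"; the second also contains "c2". Any
facts derived from the second clause are simply missing from a result
computed over the unexpanded infoset.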

I assert that the infoset intended by the author of an XML document includes
the DEFAULT and FIXED attributes declared in its DTD (if any) or its Schema
(if any), and the expanded form of <xi:include/> elements (if any). That is,
although a given serialization of a document might not contain such
information directly, such information is still part of that Information
Resource, indirectly by reference to its DTD or Schema and through
transclusion of an XInclude target URI. And let's not even get into entities.
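
A similar sketch for defaulted attributes, again in Python and again purely
illustrative. The DECLARED_DEFAULTS table is a hypothetical stand-in for
the attribute defaults a DTD or Schema would declare; apply_defaults() and
price_facts() are made-up helpers, not part of any GRDDL implementation.

```python
import xml.etree.ElementTree as ET

# Hypothetical defaults, standing in for what an external DTD or Schema
# would declare: element name -> {attribute name: default value}.
DECLARED_DEFAULTS = {"price": {"currency": "USD"}}

# Hypothetical source document; the first price omits the attribute.
SOURCE = "<offer><price>100</price><price currency='EUR'>90</price></offer>"


def apply_defaults(root, defaults):
    """Add declared default attributes wherever the serialization omits them."""
    for elem in root.iter():
        for name, value in defaults.get(elem.tag, {}).items():
            elem.attrib.setdefault(name, value)
    return root


def price_facts(root):
    """Stand-in for a transformation: (amount, currency) for each price."""
    return [(p.text, p.get("currency")) for p in root.iter("price")]


# Infoset built from the bytes of the document alone.
raw = ET.fromstring(SOURCE)

# Infoset informed by the declared defaults.
faithful = apply_defaults(ET.fromstring(SOURCE), DECLARED_DEFAULTS)

print(price_facts(raw))
print(price_facts(faithful))
```

In the first result the first price has no currency at all; in the second
it is "USD", which is what the author subscribed to by referencing the
declarations. Explicitly specified values such as "EUR" are left alone.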

Dan has stated in the past that by dint of employing someone's namespace or
profile, the author of a document has subscribed to everything that comes
with it, such as, for example, any namespaceTransformation assertions that
may be discoverable.

I assert that by dint of using XML with a DTD or Schema reference, the author
has subscribed to those specifications. Using an infoset that only takes into
account the bytes that are present in the document is unfaithful to the
intent of the author. Similarly with XInclude.

I guess what I am saying is that I believe that a GRDDL-aware processor has a
duty to resolve all XIncludes in the document and either DTD- or
Schema-validate it.

In spite of my beliefs in this regard, the WG would prefer not to mandate
such processing. I understand that it could be a burden to implement. So I am
no longer trying to convince the WG to adopt my position.

Even so, I think that it is incumbent on someone -- probably us/me -- to
highlight a heightened potential for unfaithful renditions in the face of
DTDs, Schemas and XIncludes. I suppose that XLink might also present a risk.

I will work on wording for a paragraph (hopefully that is all it will take).

If anybody doesn't understand my position, please let me know. I can live with
everybody disagreeing with how to handle the problem, but if you don't see how
it is a problem then I would appreciate your help in working toward a better
mutual understanding.

Regards,

Murray

P.S. This email is a lot longer than I had intended, but I don't have time to
make it any shorter.

Received on Thursday, 21 December 2006 23:41:24 UTC