Re: Comments on the architecture doc from Paul Prescod on 2002-02-04 (www-tag@w3.org from February 2002)

From: Paul Prescod <paul@prescod.net>
Date: Sun, 03 Feb 2002 21:49:54 -0500
To: Tim Berners-Lee <timbl@w3.org>
CC: Tim Bray <tbray@textuality.com>, www-tag@w3.org, cmsmcq@w3.org
Message-ID: <3C5DF6D2.E8CBFB64@prescod.net>

Tim Berners-Lee wrote:
> 
>...
> 
> > (2) Quote: "The namespace document (with the namespace URI) is
> >      a place for the language publisher to keep definitive material
> >      about a namespace. Schema languages are ideal for this."
> >      I disagree quite strongly.  Schema languages as they exist
> >      today represent bundles of declarative syntactic constraints.
> >      This is a small subset of "definitive material".
> 
> I agree completely.  I suppose I was using schema languages
> in a rather stronger sense, in which I would include RDF Schema,
> OWL, and RDDL.
> 
> In general, a machine-processable document about the language
> which can say things with whatever level we have learned to do so,
> and certainly which can reference other documents.

There is a pervasive myth that namespaces define languages and therefore
it makes sense to associate machine processable specifications with
them. Unfortunately this is not true in the general case. In general,
the only thing that a set of elements with a common namespace have in
common is the namespace itself.

C. M. Sperberg-McQueen says:

"Some readers may be unhappy with the idea that a namespace is a thing
of any kind at all, and thus with the notion that the rddl:resource
element asserts a relation in which the namespace plays a role. For
these nominalists, the argument may be recast without loss of force: The
RDDL resource element asserts a relation between (a) all the documents,
elements, attributes, values, or other data of any kind labeled with
names in a namespace and (b) the resource identified by its xlink:href
attribute."

This is a more precise way of speaking but unfortunately it undermines
the basic goals of RDDL.

It isn't realistic to think that a machine readable document associated
with a namespace could be reliably applied to documents that "happen to"
use a particular namespace. Even just staying in the world of W3C
specifications, one can see the same (URI, localname) pair being used in
contradictory ways depending on processing context (not element
context).

Consider, let's say, a CSS stylesheet (a very simple form of processing
specification) associated with the namespace XHTML. Now consider the
following document (which I assert is XSLT, not XHTML) and how reliably
the CSS would apply to the data:

<html:html>
  <xsl:when>
     <html:title>.....</html:title>
     <html:body> ...
       <html:a>
           <xsl:attribute .../>
       </html:a>
     </html:body>
  </xsl:when>
  <xsl:otherwise>
     <html:title>.....</html:title>
     <html:body> ...
     </html:body>
  </xsl:otherwise>
</html:html>

Would CSS's inheritance rules be properly applied here? Could CSS
reliably detect hyperlinks to annotate them as the stylesheet asks? How
would it handle paths of the form html>body>h1, etc.? It would annotate
the wrong nodes with particular styles. In this context, html:body means
"a literal result element...semantics undefined".
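
To make that concrete, here is a minimal Python sketch of what a
namespace-keyed matcher would see. Everything in it is an assumption for
illustration: the function child_path_matches is hypothetical (it is not
any real CSS engine), and the elided parts of the template above are
filled with placeholder values so that the document parses.

```python
import xml.etree.ElementTree as ET

XHTML = "http://www.w3.org/1999/xhtml"
XSL = "http://www.w3.org/1999/XSL/Transform"

# Well-formed stand-in for the template above; the test and name values
# are placeholders, not taken from the original message.
DOC = f"""<html:html xmlns:html="{XHTML}" xmlns:xsl="{XSL}">
  <xsl:when test="placeholder">
    <html:title>A</html:title>
    <html:body>
      <html:a><xsl:attribute name="href">target</xsl:attribute></html:a>
    </html:body>
  </xsl:when>
  <xsl:otherwise>
    <html:title>B</html:title>
    <html:body/>
  </xsl:otherwise>
</html:html>"""


def child_path_matches(root, path):
    """Hypothetical matcher for a child path such as html>body.

    Compares XHTML-namespace element names only, which is all a
    stylesheet keyed to the namespace could know about.
    """
    matches = []

    def walk(elem, depth):
        if elem.tag != f"{{{XHTML}}}{path[depth]}":
            return
        if depth == len(path) - 1:
            matches.append(elem)
            return
        for child in elem:
            walk(child, depth + 1)

    walk(root, 0)
    return matches


root = ET.fromstring(DOC)

# html>body matches nothing: xsl:when and xsl:otherwise sit in between.
direct = child_path_matches(root, ["html", "body"])

# A scan by namespace alone finds two bodies in one "document".
by_namespace = root.findall(f".//{{{XHTML}}}body")

# The link has no href attribute yet; xsl:attribute supplies it later.
link = root.find(f".//{{{XHTML}}}a")

print(len(direct), len(by_namespace))  # 0 2
print(link.get("href"))                # None
```

The child path fails, the descendant scan finds two bodies, and the
hyperlink is invisible, all because the tree is a template and not yet an
XHTML document.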

Now consider how horribly wrong you could go with something really
complicated like a stylesheet or a Java class. They are going to
positively choke on this data.

Next example:

<soap:Envelope>
  <soap:Header>
    <somehtml>
     <html:p id="para"><html:a name="target">Foo</html:a></html:p>
     <html:body><soaphref href="para"/></html:body>
    </somehtml>
  </soap:Header>
</soap:Envelope>

Now from a SOAP standpoint, this is an html:p within an html:body but of
course the CSS stylesheet won't know that. So applying it to this
document will annotate the node with the wrong style. Again, in this
context, the html:p is a *literal*, not a paragraph.

I could produce examples like this all day. The most extreme example is
just:

<foo:bar>
  <html:....>

  </html:...>
</foo:bar>

For all you know, foo:bar means: "please ignore everything in here." So
the right processing specification is none at all. This is completely
legal according to all existing W3C specifications and (AFAIK)
guidelines.
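
A rough Python sketch of the difference, under loudly stated assumptions:
the foo namespace URI is made up, and the "ignore everything in here"
meaning of foo:bar is exactly the hypothetical one from the paragraph
above, hard-coded into a context-aware walker.

```python
import xml.etree.ElementTree as ET

XHTML = "http://www.w3.org/1999/xhtml"
FOO = "http://example.org/foo"  # hypothetical namespace URI

DOC = f"""<doc xmlns:html="{XHTML}" xmlns:foo="{FOO}">
  <html:p>visible</html:p>
  <foo:bar>
    <html:p>ignored</html:p>
  </foo:bar>
</doc>"""


def paragraphs_by_namespace(root):
    """Pick up every html:p, wherever it occurs."""
    return [p.text for p in root.iter(f"{{{XHTML}}}p")]


def paragraphs_by_context(elem):
    """Pick up html:p elements, honouring the assumed meaning of foo:bar."""
    if elem.tag == f"{{{FOO}}}bar":
        return []  # assumed semantics: ignore the whole subtree
    found = [elem.text] if elem.tag == f"{{{XHTML}}}p" else []
    for child in elem:
        found += paragraphs_by_context(child)
    return found


root = ET.fromstring(DOC)
print(paragraphs_by_namespace(root))  # ['visible', 'ignored']
print(paragraphs_by_context(root))    # ['visible']
```

Only the second function gets the right answer, and it can do so only
because knowledge of foo:bar was built into it. Nothing attached to the
XHTML namespace could have supplied that.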

The point is that seeing a particular namespace on an element tells
software *nothing* about how to interpret that element unless it
completely understands the context (not just the XML context, but the
processing context). If we presume, as XSLT does, that inner elements
can affect the interpretation of container elements, then my foo:bar
could have been deeply nested and totally reversed the correct
interpretation of the whole document.

What you need to do to process these documents is apply the processing
envisioned by XSLT or SOAP (the specifications, not the namespaces), and
hope that the *output* is an infoset (or document) that conforms to the
XHTML specification. *Then* you can reliably apply the CSS.

So the CSS is really, truly associated with the XHTML document type (the
set of all documents/infosets that conform to the XHTML specification),
not with the XHTML namespace. The namespace is potentially useful for
recognizing which elements should end up in that document/infoset and
perhaps for figuring out which ones should be recursively processed
(e.g. SVG) by extracting *them* into a document/infoset.

This all brings us back to the prehistoric notion that documents have
types (just as web resources have MIME types) and that processing
applies, first and foremost, to document types, not to element types.
The element types are interpreted in the context of their document types.
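
One way to picture this is a dispatcher keyed on document type instead of
on element namespace. The sketch below is only illustrative: the media
type strings stand in for "document type", and the handler names and
their return strings are invented for the example.

```python
from typing import Callable, Dict


def style_xhtml(doc: str) -> str:
    # Only here is it safe to apply the CSS.
    return f"apply CSS to {doc}"


def run_xslt(doc: str) -> str:
    # The template is processed first; its *output* may be XHTML.
    return f"run XSLT processor on {doc}, then dispatch on the output"


# Hypothetical registry: processing is chosen per document type.
PROCESSORS: Dict[str, Callable[[str], str]] = {
    "application/xhtml+xml": style_xhtml,
    "application/xslt+xml": run_xslt,
}


def process(media_type: str, doc: str) -> str:
    """Choose processing from the document type, never from the
    namespaces that happen to appear inside the document."""
    handler = PROCESSORS.get(media_type)
    if handler is None:
        return f"no known processing for {doc}"
    return handler(doc)


print(process("application/xslt+xml", "template.xsl"))
print(process("application/xhtml+xml", "page.xhtml"))
```

Both inputs may be full of html: elements, but only the second is handed
to the CSS step.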

 Paul Prescod
Received on Sunday, 3 February 2002 21:52:28 UTC