Architectural Validation Only?

On Wed, 19 Jan 2000, W. Eliot Kimber wrote:

> [...] my point isn't about *validation*, it's about assertion of type
> membership. [...] It's not about validation--whether or not a document
> is validated is always the choice of the document receiver. It is
> inappropriate for a general-purpose standard to impose a validation
> policy on users of the standard. The standard must *enable*
> validation, but it cannot require it.

The really radical thought here is that a well-formed document, as a
unitary entity, need not validate at all!  It need only validate against
the requirements of its asserted types.  

The AFDR draws a very important distinction, IMHO, between 'encompassing'
and 'enabling' architectures:

  http://www.ornl.gov/sgml/wg8/docs/n1920/html/clause-A.3.1.html

In an open environment like the Web, an encompassing architecture makes
little if any sense.  A validation policy requiring it will have as much
success as King Canute had with the tide.  An enabling architecture, OTOH,
will be useful because it makes no demands outside of its own precisely
defined domain.  Whether an enabling architecture is viewed as extensible
or embeddable doesn't really matter for machine processing.  The important
thing is the ability to distinguish architectural from non-architectural
content.  

The basic requirement is the unambiguous determination of an architectural
"projection".  So, the practical problem for any individual document is
the markup needed to assert *how* it maps to some declared type.  Consider
the following well-formed fragment:

   <foo>
     <bar>A bar of some kind</bar>
     <baz>The first baz</baz>
     <baz>The second baz</baz></foo>

I would like to map part of this structure - everything except the <bar>
element and its content - to this architectural template, drawn from a
declared type:

   <!ELEMENT  ul  (li+) >
   <!ELEMENT  li  (#PCDATA) >

yielding:

   <ul>
     <li>The first baz</li>
     <li>The second baz</li></ul>

So, I need to say the equivalent of the following:

   Element    Cognate   Content?
   =======    =======   ========
   foo        ul        Yes
   bar        [none]    No  -- i.e. don't bother to look inside <bar>
   baz        li        Yes

Well, what if I had two attributes for this?  I tell the extractor: "use
the 'equiv' attribute for the element type, and the 'look' attribute to
decide whether to examine the content further or skip it":

   <foo equiv="ul" look="yes">
     <bar look="no">A bar of some kind</bar>
     <baz equiv="li" look="yes">The first baz</baz>
     <baz equiv="li" look="yes">The second baz</baz></foo>

That solves this particular mapping problem, provided I also have a way
to issue the details of the extraction instruction by appropriate markup.
That is, I need a declaration of some sort, say:

   <!MAP  "some-declared-type"
          element-map     "equiv"
          content-control "look" >

The particular *syntax* doesn't matter as much as getting the semantics
across to the processor; and the point here is that these semantics of
extraction control are completely generic - they have nothing to do with
the meanings of either the original element types or the transformed
result.  The processor only needs to know the names of the element-mapper
and the content-controller.  Of course, two attributes may not be enough
in the general case, but that's just a matter of figuring out how many
"control axes" we do need and defining some analog of <!MAP ...>
accordingly.
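
To make the extraction semantics concrete, here is a minimal sketch in
Python of what such a processor might do.  Everything in it is an
assumption for illustration: the function name project() is hypothetical,
a missing 'look' attribute is treated as "no", an element with no 'equiv'
has any examined content hoisted into its parent, and surrounding
whitespace is trimmed.  It is not a definitive implementation, just the
two-attribute idea spelled out:

```python
import xml.etree.ElementTree as ET

def project(elem, element_map="equiv", content_control="look"):
    """Project elem onto the declared type.

    Returns a list of strings and Elements.  The two keyword arguments
    are the attribute names a <!MAP ...> declaration would supply.
    """
    items = []
    # Assumption: content is examined only when the control attribute says "yes".
    if elem.get(content_control, "no") == "yes":
        if elem.text and elem.text.strip():
            items.append(elem.text.strip())
        for child in elem:
            items.extend(project(child, element_map, content_control))
            if child.tail and child.tail.strip():
                items.append(child.tail.strip())

    name = elem.get(element_map)
    if name is None:
        # No cognate: whatever was examined is hoisted into the parent.
        return items

    # Cognate found: build the architectural element and attach the items.
    out = ET.Element(name)
    last = None
    for item in items:
        if isinstance(item, str):
            if last is None:
                out.text = (out.text or "") + item
            else:
                last.tail = (last.tail or "") + item
        else:
            out.append(item)
            last = item
    return [out]

source = (
    '<foo equiv="ul" look="yes">'
    '<bar look="no">A bar of some kind</bar>'
    '<baz equiv="li" look="yes">The first baz</baz>'
    '<baz equiv="li" look="yes">The second baz</baz>'
    '</foo>'
)
projected = project(ET.fromstring(source))[0]
print(ET.tostring(projected, encoding="unicode"))
```

Note that the function never inspects the names foo, bar, baz, ul or li
themselves; it only needs to be told which attribute is the
element-mapper and which is the content-controller.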

The downside of this comprehensive mechanization can be seen from the
result of specifying look="yes" for <bar>:

   <ul>
     A bar of some kind
     <li>The first baz</li>
     <li>The second baz</li></ul>

which doesn't match the template that prompted the whole transformation
process to begin with.  *This* is where validation is important.  So, the
<!MAP...> formalism also needs to provide validation information - such as
the name/location of a DTD.
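
As a rough illustration of that last check, the sketch below tests a
projected result against the ul/li template by hand.  A real processor
would pass the result to a DTD validator; the function matches_template()
is a hypothetical stand-in that hard-codes the two content models, and it
assumes whitespace between elements is harmless:

```python
import xml.etree.ElementTree as ET

def matches_template(ul):
    """Hand-coded check of <!ELEMENT ul (li+)> and <!ELEMENT li (#PCDATA)>."""
    def blank(s):
        return s is None or not s.strip()

    # ul must contain at least one child and no character data of its own.
    if ul.tag != "ul" or len(ul) == 0 or not blank(ul.text):
        return False
    for li in ul:
        # Each child must be an li with no sub-elements, and no stray
        # character data may follow it inside ul.
        if li.tag != "li" or len(li) != 0 or not blank(li.tail):
            return False
    return True

good = ET.fromstring(
    "<ul><li>The first baz</li><li>The second baz</li></ul>")
bad = ET.fromstring(
    "<ul>A bar of some kind<li>The first baz</li><li>The second baz</li></ul>")

print(matches_template(good))
print(matches_template(bad))
```

The first result conforms; the second, produced by look="yes" on <bar>,
fails because of the character data directly inside <ul>.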

Summing up, the projection of a document onto a declared type doesn't need
anything more than a set of declared control attributes and a DTD to check
the result against.  There is no need for the original document to meet a
validation requirement.


Arjun  

Received on Monday, 24 January 2000 16:15:37 UTC