Notations are useful

On Mon, 31 Jan 2000, Russell Steven Shawn O'Connor wrote:

> I agree that this would be ideal, but I have an itch telling me
> that the best solution somehow lies with the use of NOTATION.

Unfortunately, there seems to be a vicious circle between the lack of
knowledge regarding Notations and their neglect.  Since not many
people know what they're for, not many people are likely to use them,
and if they're not used, not many people are likely to find out what
problems they solve.  In line with this, the HTML DTDs don't use any,
and it seems they never will.

But they're useful.  They allow us to mark the fact that content (be
it the data content of an element, or the value of an attribute) has
internal structure, only it isn't exposed to the formal process of
SGML/XML parsing.  The application to *extensible* validators (e.g.
the XML validator making call-outs to notation-specific validators)
should be obvious.  It also has a bearing on how much of the new
XSchema stuff should *necessarily* be part of XML validation per se.
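
To make the call-out idea concrete, here is a minimal sketch in Python.
Everything in it is an assumption for illustration: the registry, the
function names, and the deliberately crude "Uri" check (a real one would
implement the grammar of whatever spec the notation points to).  The
point is only that the parser hands the raw value, plus the notation
name, to whatever validator has been plugged in for that notation.

```python
from typing import Callable, Dict

# Hypothetical registry: notation name -> notation-specific validator.
NOTATION_VALIDATORS: Dict[str, Callable[[str], bool]] = {}


def register(notation: str):
    """Decorator that plugs a validator in under a notation name."""
    def wrap(fn: Callable[[str], bool]) -> Callable[[str], bool]:
        NOTATION_VALIDATORS[notation] = fn
        return fn
    return wrap


@register("Uri")
def looks_like_uri(value: str) -> bool:
    # Crude stand-in, NOT a real URI grammar: non-empty, no whitespace.
    return bool(value) and not any(ch.isspace() for ch in value)


def validate(notation: str, value: str) -> bool:
    """Call out to the validator registered for this notation, if any."""
    validator = NOTATION_VALIDATORS.get(notation)
    if validator is None:
        # No call-out available: the value stays opaque, as it is today.
        return True
    return validator(value)
```

A notation with no registered validator simply passes through, which is
the situation we are in now with plain CDATA.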

Unfortunately, the XML 1.0 spec missed the boat on a significant
innovation in the WebSGML TC (besides, of course, severely limiting
the utility of Notations by disallowing data attributes).  The
innovation is a new category of declared value for attributes, called
DATA.  Its purpose is to signify the fact that the attribute value is
subject to a notation (i.e. a structure or grammar or syntax, defined
elsewhere).  For instance, something like this isn't really meaningful:

   <!ATTLIST ...
         href   CDATA    #IMPLIED
         ... >

and hiding the CDATA declared value in a suggestively named PE, like
%URI.datatype; - as in the new XHTML 1.1 DTD [1] - is really just
handwaving too, since the *essential* information is inside a comment:

  <!-- a Uniform Resource Identifier, see [URI] -->
  <!ENTITY % URI.datatype "CDATA" >

[1] See xhtml11-datatypes-1.mod in the XHTML 1.1 DTD distribution

Instead, now we would be able to declare something like this:

  <!ATTLIST  ...
        href    DATA   Uri   #IMPLIED
        ...>

Where the 'Uri' - i.e. the name following the 'DATA' keyword - is
required to be the name of a declared notation, e.g.:

  <!NOTATION  Uri "-//IETF RFC 2396//NOTATION
                    Uniform Resource Identifiers (URI): 
                    Generic Syntax//EN" >

Besides the benefit of obviating one level of PE mumbo-jumbo, I don't
think the value of this to generic processing could be disputed.
However, the lack of data attributes is still unfortunate, because
with them we can do much better.  Consider something like this:

  <!NOTATION Date "+//IDN Perl.org//NOTATION 
                    Perl Regular Expression Syntax//EN">
  <!ATTLIST #NOTATION Date
           match  CDATA  #FIXED "\d{4}/\d{2}/\d{2}" >
  ...
  <!ATTLIST  ...
            date  DATA  Date  #IMPLIED
            ...>
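
As a rough sketch of what a processor could do with those declarations,
here is some Python.  The two dictionaries are a hypothetical in-memory
view of the DTD fragment above, not any real parser's API, and Python's
re module stands in for Perl's regex engine (for this particular pattern
the two agree).  Requiring the pattern to match the whole value is also
my assumption.

```python
import re

# Hypothetical view of <!ATTLIST #NOTATION Date match CDATA #FIXED ...>
NOTATION_DATA_ATTRS = {
    "Date": {"match": r"\d{4}/\d{2}/\d{2}"},
}

# Hypothetical view of <!ATTLIST ... date DATA Date #IMPLIED ...>
ATTR_NOTATION = {
    "date": "Date",
}


def check_attribute(attr_name: str, value: str) -> bool:
    """Check an attribute value against its notation's 'match' pattern."""
    notation = ATTR_NOTATION[attr_name]
    pattern = NOTATION_DATA_ATTRS[notation]["match"]
    # Assumption: the whole value must match, not just a substring.
    return re.fullmatch(pattern, value) is not None
```

So a value like "2000/02/04" would pass and "2000-02-04" would not, with
no knowledge of dates built into the validator itself.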

It's interesting to note that there is no provision for this facility
in the XSchema specs (where, if you want a new "datatype", you have to
go through an elaborate definition procedure - you can't simply
reference an external spec).

Looks like a whole bunch of stuff will have to be reinvented...



Arjun

Received on Friday, 4 February 2000 05:47:50 UTC