Re: Comment about DocumentType

I posted the following on the DOM IG's mailing list a while back. I think
it covers most of the things that the DocumentType has to tell us. I've
tried to generalize from DTDs so it can also support DCD, and maybe other
schema proposals.  I haven't yet compared it to the draft that Arnaud has
made available, and it may be missing needed features; if you see a
mistake, please say so!

DocumentType has to be able to tell us:

a) The type(s) of the acceptable Root element(s). Some of the schema
designs allow more than one valid choice (to avoid the need for a
functionless wrapper element).

b) The type(s) and sequence(s) of children that each kind of element will
permit. To insulate us from possible variation in how these rules are
expressed in the schema languages, I'd suggest making this a query that
asks "If I were to do nodeX.insertBefore(nodeY,nodeZ), where X and Z are
known, what types of nodes would be permitted as Y?" Note that I'm only
suggesting checking the structure upward and backward from this point, not
the following nodes, since a legal result may require insertion of several
nodes before Z rather than just one.
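
To make the query in (b) concrete, here is a rough sketch in Python. It is
only an illustration under my own assumptions: the content model is written
as a small state machine, and the names CONTENT_MODELS and permitted_before
are hypothetical, not part of any DOM draft. It looks only at the parent type
and the siblings before the insertion point.

```python
# Hypothetical content models, one small state machine per element type.
# Each state maps an allowed child type to the state reached after it.
CONTENT_MODELS = {
    # Roughly the DTD content model (title, para+, note?)
    "section": {
        "start": {"title": "titled"},
        "titled": {"para": "body"},
        "body": {"para": "body", "note": "end"},
        "end": {},
    },
}


def permitted_before(parent_type, preceding_types):
    """Return the child types that could be inserted at this point.

    Only the parent type and the siblings before the insertion point are
    consulted; siblings after it are deliberately ignored.
    """
    model = CONTENT_MODELS[parent_type]
    state = "start"
    for child_type in preceding_types:
        if child_type not in model[state]:
            raise ValueError(
                f"{child_type!r} is already invalid before the insertion point"
            )
        state = model[state][child_type]
    return set(model[state])
```

A schema language with context-dependent rules would need the actual nodes
rather than type names, but the shape of the question stays the same.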

c) The type(s) of attributes that each type of element will permit/require,
together with their acceptable values, defaults, and so on. Note that in some
schema languages, this depends on the context in which the element appears,
so we may have to pass in a specific element, in context, rather than just
an element's nodename.

d) Whether a node's contents are valid, given the rules used above. This
could be more tightly optimized than making the single-node calls
repeatedly. This is the real validation test; the preceding two are
primarily provided for directed editing. The application would decide
whether it wants to check each node as it's built or defer that to a
go/no-go test on larger chunks of the document; this may mean the validate
call should be able to operate either as a shallow check (if the tree is
being checked bottom-up) or as a deep tree-walk (if an entire subtree is to
be validated at once).

Failure might return the first node that isn't acceptable where it stands.
That isn't necessarily the point of error, but it's probably the best we
can do to help the user find the problem.
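
The shallow/deep validate call in (d), together with the failure behavior
just described, might look something like the sketch below. Everything in it
is assumed for illustration: the Node class, the MODELS table, and the
validate signature are hypothetical. It returns None on success, otherwise
the first node it finds that is not acceptable where it stands.

```python
class Node:
    """Minimal stand-in for a DOM element: a type name plus children."""

    def __init__(self, type_, children=()):
        self.type = type_
        self.children = list(children)


# Hypothetical content models: (transitions, accepting states) per type.
LEAF = ({"start": {}}, {"start"})
MODELS = {
    "section": (
        {
            "start": {"title": "titled"},
            "titled": {"para": "body"},
            "body": {"para": "body", "note": "end"},
            "end": {},
        },
        {"body", "end"},
    ),
    "title": LEAF,
    "para": LEAF,
    "note": LEAF,
}


def validate(node, deep=False):
    """Return None if valid, else a node that is unacceptable in place."""
    transitions, accepting = MODELS[node.type]
    state = "start"
    for child in node.children:
        if child.type not in transitions[state]:
            return child  # this child is not permitted at this position
        state = transitions[state][child.type]
    if state not in accepting:
        return node  # required content is missing, so report the parent
    if deep:
        # Deep mode walks the subtree; shallow mode stops at the children.
        for child in node.children:
            bad = validate(child, deep=True)
            if bad is not None:
                return bad
    return None
```

An editor building the tree bottom-up would call validate(node) on each
element as it is completed, while a batch check would call
validate(root, deep=True) once.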

e) Entity and Notation information is already covered by the Level 1 API.
I'm not wild about the way in which they're handled -- I'd prefer a
get-by-name query against the DocumentType itself to retrieving
the NamedNodeMap and then querying that, especially since some schema
proposals may have to gather data from several document descriptions -- but
it's workable. There may be some details that have to be filled in; I
haven't used either Entities or Notations enough to be sure about that.
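
For what it's worth, the get-by-name query I have in mind for (e) could be
sketched as follows. This is not the Level 1 API; DocumentTypeSketch,
get_entity and get_notation are hypothetical names, and each document
description is modeled as a plain dictionary. Descriptions are searched in
order and the first declaration found wins, as it does for repeated entity
declarations in XML.

```python
class DocumentTypeSketch:
    """Hypothetical DocumentType that answers name lookups directly."""

    def __init__(self, *descriptions):
        # Each description is assumed to look like
        # {"entities": {name: value}, "notations": {name: value}}.
        self._descriptions = descriptions

    def _lookup(self, kind, name):
        # Search the descriptions in order; the first declaration wins.
        for description in self._descriptions:
            found = description.get(kind, {}).get(name)
            if found is not None:
                return found
        return None

    def get_entity(self, name):
        return self._lookup("entities", name)

    def get_notation(self, name):
        return self._lookup("notations", name)
```

The caller never sees how many descriptions were consulted, which is the
point of asking the DocumentType rather than a single NamedNodeMap.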

______________________________________
Joe Kesselman  / IBM Research
Unless stated otherwise, all opinions are solely those of the author.

Received on Thursday, 22 October 1998 09:30:51 UTC