RE: HTML APIs

At 10:15 AM 1/4/99, Miles Sabin wrote:
>Lauren Wood wrote,
>
>> In a schematic way of thinking, you can say that the
>> HTML document goes through the HTML parser, which adds
>> all the omitted tags, default attribute values, etc
>> and is turned into the structure model (that internal
>> representation of the document which the DOM
>> implementation uses). The DOM functions, methods, etc
>> then act on this structure model, not on the source
>> document. So by the time the DOM gets to the document,
>> all the missing pieces have already been filled in by
>> the parser. 
>
>That might be so for DOM implementations embedded in UAs
>(tho' I'm not even sure about that), but what about
>server side uses of the DOM where the structure model is
>built using the creation APIs,
>
>  public void foo(HTMLDocument doc)
>  {
>    // assume doc is newly created.
>
>    NodeList bodyList = doc.getElementsByTagName("BODY");
>    NodeList frameSetList = doc.getElementsByTagName("FRAMESET");
>
>    if(bodyList.getLength() == 0 && frameSetList.getLength() == 0)
>      doc.setBody((HTMLElement) doc.createElement("BODY"));         // 1
>    else if(frameSetList.getLength() == 1)
>      frameSetList.item(0).appendChild(doc.createElement("FRAME")); // 2
>    else if(bodyList.getLength() == 1)
>      bodyList.item(0).appendChild(doc.createElement("P"));         // 3
>  }
>
>As far as I can see from the REC, any of the following might
>legitimately be true,
>
>* (1) might never be executed, because a call on
>  HTMLDocument.getElementsByTagName() might trigger auto
>  insertion of HTML and BODY elements into an otherwise
>  empty document.
>
>* (2) might never be executed, for the same reason.
>
>* (3) might never be executed, because an implementation
>  might _not_ auto insert HTML and BODY elements on a
>  call to HTMLDocument.getElementsByTagName().
>
>And there are quite a few other possibilities too.
>
>Maybe these are slightly perverse examples, but it does
>look as tho' the spec needs a little tightening up here.

This problem doesn't seem quite so confusing if you assume
that a parser will do fixups (i.e. adding the HTML and BODY
elements) but a DOM document generated from scratch will
only contain those nodes that have been explicitly inserted
programmatically. So if you don't explicitly insert an HTML
element, your DOM tree will be missing it no matter what calls
you make. This puts the onus on the programmer and makes
it easier to predict what will happen.
This would make creating invalid HTML DOM trees possible,
but I think it's pretty hard (and unreasonable) for a DOM
implementation to completely ensure a valid HTML document no
matter what a programmer does. (At least for server-side
applications, anyway.)
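
To make the "explicit insertion only" model concrete, here is a rough
sketch. It uses the JDK's generic XML DOM (javax.xml.parsers) as a
stand-in, since I am not assuming any particular HTML DOM implementation;
the class and method names are hypothetical. Under this model a query on
a fresh document finds nothing and adds nothing, and the HTML and BODY
elements exist only once the program creates them.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Sketch only: the JDK's generic XML DOM stands in for an HTML DOM.
// ExplicitHtmlTree and its methods are hypothetical names for illustration.
class ExplicitHtmlTree {

    // Create a document with no nodes below it; nothing is inserted for us.
    static Document newEmptyDocument() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("no DOM builder available", e);
        }
    }

    // The programmer, not the implementation, adds HTML and BODY.
    // Returns the new BODY element so the caller can append content to it.
    static Element addSkeleton(Document doc) {
        Element html = doc.createElement("HTML");
        Element body = doc.createElement("BODY");
        html.appendChild(body);
        doc.appendChild(html);
        return body;
    }

    public static void main(String[] args) {
        Document doc = newEmptyDocument();

        // A query on the empty tree reports what is there and inserts nothing.
        System.out.println("BODY count before: "
                + doc.getElementsByTagName("BODY").getLength());
        System.out.println("root missing: " + (doc.getDocumentElement() == null));

        // Only explicit calls change the tree.
        Element body = addSkeleton(doc);
        body.appendChild(doc.createElement("P"));
        System.out.println("BODY count after: "
                + doc.getElementsByTagName("BODY").getLength());
    }
}
```

With this behavior, case (1) in the quoted example always runs on a
newly created document, because nothing else could have added a BODY.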

I think that allowing the DOM implementation to do the fixups
automatically when, say, calling getElementsByTagName()
would complicate certain types of processing that depend on
a priori knowledge of DOM tree mutations.
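
As an illustration of the kind of processing I mean, consider code that
assumes a query never changes the tree. The sketch below (again on the
JDK's generic XML DOM, with hypothetical names) counts nodes before and
after a getElementsByTagName() call. The check passes when queries are
read-only; an implementation that did fixups during the call would make
it fail, and any bookkeeping based on the earlier count would be stale.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Node;

// Sketch only: MutationAudit and its methods are hypothetical names.
class MutationAudit {

    // Create a document with no nodes below it.
    static Document newEmptyDocument() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("no DOM builder available", e);
        }
    }

    // Count the given node plus all of its descendants.
    static int countNodes(Node node) {
        int count = 1;
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            count += countNodes(child);
        }
        return count;
    }

    // True if running the query left the number of nodes in the tree unchanged.
    static boolean queryIsSideEffectFree(Document doc, String tagName) {
        int before = countNodes(doc);
        // getLength() forces the (live) node list to be evaluated.
        doc.getElementsByTagName(tagName).getLength();
        int after = countNodes(doc);
        return before == after;
    }

    public static void main(String[] args) {
        Document doc = newEmptyDocument();
        System.out.println("nodes in empty document: " + countNodes(doc));
        System.out.println("query left tree unchanged: "
                + queryIsSideEffectFree(doc, "BODY"));
    }
}
```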


I agree that the spec should clarify the issue.

Anyway, my 2 cents.

Regards,
- Claude Zervas
Uniplanet LLC

Received on Monday, 4 January 1999 15:44:10 UTC