Re: Functional Request for DOM from Ian B. Jacobs on 2002-04-04 (w3c-wai-ua@w3.org from April to June 2002)

From: Ian B. Jacobs <ij@w3.org>
Date: Thu, 04 Apr 2002 17:18:54 -0500
To: Jill Thomas <jill@ionsystems.com>
CC: "'w3c-wai-ua@w3.org'" <w3c-wai-ua@w3.org>
Message-ID: <3CACD14E.70201@w3.org>
Jill,

Thanks for sending this. I have just a couple of questions below,
preceded by IJ:. I will be putting your request into a table of
requested functionalities, that I hope to make available soon.

  - Ian

Jill Thomas wrote:

> As discussed in the teleconference on March 28th I am enclosing a list of the
> functional requirements ION Systems has using the DOM:
> 
> 1. A guarantee that the DOM, once obtained, is valid. Using certain user agents,
> a node may
> actually be the child of two parents. This is the result of invalid HTML, but
> from our perspective that makes no difference as it is impossible to work with
> the invalid DOM.
> If a DOM is invalid as a result of incorrect input, then we need a way to
> discover the errors so that the user can be presented a warning. If this were
> implemented, HTML authors might actually be informed that their sites were
> invalid and might have some incentive to fix them.
> 

I'd like to get a better sense of what is meant by a "valid
DOM". Since the DOM is an API, does this mean "conforming
DOM implementation"? Or does this mean "conforming mapping
from source document to document tree"? My comments below
suppose the latter.

Suppose we have a source document D. If D is not well-formed
or does not validate, UA behavior for building an internal
tree is not defined in any spec I'm aware of. 

If D is well-formed and validates, then the XML Infoset 
(in section 2.2 [1]) ensures that information about 
elements (for example) is made available in document order.

[1] http://www.w3.org/TR/xml-infoset/#infoitem.element

DOM 2 Core does not have a requirement that the order
of information respect the source document order, but it's
unlikely that any DOM implementation would consciously present
available information in order other than document order.

This means that when the internal tree (conveyed through
the DOM or other API) does not match the source tree, it's
either because 

  a) the source tree was broken to begin with, or
  b) the implementation (that converts source to internal
     tree) is broken with respect to conformance to the Infoset
     and everyone's expectations about the DOM.

For the case you cite (one node has two parents), do you know
whether the cause is bad source or bad implementation?

Whatever the cause, I think that the most that UAAG 1.0 will
be able to do is require conformance to the Infoset requirement
to respect document order. There's nothing that UAAG 1.0 can
do to explain how to repair broken content.

There may be one extra thing UAAG 1.0 could do: require that
the form of the document be available at least in canonical
form, per the XML Canonicalization Recommendation [2]. That
may be going a bit far, but maybe not.


[2] http://www.w3.org/TR/xml-c14n

> 2. A consistent method for determining DOM completion. It is possible, using
> certain user
> agents, to obtain a pointer to a DOM that is not complete. It would be nice if
> we could reliably determine partial completion so that the entire DOM would not
> have to be constructed before our software started.

Please clarify where this refers to:

 a) A requirement related to modifications to the document tree
    by the user (e.g., filling out a form). [I don't understand
    what this would mean.]
 
 b) A requirement that the UA notify the AT that either the 
    document object is still being constructed, or construction
    has been completed.


> As an aside here is what we don't need:
> A tree structure. The fact that DOM is a tree might be a significant limitation
> to its architecture. Users do not see the document as a tree, they see it as a
> list. Having the DOM as a tree makes creation of a valid object from invalid
> input more difficult  and also makes it harder to render. This may be an off the
> wall suggestion, but included it anyway because our primary problem with DOM is
> getting one that is broken, and we don't see the world of HTML suddenly changing
> and presenting us with nothing but valid documents. A simpler design that was
> correct more often would be better.

I'm not sure what you mean by "you don't need a tree". If you are dealing
with xml and html content, the whole design is based on trees (even if the
rendering is not). I don't know how you can get away from a tree structure
(conveyed through the DOM). I'm sorry if I've misunderstood something.


-- 
Ian Jacobs (ij@w3.org)   http://www.w3.org/People/Jacobs
Tel:                     +1 718 260-9447
Received on Thursday, 4 April 2002 17:19:25 UTC