Re: Notes on DOM spec

I've also been trying to implement a DOM-compliant XML parser, and have had
to add a fair amount of my own to get the functionality that I want.  Here
are my thoughts...

At 22:06 +0000 3/3/98, Andrew n marshall wrote:
>The following are a list of notes I've been keeping as I have been working
>implementing a DOM compliant XML parser.  Some are questions, Some are issues
>with the Java interfaces.
>
>-----------------------------------------------
>Notes on Core DOM:
[snip questions I have nothing to say on]
> * What is the value of an Attribute such as "declare" in the following HTML
>OBJECT tag:
>     <P><OBJECT declare
>             id="earth.declaration"
>             data="TheEarth.mpeg"
>             type="application/mpeg">
>        The <STRONG>Earth</STRONG> as seen from space.
>     </OBJECT>

From http://www.w3.org/TR/REC-html40/intro/sgmltut.html#h-3.3.4.2:

----

Boolean attributes
------------------

Some attributes play the role of boolean variables (e.g., the selected
attribute for the OPTION element).  Their appearance in the start tag of an
element implies that the value of the attribute is "true".  Their absence
implies a value of "false".

Boolean attributes may legally take a single value: the name of the
attribute itself (e.g., selected="selected").

This example defines the selected attribute to be a boolean attribute.

selected	(selected)	#IMPLIED	-- reduced inter-item spacing --

The attribute is set to "true" by appearing in the element's start tag:

<OPTION selected="selected">
...contents...
</OPTION>

In HTML, boolean attributes may be [sic] appear in minimized form -- the
attribute's value appears alone in the element's start tag.  Thus, selected
may be set by writing:

<OPTION selected>

instead of:

<OPTION selected="selected">

Authors should be aware than [sic] many user agents only recognize the
minimized form of boolean attributes and not the full form.

----

From this, I would say that the attribute declare should take the value
"declare".  (Obviously if a boolean attribute is minimized, the HTML is not
well-formed XML.)
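
To make that concrete, here is a rough Java sketch of how a parser might
fill in the value of a minimized boolean attribute.  The class name, the
method name and the (incomplete) list of attribute names are my own
assumptions for illustration; none of them come from the DOM spec.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

/** Hypothetical helper; not part of the DOM spec or of any real parser API. */
final class BooleanAttributes {
    // A few HTML 4.0 boolean attributes; illustrative, not a complete list.
    private static final Set<String> BOOLEAN_NAMES = new HashSet<>(
            Arrays.asList("declare", "selected", "checked", "disabled", "compact"));

    private BooleanAttributes() {}

    /**
     * Returns the value to store for an attribute.
     * rawValue is null when the attribute appeared in minimized form.
     */
    static String effectiveValue(String name, String rawValue) {
        String lower = name.toLowerCase(Locale.ROOT);
        if (rawValue == null && BOOLEAN_NAMES.contains(lower)) {
            // Minimized boolean attribute: the value is the attribute's own name.
            return lower;
        }
        // Anything else keeps whatever value the document supplied (possibly null).
        return rawValue;
    }
}
```

With this, effectiveValue("declare", null) gives "declare", while an
ordinary attribute such as data="TheEarth.mpeg" keeps its own value.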

> * The description of Text.data says, "Text nodes contain just plain text,
>without markup and without entities".  Does this mean Text should not include
>translated character entities, such as '&' where "&amp;" was?  If this is
>true, how are these represented in the DOM?

I've made up interfaces for entity references and character references
(which I haven't got with me at the moment, but can post if desired) and
converted the IDL interfaces for Entities.  As Nodes, entity and character
references can fit in anywhere Text nodes can.
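
Until I can post the real thing, here is a minimal sketch of the general
idea.  Every type and method name below is hypothetical, and the sketch is
much simpler than either the DOM draft's interfaces or my own; it only
shows reference nodes sitting alongside Text nodes in a child list.

```java
import java.util.List;

/** Hypothetical node types for illustration; not the DOM draft's interfaces. */
final class ReferenceNodesSketch {
    private ReferenceNodesSketch() {}

    interface Node {
        /** Returns the node as it would be written in the source document. */
        String toSource();
    }

    static final class Text implements Node {
        final String data;
        Text(String data) { this.data = data; }
        // Assumes data holds plain text with no markup characters in it.
        public String toSource() { return data; }
    }

    static final class EntityReference implements Node {
        final String name;
        EntityReference(String name) { this.name = name; }
        public String toSource() { return "&" + name + ";"; }
    }

    static final class CharacterReference implements Node {
        final int codePoint;
        CharacterReference(int codePoint) { this.codePoint = codePoint; }
        public String toSource() { return "&#" + codePoint + ";"; }
    }

    /** Writes a mixed child list back out, keeping references unexpanded. */
    static String serialize(List<Node> children) {
        StringBuilder sb = new StringBuilder();
        for (Node child : children) {
            sb.append(child.toSource());
        }
        return sb.toString();
    }
}
```

The point of keeping the references as separate nodes is that the
original "&amp;" survives a round trip instead of collapsing into '&'.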

<aside>
The 'only' problem (and this is my big current problem) is how to deal with
representing internal parameter entity references within external DTDs.
For example:

<!ENTITY % standardatt "CDATA #IMPLIED">
...
<!ATTLIST foo
  bar %standardatt;
>

Internal parameter entities are particularly sticky, as their content might
mean different things under different circumstances, e.g.

<!ENTITY % tagname "MYTAG">
<!ELEMENT %tagname; ANY>
<!ATTLIST %tagname;
  %tagname; (%tagname;) #IMPLIED
>
<!ELEMENT foo (#PCDATA|%tagname;)*>
<!ENTITY bar "%tagname; is the top element!">

But my main problem is that (for an editor) I want to maintain the physical
representation as well as the logical representation.  Perhaps that's
outside the scope of the DOM, at least at this stage, but if anyone has any
suggestions, I'd be very keen to hear them.
</aside>
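
For what it's worth, the problem in the aside can be sketched in Java as
follows.  This is only a naive textual substitution with names I have made
up (ParameterEntityExpander, Declaration and so on); a real XML processor
does more, such as padding the replacement text with spaces inside
declarations.  It keeps the source text next to the expanded text, which is
the least an editor would need.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: naive expansion of parameter entity references. */
final class ParameterEntityExpander {
    // Matches %name; with a simplified, ASCII-only notion of an XML name.
    private static final Pattern PE_REF =
            Pattern.compile("%([A-Za-z_:][A-Za-z0-9._:-]*);");

    private final Map<String, String> entities = new LinkedHashMap<>();

    /** Pairs a declaration as written (physical) with its expansion (logical). */
    static final class Declaration {
        final String physical;
        final String logical;
        Declaration(String physical, String logical) {
            this.physical = physical;
            this.logical = logical;
        }
    }

    void declare(String name, String replacementText) {
        // In XML 1.0 the first declaration of an entity is the binding one.
        entities.putIfAbsent(name, replacementText);
    }

    String expand(String declarationText) {
        Matcher m = PE_REF.matcher(declarationText);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = entities.get(m.group(1));
            if (value == null) {
                throw new IllegalArgumentException(
                        "Undeclared parameter entity: " + m.group(1));
            }
            // quoteReplacement stops '$' and '\' in the value being interpreted.
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }

    /** Expands a declaration but keeps the original text alongside it. */
    Declaration read(String source) {
        return new Declaration(source, expand(source));
    }
}
```

After declare("tagname", "MYTAG"), the same reference expands inside an
element declaration, an attribute list or a content model, and nothing in
the expanded string records which role it played; that is why I want to
hang on to the physical form as well.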

[snip]
>-----------------------------------------------
>Notes on the Java Interface Definition:
>
[snippety snip]
> * Node.NodeType constants need to be declared static

Same goes for the constants in the following classes:
 OccurrenceType
 ElementDefinition.ContentType
 ModelGroup.ConnectionType
 AttributeDefinition.DeclaredValueType
 AttributeDefinition.DefaultValueType
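
As a sketch of what I mean (the class name, constant names and values are
invented for illustration and are not those of the draft), declaring the
constants static final lets callers refer to them through the class name
without needing an instance:

```java
/** Hypothetical constant holder; names and values are illustrative only. */
final class OccurrenceTypeSketch {
    public static final int OPTIONAL = 1;      // written as ?
    public static final int ONE_OR_MORE = 2;   // written as +
    public static final int ZERO_OR_MORE = 3;  // written as *

    private OccurrenceTypeSketch() {}

    /** Maps an occurrence constant to its content-model suffix. */
    static String suffix(int occurrence) {
        switch (occurrence) {
            case OPTIONAL:
                return "?";
            case ONE_OR_MORE:
                return "+";
            case ZERO_OR_MORE:
                return "*";
            default:
                throw new IllegalArgumentException(
                        "Unknown occurrence type: " + occurrence);
        }
    }
}
```

A caller can then write OccurrenceTypeSketch.suffix(OccurrenceTypeSketch.ONE_OR_MORE)
without ever constructing an object.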

Cheers,

Jeni


Jenifer Tennison
Department of Psychology, University of Nottingham
University Park, Nottingham NG7 2RD, UK
tel: +44 (0) 115 951 5151 x8352
fax: +44 (0) 115 951 5324
url: http://www.psychology.nottingham.ac.uk/staff/Jenifer.Tennison/

Received on Wednesday, 4 March 1998 07:04:08 UTC