Re: L3 LS: Configuration parameter to enable/disable application specific DOM from Curt Arnold on 2004-01-04 (www-dom@w3.org from January to March 2004)

From: Curt Arnold <carnold@houston.rr.com>
Date: Sat, 03 Jan 2004 21:05:55 -0600
To: www-dom@w3.org
Message-ID: <3FF78313.6090701@houston.rr.com>
>You want a configuration switch that tells the parser to behave as an
>"application-aware" parser vs. a generic parser. If set to
>"application-aware," and it encounters an invalid element or attribute,
>it would presumably fail in some way, or attempt some kind of recovery,
>etc. If set to "generic," it would succeed as long as the document is
>well-formed (you said "valid" in your message, but I think you meant
>"well-formed," no?), but the downside is that the final result is a
>generic DOM object tree, devoid of any language-specific semantics or
>functionality.
>  
>
Sorry, I did mean "well-formed".

>Is that a reasonable elaboration?
>
>In my own work, I've put together a system that instantiates elements
>and attributes on a purely namespace+scoping basis. "Smart" elements are
>instantiated based on their qualified names, and "smart" attributes are
>instantiated based on the combination of their qualified names plus the
>qualified names of the elements that own them (this is necessary for
>SVG, where the same name is used for attributes having different
>semantics when contained within different elements).
>
>If a qualified name is not recognized by the parser, it "falls through"
>and a generic DOM element or attribute object is instantiated. So, in a
>limited sense at least, the parse always succeeds, in that the parser
>never fails because of invalidity. However, it's possible that an
>attempt to insert the wrong kind of content or attribute into an element
>might cause an exception to be raised. For example, assuming my parser
>understands both SVG and XHTML:
>
> <!--namespace declarations omitted for brevity-->
> <svg>
>   <html/>
> </svg>
>
>Here, the <html> element will be instantiated successfully, but an
>exception will be raised upon the attempt to insert it as a child of the
><svg> element. Note that the exception is not raised by the parser, but
>by the "smart" <svg> element itself, which knows that it shouldn't have
>an <html> element as a child.
>  
>

Actually, an SVG implementation is required to ignore all 
foreign-namespace elements, so <svg><html/></svg> would be valid SVG, 
but <html><svg/></html> would not be valid XHTML.

>Anyway, back to your question: I'm not sure that I would consider this
>to be as much a parser configuration setting as an application
>configuration setting. At least part of my rationale for this is because
>of the way my parser works, as described above--it's left up to the
>"smart" elements themselves, rather than the parser, to decide what's
>right and what's not.
>
>  
>

When populating a DOM tree by loading a document using DOM L&S, the 
application has no hook to control the type of nodes created during the 
parse.

Consider this scenario: I have a script within an XHTML document that is 
loaded into an XHTML browser.  The DOM implementation of the currently 
rendered page may have semantically valid XHTML as a precondition 
(otherwise the page would not be rendered), and it exposes the HTML DOM.  
A script on that page attempts to load another XHTML document using L&S.  
The parser has no way of knowing which of two cases applies: either the 
document is expected to be semantically valid XHTML and I'd like to have 
access to the HTML DOM interfaces, or the application is an XHTML editor 
that tolerates semantically invalid, but well-formed, XHTML and does not 
expect the HTML DOM interfaces.  The fact that my host document meets the 
preconditions for, and exposes, the HTML DOM interfaces does not indicate 
that any document that I attempt to load will meet the same preconditions.
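
To make the scenario concrete, here is a rough Java sketch, assuming the 
org.w3c.dom.ls binding as it ships with a JDK.  The class name 
LoadScenarioSketch and the sample markup are made up for illustration.  
The document is well-formed but not valid XHTML, and nothing in the load 
call lets the caller say which kind of tree it expects; all it can do is 
inspect the result afterwards.

```java
import org.w3c.dom.Document;
import org.w3c.dom.bootstrap.DOMImplementationRegistry;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSInput;
import org.w3c.dom.ls.LSParser;

public class LoadScenarioSketch {
    // Well-formed, but not valid XHTML: <p> may not be a direct child of <html>.
    public static final String SAMPLE =
        "<html xmlns=\"http://www.w3.org/1999/xhtml\"><p>no head or body</p></html>";

    // Loads the markup through DOM L&S; the caller cannot choose the node types.
    public static Document load(String xml) throws Exception {
        DOMImplementationRegistry registry = DOMImplementationRegistry.newInstance();
        DOMImplementationLS ls =
            (DOMImplementationLS) registry.getDOMImplementation("LS");
        LSParser parser = ls.createLSParser(DOMImplementationLS.MODE_SYNCHRONOUS, null);
        LSInput input = ls.createLSInput();
        input.setStringData(xml);
        return parser.parse(input);
    }

    public static void main(String[] args) throws Exception {
        Document doc = load(SAMPLE);
        System.out.println("root element: " + doc.getDocumentElement().getLocalName());
        // After the fact, the application can only ask what it was given.
        System.out.println("HTML DOM advertised: "
            + doc.getImplementation().hasFeature("HTML", "2.0"));
    }
}
```

Whether hasFeature reports the HTML module depends entirely on the 
implementation that the registry hands back; the point is only that the 
question can be asked after the load, not before it.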

I assume the parameter's value would consist of the DOM feature names that 
you want the resulting DOM tree to expose.  Something along the lines of:

parser.configuration.setParameter("dom-features", "SVG 1.1");
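
Spelled out against the Java binding, where the configuration is reached 
through getDomConfig(), a caller might guard the request roughly as 
follows.  This is only a sketch: "dom-features" is the hypothetical 
parameter proposed here and is not defined by DOM L3, and the class and 
method names are invented for the example.

```java
import org.w3c.dom.DOMConfiguration;
import org.w3c.dom.bootstrap.DOMImplementationRegistry;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSParser;

public class DomFeaturesSketch {
    // Hypothetical parameter name from the proposal; DOM L3 does not define it.
    public static final String PARAM = "dom-features";

    public static LSParser newParser() throws Exception {
        DOMImplementationRegistry registry = DOMImplementationRegistry.newInstance();
        DOMImplementationLS ls =
            (DOMImplementationLS) registry.getDOMImplementation("LS");
        return ls.createLSParser(DOMImplementationLS.MODE_SYNCHRONOUS, null);
    }

    // Returns true if the parser accepted the request, false if it reports
    // that it cannot set the parameter to this value.
    public static boolean requestFeatures(LSParser parser, String features) {
        DOMConfiguration config = parser.getDomConfig();
        if (!config.canSetParameter(PARAM, features)) {
            return false;
        }
        config.setParameter(PARAM, features);
        return true;
    }

    public static void main(String[] args) throws Exception {
        boolean accepted = requestFeatures(newParser(), "SVG 1.1");
        System.out.println("dom-features accepted: " + accepted);
    }
}
```

I would expect an implementation that knows nothing about the parameter 
to refuse it, in which case the application can fall back to treating 
the result as a generic DOM tree.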
Received on Saturday, 3 January 2004 22:05:52 UTC