Re: Crimson 1.1.3 tests from David Brownell on 2001-11-16 (www-dom-ts@w3.org from November 2001)

From: David Brownell <david-b@pacbell.net>
Date: Thu, 15 Nov 2001 22:47:28 -0800
To: Curt Arnold <carnold@houston.rr.com>
Cc: Edwin Goei <edwingo@sun.com>, www-dom-ts@w3.org
Message-id: <043f01c16e6a$8fb24be0$6800000a@brownell.org>

The deal is that SAX clearly specifies that those IDs are supposed to
be resolved, so that's how Crimson does it.  (Xerces just as clearly
violates the SAX spec in this area.  See the "saxunit 0.2" tests now
available at http://xmlconf.sourceforge.net ...)  I suspect that if you had
included the GNUJAXP software you'd find it also resolves those
URIs.  (What test software were you running?  The "domunit 0.0.6"
test cases don't address such issues.)

If you want rationales, I'll share two beyond "it's allowed and
conforms with the specs".  First, the original expectation with XML
was that as "SGML on the web", all system IDs would be URLs.
And second, since DOM levels 1 and 2 don't expose information
about the base URLs of nodes, it'd seem that ever exposing relative
URLs through DOM would be buglike.  I know there were some
discussions early in the L2 timeframe about exposing those bases,
but that never happened.
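
To illustrate the second point, here is a minimal sketch (not Crimson's code)
of why a relative system ID is ambiguous without its base.  It uses only
java.net.URI; the class name, method name, and example.org URIs are
hypothetical, chosen just for this illustration.

```java
import java.net.URI;

class BaseUriDemo {
    // Resolve a possibly relative system ID against the base URI of the
    // entity in which it was declared (both values are assumptions here).
    static String resolveSystemId(String baseUri, String systemId) {
        return URI.create(baseUri).resolve(systemId).toString();
    }

    public static void main(String[] args) {
        String systemId = "notation.dtd";  // hypothetical relative system ID

        // The same relative ID names different resources depending on
        // which entity declared it.
        System.out.println(resolveSystemId("http://example.org/docs/book.xml", systemId));
        System.out.println(resolveSystemId("http://example.org/dtds/ext/common.ent", systemId));
    }
}
```

A DOM caller who sees only "notation.dtd" has no way to tell those two
results apart.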

Was there some rationale available for DOM implementations
that chose to expose relative URIs while hiding the base URIs
needed to interpret them?

FYI, there's now a SAX feature flag defined to control how those
IDs get handled:

    http://xml.org/sax/features/resolve-dtd-uris

See the listing of such flags at:

    http://www.saxproject.org/apidoc/org/xml/sax/package-summary.html

Some of those feature IDs support "SAX extensions 1.1", which I may put
out as a "beta" release soon.  (All the RFEs have been answered.)
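
For anyone who wants to try the flag, here is a rough sketch of setting it
through the standard XMLReader.setFeature() call.  The class and helper names
are hypothetical, and whether a given parser recognizes the flag is an
assumption you need to check, which is why the helper reports failure
instead of throwing.

```java
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXNotRecognizedException;
import org.xml.sax.SAXNotSupportedException;
import org.xml.sax.XMLReader;

class ResolveDtdUrisDemo {
    static final String FEATURE = "http://xml.org/sax/features/resolve-dtd-uris";

    // Try to set the flag; return false if this parser does not
    // recognize it or cannot change it.
    static boolean trySetResolveDtdUris(XMLReader reader, boolean value) {
        try {
            reader.setFeature(FEATURE, value);
            return true;
        } catch (SAXNotRecognizedException | SAXNotSupportedException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        XMLReader reader =
            SAXParserFactory.newInstance().newSAXParser().getXMLReader();

        // false asks the parser to report system IDs in DTD declarations
        // as written, rather than resolved to absolute URIs.
        boolean accepted = trySetResolveDtdUris(reader, false);
        System.out.println("resolve-dtd-uris=false accepted: " + accepted);
    }
}
```

A parser that rejects the flag keeps whatever behavior it had before the
call.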

- Dave

----- Original Message ----- 
From: "Curt Arnold" <carnold@houston.rr.com>
To: "David Brownell" <david-b@pacbell.net>
Cc: "Edwin Goei" <edwingo@sun.com>
Sent: Thursday, November 15, 2001 8:52 PM
Subject: Fw: Crimson 1.1.3 tests

> So Dave,
> 
> p.s. Thanks Edwin for the archive work, that was all I was asking
> 
> ----- Original Message -----
> From: "Edwin Goei" <edwingo@sun.com>
> To: "Curt Arnold" <carnold@houston.rr.com>
> Cc: <www-dom-ts@w3.org>
> Sent: Thursday, November 15, 2001 9:24 PM
> Subject: Re: Crimson 1.1.3 tests
> 
> 
> > Curt Arnold wrote:
> > >
> > > entitygetpublicid, notationgetsystemid
> > >
> > > These tests have been a subject of debate repeatedly on the www-dom-ts
> > > mailing list.  Crimson alone of the tested processors changes the
> > > relative
> > > URI's in the source document to absolute URI's when retrieved by
> > > Entity.getPublicId() and similar.  While not expressly prohibited by the
> > > DOM
> > > spec, there is nothing that would suggest to the user that it should be
> > > anticipated.
> > >
> > > Edwin, it would be interesting to get your take on whether the returning
> > > absolute URI's was intentional and any thought process behind it.
> >
> > I looked to see how crimson was resolving the URIs and it is in part of
> > the code that originated before I started working on the parser.  It's
> > in Parser2.java which was based from an older version of the code.  The
> > original author, Dave Brownell, might be able to help.
> >
> > > [snip]
> >
> > Not sure whether I was supposed to comment on the other parts of your
> > email.
> >
> > -Edwin
> >
>

Received on Friday, 16 November 2001 01:49:15 UTC