Re: DOM Base URI (xml:base, RFC 2396)

Hey all,
I'm causing trouble for Richard by asking for things like:
http://bugs.php.net/bug.php?id=44367

Basically, what happens in the following scenarios with the baseURI of
a document?

 1. A document is loaded from a URI (http://foo.com/)
 2. An xhtml document is loaded from a URI  (http://foo.com/), but has
a <base href="http://bar.com/" />
 3. An xml document is loaded from a URI, but has a <Foo
xml:base="http://bar.com/" />
 4. An xml document is loaded from a URI, which was redirected (GET
http://foo.com/ redirected to http://bar.com/)
 5. An xml document is loaded, and has an xml:base attribute - but
it's not on the root element (/Foo/bar[@xml:base])


From what I read of http://www.faqs.org/rfcs/rfc2396.html, section 5.1
& on, I think it should be:

1. http://foo.com/
2. http://foo.com/
2a. Unless the implementation understands xhtml / html - http://bar.com/
3. http://bar.com/
4. http://bar.com/
5. http://foo.com/
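
A minimal Python sketch of that reading, for scenarios 1, 3, 4 and 5 only
(the XHTML <base> case in 2/2a is left out). `document_base` and its
parameters are hypothetical names made up for illustration; this is not the
libxml2 or PHP DOM API, just the precedence rules as I understand them:

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

# ElementTree expands the reserved xml: prefix to this namespace.
XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"

def document_base(retrieval_uri, xml_text, final_uri=None):
    """Document base URI under the reading above (hypothetical helper)."""
    # Scenario 4: after a redirect, the URI finally used for retrieval wins.
    base = final_uri or retrieval_uri
    root = ET.fromstring(xml_text)
    # Scenario 3: xml:base on the root element overrides the retrieval URI.
    # Scenario 5: xml:base deeper in the tree is ignored for the document.
    override = root.get(XML_BASE)
    return urljoin(base, override) if override else base
```

Whether scenario 3 should really change the base URI of the document itself,
rather than only that of the root element, is the point under dispute.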


The current behavior of PHP (using libxml2 2.6.31) does not match this.


Additionally, there are a number of GRDDL (a W3C TR) tests which
explicitly expose these kinds of behaviour - and the expected test
results marry up to the behaviour outlined above.

See also: http://www.w3.org/TR/grddl-tests/#htmlbase1


On Thu, Mar 13, 2008 at 11:32 PM, Rob Richards <rrichards@php.net> wrote:
> Hi Daniel,
>
>  I'm taking this off the PHP bug system as if it were to be a bug (still
>  say it's not) it would end up being a libxml2 bug and need to be taken up
>  there.
>
>  Anyways,  back to the issue at hand.
>
>  When you went through the points in xml:base, you forgot the piece:
>  "The base URI of a document entity or an external entity is determined
>  by RFC 2396 rules, namely, that the base URI is the URI used to
>  retrieve the document entity or external entity."
>
>  GRDDL might be attempting to address xml:base issues, however, any
>  resolutions it comes up with pertain to GRDDL and any specs referencing
>  GRDDL. DOM is generic and, based on its specs, follows the XML Infoset,
>  XML Base and RFC 2396 specs to determine base uri. Using document
>  content to determine a base uri is dependent upon the media type. For
>  instance, text/html media type (HTML) can use a BASE element to
>  determine base uri. DOM strictly works with either XHTML or XML
>  (excluding HTML here to be able to talk about this generally). The XML
>  specs themselves do not specify any such special way to determine base
>  uri, hence it resorts to the quote I mentioned above.
>
>  Now, if you are working with GRDDL, the base uri of a document may
>  indeed be dependent upon the document element. So, when writing
>  applications to work with GRDDL, you must use the base uri property of
>  the document element. This still does not change the fact that you are
>  still using DOM, so the base uri of the DOMDocument is dependent
>  upon the specs DOM uses to determine base uri.
>
>  After all this, I still could be completely wrong, though I do not believe
>  so. If you would like to continue this discussion, I would suggest
>  you CC the libxml2 dev list (xml@gnome.org) as that would provide much
>  more input on the subject. You might also want to check with the Xerces
>  devs as well (another parser I tend to use for comparison) as they are
>  probably much more responsive to questions than Microsoft :)
>
>  Rob
>
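
For comparison, a rough Python sketch of the distinction drawn in the quoted
reply: the document keeps the retrieval URI as its base, while each element
gets its own base URI by resolving xml:base attributes from the root down.
`element_bases` is a hypothetical helper, not a libxml2 or PHP DOM function:

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

# ElementTree expands the reserved xml: prefix to this namespace.
XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"

def element_bases(doc_uri, root):
    """Map each element to its base URI, resolving xml:base top-down."""
    bases = {}

    def walk(elem, inherited):
        # An absent xml:base inherits the parent's base unchanged;
        # a relative one is resolved against it.
        base = urljoin(inherited, elem.get(XML_BASE, ""))
        bases[elem] = base
        for child in elem:
            walk(child, base)

    walk(root, doc_uri)  # doc_uri itself stays the document's base URI
    return bases
```

On this reading a GRDDL application would look up the base URI of the
document element in the returned mapping rather than rely on the document's
own base URI.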




Received on Saturday, 15 March 2008 11:30:05 UTC