- From: Daniel O'Connor <daniel.oconnor@gmail.com>
- Date: Sat, 15 Mar 2008 21:59:30 +1030
- To: "Rob Richards" <rrichards@php.net>, xml@gnome.org, public-grddl-comments@w3.org
Hey all,

I'm causing trouble for Richard by asking for things like:
http://bugs.php.net/bug.php?id=44367

Basically, what happens in the following scenarios with the baseURI of a document?

1. A document is loaded from a URI (http://foo.com/)
2. An xhtml document is loaded from a URI (http://foo.com/), but has a <base href="http://bar.com/" />
3. An xml document is loaded from a URI, but has a <Foo xml:base="http://bar.com/" />
4. An xml document is loaded from a URI, which was redirected (GET http://foo.com/ redirected to http://bar.com/)
5. An xml document is loaded, and has an xml:base attribute - but it's not on the root element (/Foo/bar[@xml:base])

From what I read of http://www.faqs.org/rfcs/rfc2396.html, section 5.1 & on, I think it should be:

1. http://foo.com/
2. http://foo.com/
2a. Unless the implementation understands xhtml / html - http://bar.com/
3. http://bar.com/
4. http://bar.com/
5. http://foo.com/

The current behavior for PHP (using libxml2 2.6.31) isn't that.

Additionally, there are a number of GRDDL (a W3C TR) tests which explicitly expose these kinds of behaviour - and the expected test results marry up to the behaviour outlined above.

See also: http://www.w3.org/TR/grddl-tests/#htmlbase1

On Thu, Mar 13, 2008 at 11:32 PM, Rob Richards <rrichards@php.net> wrote:
> Hi Daniel,
>
> I'm taking this off the PHP bug system as if it were to be a bug (still
> say it's not) it would end up being a libxml2 bug and need to be taken up
> there.
>
> Anyways, back to the issue at hand.
>
> When you went through the points in xml:base, you forgot the piece:
> "The base URI of a document entity or an external entity is determined
> by RFC 2396 rules, namely, that the base URI is the URI used to
> retrieve the document entity or external entity."
>
> GRDDL might be attempting to address xml:base issues, however, any
> resolutions it comes up with pertain to GRDDL and any specs referencing
> GRDDL. DOM is generic and based on its specs, follows the XML Infoset,
> XML Base and RFC 2396 specs to determine base uri. Using document
> content to determine a base uri is dependent upon the media type. For
> instance, text/html media type (HTML) can use a BASE element to
> determine base uri. DOM strictly works with either XHTML or XML
> (excluding HTML here to be able to talk about this generally). The XML
> specs themselves do not specify any such special way to determine base
> uri, hence it resorts to the quote I mentioned above.
>
> Now, if you are working with GRDDL, the base uri of a document may
> indeed be dependent upon the document element. So, when writing
> applications to work with GRDDL, you must use the base uri property of
> the document element. This still does not change the fact that you are
> still using DOM, so the base uri of the DOMDocument is dependent
> upon the specs DOM uses to determine base uri.
>
> After all this, I still could be completely wrong, though I do not believe
> so. If you would like to continue this discussion, I would suggest
> you CC the libxml2 dev list (xml@gnome.org) as that would provide much
> more input on the subject. You might also want to check with the Xerces
> devs as well (another parser I tend to use for comparison) as they are
> probably much more responsive to questions than Microsoft :)
>
> Rob
>

-- 
Looking for a new php job? See what you can do with https://vx.valex.com.au/tests/season/
Received on Saturday, 15 March 2008 11:30:05 UTC
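
For readers following scenarios 3 and 5, here is a minimal Python sketch of the distinction the thread turns on: the base URI of the document entity (the URI used to retrieve it) versus the base URI in scope for a particular element once xml:base attributes are applied. The helper name element_base_uri and the sample documents are hypothetical and exist only for illustration; this is not how libxml2 or PHP's DOM computes baseURI, and it takes no side on which value the document itself should report.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

# ElementTree exposes xml:base under the reserved XML namespace.
XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"


def element_base_uri(root, target, retrieval_uri):
    """Base URI in scope for `target` (hypothetical helper, not a libxml2 API).

    Starts from the URI used to retrieve the document and applies every
    xml:base attribute found on the path from `root` down to `target`.
    """

    def walk(node, base):
        # A missing xml:base leaves the inherited base unchanged;
        # a relative one is resolved against the inherited base.
        base = urljoin(base, node.get(XML_BASE, ""))
        if node is target:
            return base
        for child in node:
            found = walk(child, base)
            if found is not None:
                return found
        return None

    return walk(root, retrieval_uri)


# Scenario 3: xml:base on the root element.
doc3 = ET.fromstring('<Foo xml:base="http://bar.com/"><bar/></Foo>')
# Scenario 5: xml:base on a non-root element.
doc5 = ET.fromstring('<Foo><bar xml:base="http://bar.com/"><baz/></bar></Foo>')

print(element_base_uri(doc3, doc3, "http://foo.com/"))
print(element_base_uri(doc5, doc5, "http://foo.com/"))
print(element_base_uri(doc5, doc5.find("bar/baz"), "http://foo.com/"))
```

In this sketch the retrieval URI is passed in by the caller, so a redirect (scenario 4) would be modelled simply by passing the final URI; the HTML <base> element (scenario 2) is deliberately not handled, since it depends on the media type.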