- From: Philip Taylor <pjt47@cam.ac.uk>
- Date: Mon, 21 Sep 2009 09:39:42 +0100
- To: Manu Sporny <msporny@digitalbazaar.com>
- CC: RDFa mailing list <public-rdf-in-xhtml-tf@w3.org>
Manu Sporny wrote:
> Two unrelated questions about XMLLiterals and stripping iquery/ifragment
> from <base>.
>
> The first question is whether or not we should include
> xmlns="http://www.w3.org/1999/xhtml" in XMLLiterals for HTML4 and HTML5.
> [...]
> Note that there is no default namespace specified for non-xml mode HTML5
> (AFAIK), so does it make sense to require the namespace in XMLLiterals?

HTML5 requires that pages parsed as text/html always have their elements
placed in the standard HTML namespace
(http://whatwg.org/html5#insert-an-html-element) (ignoring SVG/MathML for
now), regardless of any namespace declarations. HTML content like
"<html><head>..." is parsed to an identical set of elements (in terms of
Infoset-style namespace names and local names) as XML like
"<html xmlns='http://www.w3.org/1999/xhtml'><head>...".

> If we don't include it, do we violate the namespace well-formedness for
> an XMLLiteral? I think we do, but thought I should check to see if I'm
> missing something.

"<sup>...</sup>" is still perfectly legal namespace well-formed XML, it's
just different to the input (since the element is in no namespace, whereas
in the input it was in the HTML namespace), and I believe XMLLiteral output
really should be equivalent to the input (at a DOM/Infoset level).

(It ought to be a consequence of the XML serialisation algorithm that
namespace declarations are added to ensure the namespaces used by
element/attribute names are correctly declared, regardless of the
declarations in the input. You have to do that for XML too, because you're
serialising a fragment that might use namespaces that were declared outside
the fragment -- it's the same for HTML, except the names weren't explicitly
declared anywhere in the document.)

> The second question is what constitutes the base URL. If someone were to
> specify the following:
>
> <base href="http://example.org/foo.xhtml?bar=baz#fnurt></base>
>
> Would the base URL be: http://example.org/foo.xhtml?bar=baz
> or would it be: http://example.org/foo.xhtml

I would hope it's the same as HTML5's notion of document base URL, most
recently defined in
http://www.w3.org/TR/2009/WD-html5-20090423/infrastructure.html#document-base-url
(it's not in the latest draft since it's meant to be moved to another
document but seemingly hasn't been yet).

That seems to say it's the <base href> value resolved against the
document's address, which I think (but I haven't checked this carefully) in
this case will be the string "http://example.org/foo.xhtml?bar=baz#fnurt".
(It will handle encodings and normalise some invalid syntax and some other
bits, so it's not necessarily identical to the input string, but it retains
all of the components.)

It's possible that HTML5's notion of document base URL could change, e.g. I
think it could simply drop the fragment part without breaking anything, and
that might make it more readily reused by RDFa. If so, that feedback should
be sent to the HTML WG or to whoever's working on [WEBADDRESSES] if it's
going to be defined there. It seems best if HTML5 and RDFa can use a common
definition, so that RDFa doesn't have to worry about redefining details
like what happens if there's multiple <base href> elements in a document.

(For the base URL used at a specific element, I would also hope it's the
same as HTML5 uses when resolving URLs
(http://www.w3.org/TR/2009/WD-html5-20090423/infrastructure.html#resolve-a-url),
i.e. XML Base plus the document base URL as defined above.)

-- 
Philip Taylor
pjt47@cam.ac.uk
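
The namespace point in the message can be illustrated with a small sketch. It is not the RDFa or HTML5 serialisation algorithm; it uses Python's standard xml.etree.ElementTree as a stand-in, and the element is built by hand to mimic what a text/html parser would produce (an element in the HTML namespace with no declaration in the source). The names XHTML_NS and serialise_fragment are hypothetical, and the default_namespace argument assumes Python 3.8 or later.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"


def serialise_fragment(elem):
    """Serialise a fragment, declaring the namespace its names actually use.

    Hypothetical helper: the serialiser adds the xmlns declaration itself,
    whether or not the source document ever declared it.
    """
    return ET.tostring(elem, encoding="unicode", default_namespace=XHTML_NS)


# Stand-in for a text/html parser result: <sup> lives in the HTML namespace
# even though the markup contained no xmlns attribute.
sup = ET.Element("{%s}sup" % XHTML_NS)
sup.text = "2"

literal = serialise_fragment(sup)

# Re-parsing the literal gives back an element with the same expanded name,
# i.e. the output is equivalent to the input at the Infoset level.
round_tripped = ET.fromstring(literal)

# A bare "<sup>2</sup>" is also well-formed, but the element is in no
# namespace, so it is a different element from the one in the input.
bare = ET.fromstring("<sup>2</sup>")

print(literal)
print(round_tripped.tag == sup.tag, bare.tag == sup.tag)
```

The comparison at the end is the distinction the message draws: both strings are namespace well-formed, but only the one carrying the declaration round-trips to the same element.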
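
The base URL discussion can likewise be sketched, with the caveat that Python's urllib.parse follows generic URL resolution rules and is only an approximation of the HTML5 "resolve a URL" algorithm referenced in the message. The document address below is an assumption made for illustration; the <base href> value is the one from the question.

```python
from urllib.parse import urljoin, urldefrag

# Assumed document address (not given in the thread).
document_address = "http://example.org/foo.xhtml"
# The <base href> value from the question.
base_href = "http://example.org/foo.xhtml?bar=baz#fnurt"

# Document base URL: the <base href> value resolved against the document's
# address. All components, including the fragment, are retained.
document_base = urljoin(document_address, base_href)

# The same base with the fragment dropped, as suggested might be possible.
base_without_fragment, fragment = urldefrag(document_base)

# Resolving non-empty relative references gives the same result either way,
# because the fragment of the base does not take part in resolution.
references = ["other.html", "?q=1", "#section"]
with_fragment = [urljoin(document_base, ref) for ref in references]
without_fragment = [urljoin(base_without_fragment, ref) for ref in references]

print(document_base)
print(base_without_fragment)
print(with_fragment == without_fragment)
```

Under these assumptions the query part ("?bar=baz") does matter to resolution, since a fragment-only reference such as "#section" keeps it, while the base's own fragment ("#fnurt") never appears in a resolved URL.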
Received on Monday, 21 September 2009 08:40:19 UTC