
[Bug 7703] HTML document conformance should explicitly depend on foreign content conformance

From: <bugzilla@wiggum.w3.org>
Date: Wed, 23 Sep 2009 01:46:28 +0000
To: public-html-bugzilla@w3.org
Message-Id: <E1MqGw8-0006eF-E1@wiggum.w3.org>

Ian 'Hixie' Hickson <ian@hixie.ch> changed:

           What    |Removed                     |Added
             Status|REOPENED                    |RESOLVED
         Resolution|                            |NEEDSINFO

--- Comment #5 from Ian 'Hixie' Hickson <ian@hixie.ch>  2009-09-23 01:46:27 ---
> I have the same issue with the conformance of the above not being well defined;
> this bug just happened to be using text/html syntax.

The conformance of both is well-defined.

> So HTML5 doesn't define a "valid HTML5 document" conformance class, but just
> provides a number of conformance assertions that should apply in any document? 

HTML5 defines two things of relevance: the conformance requirements for the
syntax of documents transmitted as text/html, and the conformance requirements
for the content of elements in the HTML namespace, regardless of where they
are found (text/html, XML, DOM, something else).

> I didn't really think of it like that.  Can you point me to where in the spec
> it says this?

I'm not exactly sure what you want a pointer to. Could you elaborate?

> What you probably can guess I think is missing is an explicit hook to say that
> a text/html document represented by the string '<!DOCTYPE
> html><title></title><svg><yowsers/></svg>' is non-conforming.

The "Writing HTML documents" section:


...describes the syntax of text/html, which gives the interpretation of
what the above string represents as a (DOM) tree. The HTML and SVG specs
together define whether that tree is conforming.

(In reply to comment #3)
> If we take some arbitrary SVG XML document in the wild that includes a properly
> namespaced element and paste it into an HTML5 text/html document - I want to
> avoid needless conformance errors when running the HTML5 document through a
> validator.

There won't be any needless ones. There'll be several complaining that the
document you pasted doesn't mean what you think it means, though. These aren't
needless; they're actually quite important. We don't want, e.g., someone
writing a script that works with their SVG in text/html but suddenly fails when
they reserialize it to XML. It's bad enough that we're allowing
document.write() and so forth.

> Ideally, I want that <sodipodi:namedView> element to be ignored
> since it wasn't an error in the XML context.  The browser will ignore it.  I
> want the validator to also ignore it.  Is this possible?

Not without HTML parsing namespace declarations, and I really don't think we
want to go there.

> For SVG-in-HTML, things are a different story.  The SVG specification does not
> cover conformance of SVG-in-HTML markup really.

SVG-in-HTML is the same as SVG-in-dynamically-generated-DOM-trees, as far as
conformance goes. So unless SVG also fails to define conformance for those,
this is all well-defined. (If it doesn't define conformance for those, then
there are bigger problems than text/html.)

> What document says that an attribute with the qualified name "xlink:href"
> should be considered in the Xlink namespace? What about xlink:title and
> others?

In text/html, HTML5 hardcodes those strings to the right namespaces.
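
As a rough illustration (not spec text), the hardcoding can be pictured as a
fixed lookup table consulted for attributes on foreign elements. The table
below is an illustrative subset, and the names FOREIGN_ATTRIBUTES and
adjust_foreign_attribute are hypothetical; the authoritative list is the one
in the HTML5 parsing algorithm.

```python
# Sketch only: an assumed, partial table of attribute names that a text/html
# parser maps to a fixed namespace inside SVG/MathML content.
XLINK_NS = "http://www.w3.org/1999/xlink"
XML_NS = "http://www.w3.org/XML/1998/namespace"
XMLNS_NS = "http://www.w3.org/2000/xmlns/"

FOREIGN_ATTRIBUTES = {
    "xlink:href": ("xlink", "href", XLINK_NS),
    "xlink:title": ("xlink", "title", XLINK_NS),
    "xlink:type": ("xlink", "type", XLINK_NS),
    "xml:lang": ("xml", "lang", XML_NS),
    "xml:space": ("xml", "space", XML_NS),
    "xmlns:xlink": ("xmlns", "xlink", XMLNS_NS),
}


def adjust_foreign_attribute(name):
    """Return (prefix, local_name, namespace) for an attribute name.

    Names not in the table keep their full string as the local name and
    stay in no namespace; no prefix declarations are ever consulted.
    """
    return FOREIGN_ATTRIBUTES.get(name, (None, name, None))
```

Because the lookup is by exact string, a name such as "foo:href" is left as a
plain attribute whose local name contains a colon, whatever xmlns:foo
declaration the author wrote.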

> Where should it be described that <Svg><Rect WIDTH=50 heighT=100 fill=red> is
> valid SVG-in-HTML?  Is that still in the HTML5 spec?

It's valid because the syntax results in a DOM tree that is isomorphic with a
DOM tree that would be considered a valid SVG fragment.
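
The case-folding part of this can be sketched with Python's standard
html.parser module, which lowercases tag and attribute names much as the
text/html tokenizer does. This is only an approximation: html.parser is not an
HTML5 parser, builds no DOM, and does not restore mixed-case SVG names such as
viewBox. The names StartTagCollector and start_tags are hypothetical.

```python
from html.parser import HTMLParser


class StartTagCollector(HTMLParser):
    """Record each start tag as (name, [(attr_name, attr_value), ...])."""

    def __init__(self):
        super().__init__()
        self.start_tags = []

    def handle_starttag(self, tag, attrs):
        # html.parser has already lowercased the tag and attribute names.
        self.start_tags.append((tag, attrs))


def start_tags(markup):
    collector = StartTagCollector()
    collector.feed(markup)
    collector.close()
    return collector.start_tags


tags = start_tags("<Svg><Rect WIDTH=50 heighT=100 fill=red>")
print(tags)
```

The collected names come out as svg, rect, width, height and fill, i.e. the
same names an XML serialization of a valid SVG fragment would use.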

> Would it be crazy to say that conformance criteria of SVG-in-HTML should try to
> reconstitute unrecognized elements into their namespaces?

If you mean that HTML should support namespace declarations, then I don't think
it's crazy, but I don't think it's a good idea, either. It's not clear how it
would work in a backwards-compatible manner. It would also mean dragging the
whole rebindable-prefix mess into text/html, which I really think would do the
Web authoring community a big disservice.

> Or would it be crazy to say that unrecognized elements in SVG-in-HTML should
> just be ignored and considered neither conforming or nonconforming?

That's an SVG question, as far as I can tell. I don't think we want different
conformance criteria for DOMs that came from text/html vs DOMs that came from
XML or DOMs that were generated from script or from some other source.

(In reply to comment #4)
> One problem is that, after the text/html document is parsed, start tags with
> colons in their names result in elements with the string "U00003A" being
> created.

Not necessarily; the U00003A stuff is an optional conversion step for
architectures that don't support node names that aren't well-formed XML. I
wouldn't expect a Web browser to do that conversion step, for instance.

> Once parsed, if the validator is just looking at the DOM, it is
> impossible to determine whether the string "<dc:title>" was in the document or
> whether it was "<dcU00003Atitle>".

Assuming no script, and assuming the input is text/html, then it is impossible
to generate an element with the name "<dcU00003Atitle>" in the absence of this
conversion step.
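
For concreteness, here is a minimal sketch of such a conversion step, assuming
each character that is unsafe in an XML name is replaced by "U" followed by
its code point as six uppercase hex digits. The set of safe characters is
deliberately simplified to a few ASCII ranges, and the name coerce_name is
hypothetical.

```python
import re

# Simplifying assumption: only ASCII letters, digits, "_", "." and "-" are
# treated as safe. Real XML name rules allow many more characters and also
# restrict which characters may start a name.
_UNSAFE = re.compile(r"[^A-Za-z0-9_.\-]")


def coerce_name(name):
    """Replace each unsafe character with 'U' + six uppercase hex digits."""
    return _UNSAFE.sub(lambda m: "U%06X" % ord(m.group()), name)


print(coerce_name("dc:title"))        # dcU00003Atitle
print(coerce_name("dcU00003Atitle"))  # dcU00003Atitle
```

Both inputs give the same output, which is the ambiguity comment #4 describes,
but it only arises in tools that choose to apply this optional step.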

> "Conforming SVG DOM Subtree That Came From Parsing text/html"

I strongly recommend against making conformance classes that mean that
reserializing a text/html document to XML would cause the document to stop
being conforming. (There are a few cases of this in text/html, but they're all
legacy whose damage I'm trying to minimise.) I also recommend against
making it conforming to have elements in the DOM that aren't what the author
thinks they are, because that will do a disservice to the author. The point of
conformance is to catch that kind of mistake.

Received on Wednesday, 23 September 2009 01:46:37 UTC
