Re: New TAG issue: TagSoupIntegration-54 from Bjoern Hoehrmann on 2006-11-02 (www-tag@w3.org from November 2006)

From: Bjoern Hoehrmann <derhoermi@gmx.net>
Date: Thu, 02 Nov 2006 08:49:39 +0100
To: Elliotte Harold <elharo@metalab.unc.edu>
Cc: www-tag@w3.org
Message-ID: <t43jk29f83hoouevgm8j2frouurdru43gg@hive.bjoern.hoehrmann.de>

* Elliotte Harold wrote:
>The W3C define a process by which *any* arbitrary byte stream can be 
>converted into valid XHTML, no matter what. This would act as a filter 
>on incoming data. Browsers in normal operation would be expected to 
>apply this filter before rendering a document, constructing a DOM, or 
>doing pretty much anything else with a page that purports to be HTML. 

What you are asking is infeasible. As an example, the byte stream might
include markup like <img src="example" />. For the result to be "valid",
the element needs an alt="..." attribute with alternate text for the
image. No known algorithm that can be implemented on constrained devices
generates acceptable alternate text. Your proposed "proof" does not even
attempt to implement such an algorithm, so no proof exists, and
regardless, any such proof would also prove that such a process is
incompatible with a broad range of web sites.
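
To make the alt="..." point concrete, here is a minimal sketch in Python,
using only the standard library's html.parser. The names AltFiller and
fill_alt are made up for this illustration, and the filter is far from a
validating XHTML generator (it drops comments and doctypes, and does not
balance tags). It only shows that a mechanical filter can add the
missing attribute, while the only value it can supply is an empty
placeholder, not alternate text that describes the image.

```python
from html import escape
from html.parser import HTMLParser


class AltFiller(HTMLParser):
    """Re-emit tags and text, adding alt="" to any <img> that lacks alt."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []      # serialized output fragments
        self.filled = 0    # number of <img> elements that needed a placeholder

    def _emit(self, tag, attrs, closing):
        attrs = list(attrs)
        if tag == "img" and not any(name == "alt" for name, _ in attrs):
            # Nothing in the byte stream says what the image shows,
            # so an empty string is the only value available here.
            attrs.append(("alt", ""))
            self.filled += 1
        rendered = "".join(
            ' %s="%s"' % (name, escape(value or "", quote=True))
            for name, value in attrs
        )
        self.out.append("<%s%s%s>" % (tag, rendered, closing))

    def handle_starttag(self, tag, attrs):
        self._emit(tag, attrs, "")

    def handle_startendtag(self, tag, attrs):
        self._emit(tag, attrs, " /")

    def handle_endtag(self, tag):
        self.out.append("</%s>" % tag)

    def handle_data(self, data):
        self.out.append(escape(data, quote=False))


def fill_alt(markup):
    """Return (rewritten markup, count of placeholder alt attributes added)."""
    parser = AltFiller()
    parser.feed(markup)
    parser.close()
    return "".join(parser.out), parser.filled
```

Calling fill_alt('<img src="example" />') returns the element with
alt="" appended and a count of 1. The attribute is then present, but the
alternate text a reader would need is still missing.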

You would have to reduce your request to, say, transforming arbitrary
character strings into a DOM Document that could also be created using
only DOM APIs, as defined in DOM Level 3 Core, and even that would still
be very far away from, say, what the draft "HTML5" parsing algorithm
delivers. Contrary to what you request, some people do not even want
this process to be fully deterministic.
-- 
Björn Höhrmann · mailto:bjoern@hoehrmann.de · http://bjoern.hoehrmann.de
Weinh. Str. 22 · Telefon: +49(0)621/4309674 · http://www.bjoernsworld.de
68309 Mannheim · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/

Received on Thursday, 2 November 2006 07:49:52 UTC