- From: Norman Walsh <Norman.Walsh@Sun.COM>
- Date: Thu, 01 Feb 2007 10:07:18 -0800
- To: public-xml-processing-model-wg@w3.org
- Message-ID: <87fy9pu4yx.fsf@nwalsh.com>
We've had a couple of proposals in the component thread that amount to
allowing non-XML documents to flow through the pipeline in some fashion.
That looks like a slippery slope to me. With sharp spikes at the bottom.
But if we're going to entertain it, I think we should consider it
generally and not in isolation around one or two components.
First off, can we agree that we're talking about things like text/html
or text/plain or image/jpeg that are manifestly not XML? If a
component is supposed to generate XML but sends mismatched start and
end tag events, the processor is required to consider that an error.
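
To make the mismatched-events case concrete, here is a minimal sketch in
Python of the kind of check a processor might apply to a component's
output. The event representation (pairs of "start"/"end" and an element
name) and the function name check_balanced are assumptions made for
illustration; nothing here is taken from a draft.

```python
def check_balanced(events):
    """Return True if start/end tag events nest properly, else raise ValueError.

    `events` is a hypothetical stream of (kind, name) pairs, where kind is
    "start" or "end". Other event kinds (text, comments) are ignored.
    """
    stack = []  # names of the elements that are currently open
    for kind, name in events:
        if kind == "start":
            stack.append(name)
        elif kind == "end":
            # An end tag must close the most recently opened element.
            if not stack or stack.pop() != name:
                raise ValueError(f"mismatched end tag: {name}")
    if stack:
        raise ValueError(f"unclosed element: {stack[-1]}")
    return True
```

A stream such as start a, start b, end a fails at the third event, which
is the error case described above.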
The simplest answer to the question, "how do I process text/html with
XProc?" is: you don't. Implementors can provide extension components
that do anything they want, but the standard components like load
simply produce errors.
Another answer, I think, is that components can produce some sort of
quoting element (I forget what name Alex proposed) like
<p:quoted-content type="text/html">
...
</p:quoted-content>
If we adopted this, I think I'd want some sort of user option to
enable it.
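
As a rough sketch of how the quoting answer and the user option might fit
together, the Python below wraps non-XML text in a quoted-content element
only when the option is switched on, and otherwise raises an error as in
the simplest answer. The function name load_document, the allow_quoted
option, the namespace URI and the media-type test are all assumptions for
illustration, and the input is assumed to be already-decoded text.

```python
from xml.sax.saxutils import escape, quoteattr

# Placeholder namespace for the p: prefix; an assumption, not a real binding.
P_NS = "http://example.org/ns/xproc-sketch"


def is_xml_type(media_type):
    """Crude test for XML media types (assumed rule: two names or a +xml suffix)."""
    return media_type in ("text/xml", "application/xml") or media_type.endswith("+xml")


def load_document(text, media_type, allow_quoted=False):
    """Return XML text, or a quoted-content wrapper if the user enabled it."""
    if is_xml_type(media_type):
        return text
    if not allow_quoted:
        # The "simplest answer": non-XML input is just an error.
        raise ValueError(f"cannot load non-XML content of type {media_type}")
    # Escape markup characters so the payload is plain character data.
    return (
        f"<p:quoted-content xmlns:p={quoteattr(P_NS)} type={quoteattr(media_type)}>"
        f"{escape(text)}"
        "</p:quoted-content>"
    )
```

With the option left at its default, loading text/html fails; with
allow_quoted=True the same input comes back as a single well-formed
element whose content is the escaped original.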
The last answer I can think of is that we could try to tidy/tagsoup the
input into XML.
I suppose, if we can't agree on the simplest answer, I'm inclined to
say we do the quoted content thing and have a standard component that
takes a quoted content thing and attempts (through an
implementation-defined mechanism) to turn it into well-formed XML.
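
For illustration only, here is one very crude mechanism such a component
could use, sketched in Python with the standard library's HTMLParser. It
closes unclosed elements, drops stray end tags, and then checks the result
with a real XML parser, so it either returns well-formed XML or raises an
error. The name unquote, the list of empty elements, and the requirement
that the p: prefix be declared on the input are assumptions; a real
implementation would presumably use something far more capable.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser
from xml.sax.saxutils import escape, quoteattr

# Elements treated as always empty (assumed, deliberately incomplete list).
VOID = {"br", "hr", "img", "input", "link", "meta"}


class _Balancer(HTMLParser):
    """Re-serialize tag soup with every opened element closed."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []    # serialized output fragments
        self.open = []   # names of currently open elements

    def handle_starttag(self, tag, attrs):
        attr_text = "".join(f" {k}={quoteattr(v or '')}" for k, v in attrs)
        if tag in VOID:
            self.out.append(f"<{tag}{attr_text}/>")
        else:
            self.out.append(f"<{tag}{attr_text}>")
            self.open.append(tag)

    def handle_endtag(self, tag):
        if tag not in self.open:
            return  # stray end tag: drop it
        while self.open:  # close everything up to and including `tag`
            top = self.open.pop()
            self.out.append(f"</{top}>")
            if top == tag:
                break

    def handle_data(self, data):
        self.out.append(escape(data))


def unquote(quoted_xml):
    """Attempt to turn the text inside a quoted-content element into XML."""
    text = ET.fromstring(quoted_xml).text or ""
    parser = _Balancer()
    parser.feed(text)
    parser.close()
    while parser.open:  # close anything still open at end of input
        parser.out.append(f"</{parser.open.pop()}>")
    result = "".join(parser.out)
    try:
        ET.fromstring(result)  # final check: is it really well-formed?
    except ET.ParseError as err:
        raise ValueError(f"could not produce well-formed XML: {err}") from err
    return result
```

Plain text with no markup at all fails the final check, since it has no
document element, which seems like the right outcome for text/plain.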
Simpler is better though, I think.
Be seeing you,
norm
--
Norman Walsh
XML Standards Architect
Sun Microsystems, Inc.
Received on Thursday, 1 February 2007 18:07:30 UTC