- From: Alex Milowski <alex@milowski.org>
- Date: Tue, 8 May 2007 07:49:07 -0700
- To: public-xml-processing-model-wg@w3.org
- Message-ID: <28d56ece0705080749yf8f945bma8848f20ce2ea4b8@mail.gmail.com>
On 5/7/07, Norman Walsh <ndw@nwalsh.com> wrote:
>
> Not that I want to sound obsessed or anything, but given that the
> motivation for escaping markup in formats like RSS is that it can't
> be relied upon to be well-formed, what's the point of unescaping
> it in XProc? It'll immediately cause the pipeline to crash.
>
> Should we have a "force-markup-to-be-well-formed" option or something?

I've used unescaping of RSS descriptions to process random RSS feeds
into XHTML representations. For example, if you want to run XSLT on
an RSS feed you need to pre-process it with an unescape-markup step.

Now, to ensure that pipeline doesn't fail, you wrap the unescape-markup
step with a try/catch and then have some fallback for those you can't
process.

There are other protocols out there where markup is escaped for other
reasons and the input is expected, by the protocol, to be well-formed.
If not, that's a bad message. In theory, the same is true for RSS.

So, for example, you could write an XProc pipeline that checks whether
all the description elements are correctly escaped XHTML by using
unescape-markup and try/catch.

--
--Alex Milowski
"The excellence of grammar as a guide is proportional to the paucity of the
inflexions, i.e. to the degree of analysis effected by the language
considered."

Bertrand Russell in a footnote of Principles of Mathematics
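
The unescape-then-catch pattern described in the message can be sketched
outside XProc. What follows is a rough illustration in Python, not an XProc
pipeline and not anything from the working group drafts. The names
unescape_markup and process_feed are hypothetical, and wrapping each
description in an XHTML div before parsing is an assumption made so that
mixed text and multiple top-level elements can be parsed.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"


def unescape_markup(escaped_text):
    """Parse a string of markup as XML (hypothetical helper).

    The text is wrapped in an XHTML div (an assumption of this sketch)
    so that several top-level nodes are allowed. Raises ET.ParseError
    if the content is not well-formed.
    """
    return ET.fromstring(f'<div xmlns="{XHTML_NS}">{escaped_text}</div>')


def process_feed(rss_source):
    """Try to unescape every description; fall back to raw text on failure."""
    feed = ET.fromstring(rss_source)
    results = []
    for description in feed.iter("description"):
        # Parsing the feed has already turned &lt;p&gt; back into <p>,
        # so .text holds the markup as a plain string.
        text = description.text or ""
        try:
            results.append(("ok", unescape_markup(text)))
        except ET.ParseError:
            # The "catch" branch: keep the original text as the fallback.
            results.append(("fallback", text))
    return results
```

The same loop doubles as the check mentioned in the message: a feed whose
descriptions are all correctly escaped, well-formed markup yields no
"fallback" entries.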
Received on Tuesday, 8 May 2007 14:49:33 UTC