Re: unescaping markup

/ Alex Milowski <alex@milowski.org> was heard to say:
[...]
| Ah... I missed that last bit.
|
| Maybe we should have a "content-type" option that would allow you to
| specify something like "text/html".  What happens for HTML would have
| to be implementation defined because there is no definition of what
| "make it well-formed" means.
|
| I think if you are parsing an XML media type, it should be at least
| well-formed in accordance with the XML 1.0/1.1 specifications.  If you
| specify a non-XML media type, then anything appropriate for that
| media type can happen.  This gives implementors the option of
| using unregistered media types like: "application/x-random-junk" or
| "application/vnd-tidy-html".

With respect, I think you're still missing the point. Perhaps our
experiences are different, but the escaped markup that I've encountered
in the wild is, when unescaped, not well formed about 99 times out of
100.

So to my mind that means the unescape markup step is going to fail
99 times out of 100, which doesn't seem very useful.

So it seems like it should have a "fix the broken $@$#%@! markup"
option, even if the exact details of how it does the fixup are
implementation dependent.
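
As a rough illustration of what such an option might look like, here is
a minimal Python sketch. Nothing in it comes from the draft: the
function name unescape_markup, the fixup flag, the <wrapper> root
element and the particular repair strategy (close whatever is left
open, self-close a few HTML void elements, drop stray end tags) are all
assumptions, and a real implementation would be free to do something
quite different.

```python
import html
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# HTML elements that never have content; emitted as self-closing tags.
VOID = {"br", "hr", "img", "input", "meta", "link"}


class _Fixer(HTMLParser):
    """Rebuilds tag soup as well-formed markup (one possible fixup)."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []     # serialized output pieces
        self.stack = []   # currently open element names

    def handle_starttag(self, tag, attrs):
        # dict() drops duplicate attribute names, which XML forbids.
        attr_text = "".join(
            ' %s="%s"' % (name, html.escape(name if value is None else value))
            for name, value in dict(attrs).items()
        )
        if tag in VOID:
            self.out.append("<%s%s/>" % (tag, attr_text))
        else:
            self.out.append("<%s%s>" % (tag, attr_text))
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag not in self.stack:
            return  # stray end tag: ignore it
        # Close everything opened since the matching start tag.
        while True:
            open_tag = self.stack.pop()
            self.out.append("</%s>" % open_tag)
            if open_tag == tag:
                break

    def handle_data(self, data):
        self.out.append(html.escape(data, quote=False))

    def result(self):
        # Close anything still open at end of input.
        tail = "".join("</%s>" % t for t in reversed(self.stack))
        return "".join(self.out) + tail


def unescape_markup(escaped, fixup=False):
    """Unescape markup and parse it; optionally repair it first.

    Hypothetical API: the content is wrapped in a <wrapper> element so
    that fragments with several top-level nodes can still be parsed.
    """
    text = html.unescape(escaped)
    try:
        return ET.fromstring("<wrapper>%s</wrapper>" % text)
    except ET.ParseError:
        if not fixup:
            raise  # strict mode: ill-formed markup is an error
    fixer = _Fixer()
    fixer.feed(text)
    fixer.close()
    return ET.fromstring("<wrapper>%s</wrapper>" % fixer.result())
```

With fixup left off, input such as
"&lt;p&gt;Hello &lt;b&gt;world&lt;br&gt;&lt;/p&gt;" raises a parse
error; with fixup=True the same input comes back as a tree. The repair
rules are deliberately crude, which is rather the point: the details
can be implementation dependent as long as the option exists.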

But I don't really care very much since I think escaped markup is
inherently a bad idea.

                                        Be seeing you,
                                          norm

-- 
Norman Walsh <ndw@nwalsh.com> | We measure the excellency of other men,
http://nwalsh.com/            | by some excellency we conceive to be in
                              | ourselves.--John Selden

Received on Monday, 14 May 2007 14:39:13 UTC