Re: to XML, not XHTML

From: Ignacio Vazquez-Abrams <ignacio@openservices.net>
Date: Wed, 29 Aug 2001 02:18:42 -0400 (EDT)
To: <html-tidy@w3.org>
Message-ID: <Pine.LNX.4.33.0108290213450.18216-100000@terbidium.openservices.net>

On Wed, 29 Aug 2001, Matt G wrote:

> Is there a way to force Tidy to ignore "HTML good/bad-ness" and only convert
> badly formed HTML into well-formed XML (which should be much more
> efficient)? Or is there another utility (COM interface preferred,
> command-line okay, no GUI allowed) that will do this?
>
> I don't care about producing good HTML/XHTML; all I need is to produce
> something I can shove into an XML parser and use XPath/XSLT to extract data.
> It will be used by automation scripts and robots.
>
>     Matt

Heh.

The markup has to conform to a certain specification before you can use XPath
and/or XSLT to extract data from it. I have a process very similar to the one
you describe above:

1) Cut two six-inch strips from a roll of electrical tape.
2) Place the two strips of tape on a wall such that they form an X with the
     intersection at eye-level.
3) Repeatedly thrust forehead at the intersection of the two pieces of tape.

In other words, you might be able to do it, but DON'T. It's not worth it. Get
it to conform and then you can do whatever you want to it however you want.
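
To make the "get it to conform" point concrete, here is a minimal Python sketch
using only the standard library. The two sample strings and the helper
`extract_paragraphs` are made up for illustration and are not from this thread;
the point is only that an XML parser rejects tag soup outright, while the same
content in well-formed shape can be queried with an XPath-style expression.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample markup, invented for this illustration.
TAG_SOUP = "<html><body><p>First<br><p>Second</body></html>"
WELL_FORMED = "<html><body><p>First<br/></p><p>Second</p></body></html>"


def extract_paragraphs(markup):
    """Return the leading text of every <p> element.

    Returns None when the markup is not well-formed XML, because the
    parser refuses to build a tree and there is nothing to query.
    """
    try:
        root = ET.fromstring(markup)
    except ET.ParseError:
        return None
    # ElementTree supports a limited XPath subset; ".//p" selects all
    # descendant elements named "p".
    return [p.text for p in root.findall(".//p")]


if __name__ == "__main__":
    print(extract_paragraphs(TAG_SOUP))     # unclosed <br> and <p>: parse fails
    print(extract_paragraphs(WELL_FORMED))  # same content, closed tags: query works
```

The unclosed `<br>` and `<p>` in the first string are enough to stop the parser,
so no amount of XPath helps until the input has been cleaned up.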

-- 
Ignacio Vazquez-Abrams  <ignacio@openservices.net>
Received on Wednesday, 29 August 2001 02:18:54 GMT