
Re: html - xml transformation - imbricating html <p> tags in header tags

From: Robert Miner <RobertM@dessci.com>
Date: Wed, 20 Oct 2004 11:02:34 -0500
Message-Id: <200410201602.i9KG2Yh16473@wisdom.geomtech.com>
To: pzn_04@yahoo.fr
CC: Bernhard.Keil@soft4science.com, www-math@w3.org, DSSSList@lists.mulberrytech.com


Pascale,

It sounds to me as if you should start by using HTML Tidy, as Bernhard
suggested in his last paragraph.  No matter what, you are going to
need to go from HTML to some valid XML vocabulary, and this is just
what HTML Tidy is for.  Once you have XML, then a whole range of
possibilities for using XSL opens up.
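
Before reaching for Tidy, it may be worth confirming whether a given
file already parses as XML. The following is only a rough sketch using
Python's standard library; the function name is_well_formed is mine,
and it tests well-formedness only (it does not validate against a DTD).
Note that HTML-only entities such as &nbsp; will also be reported as
failures by this parser.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if `text` parses as XML, False otherwise.

    Hypothetical helper for illustration: it checks well-formedness
    only, not validity against any DTD or schema.
    """
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True

# A properly closed fragment parses; an unclosed <br> does not.
print(is_well_formed("<p>ok<br/></p>"))
print(is_well_formed("<p>not ok<br></p>"))
```

Files that fail this check are the ones that would need to go through
Tidy first.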

You can get HTML Tidy at http://tidy.sourceforge.net/.
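
To give an idea of the nesting step once the input is XML, here is a
minimal sketch in Python rather than XSL. It assumes a flat body in
which h1..h6 headings are followed by their paragraphs as siblings; the
output element names (doc, section, title) and the function name
nest_sections are my own placeholders, not anything from your
documents.

```python
import xml.etree.ElementTree as ET

def nest_sections(body):
    """Group a flat run of h1..h6 and sibling elements into nested sections.

    Assumed output vocabulary (hypothetical): <doc>, <section>, <title>.
    """
    root = ET.Element("doc")
    # Stack of (heading level, container); level 0 is the document root.
    stack = [(0, root)]
    for child in body:
        # Ignore any namespace prefix such as {http://www.w3.org/1999/xhtml}.
        tag = child.tag.rsplit("}", 1)[-1]
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            level = int(tag[1])
            # Close every open section at the same or a deeper level.
            while stack[-1][0] >= level:
                stack.pop()
            section = ET.SubElement(stack[-1][1], "section")
            title = ET.SubElement(section, "title")
            # Collect the heading text, including text inside child elements.
            title.text = "".join(child.itertext())
            stack.append((level, section))
        else:
            # Anything else (e.g. <p>) goes into the innermost open section.
            stack[-1][1].append(child)
    return root

flat = ET.fromstring(
    "<body><h1>A</h1><p>a1</p><h2>A.1</h2><p>a2</p><h1>B</h1><p>b1</p></body>"
)
print(ET.tostring(nest_sections(flat), encoding="unicode"))
```

The stack keeps track of which sections are still open, so an h2 lands
inside the preceding h1 section and a later h1 closes both. The same
grouping can be written in XSL, but the idea is easier to see this way.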

--Robert

------------------------------------------------------------------
Dr. Robert Miner                                RobertM@dessci.com
W3C Math Interest Group Co-Chair                      651-223-2883
Design Science, Inc.   "How Science Communicates"   www.dessci.com
------------------------------------------------------------------



> My HTML documents are valid XML; I need the syntax to nest paragraphs in
> sections/subsections and subsections in parent sections.
> Thank you for your help,
> Pascale
> 
> 
> Bernhard Keil <Bernhard.Keil@soft4science.com> wrote:
> 
> Hello,
> you can use XSLT to transform from XML to XML (or non-xml).
> So the input source has to be valid XML.
> 
> In general you can't transform HTML with XSLT, as HTML is not in general
> valid XML.
> But your source example is a valid XML document. So if all of your
> source documents
> are valid XML like this HTML source example, then you can use
> XSLT to transform them to some other XML format.
> 
> If your HTML source documents are not valid XML documents,
> you can use a tool like Tidy to make them valid XML.
> 
> 
> regards,
> Bernhard Keil
> http://www.soft4science.com
Received on Wednesday, 20 October 2004 16:08:15 UTC
