HTML to XML transformation - nesting HTML <p> tags under heading tags

Hello,
 
I am trying to convert HTML files to XML files. I need to nest subsections inside their parent sections, and paragraphs inside the sections they belong to.
 
Is it possible to do this with XSL?
 
Here is an example (the real files are more complex than this):
 
h1 is the document title, h2 is a section title, and h3 is a subsection title. In the output XML, each <p> should go under the section it follows, and each subsection (h3) should go under its parent section (h2).
 
<html>
<body>
<h1>Document Title</h1>
<p>Content of paragraph 1</p>
<h2>Header 2</h2>
<p>First paragraph of section 1</p>
<p>Second paragraph of section 1</p>
<h3>Header 3</h3>
<p>First paragraph of subsection 1</p>
<h2>Header 2</h2>
<p>First paragraph of section 2</p>
</body>
</html>
 
To be converted into:
<document>
    <title>Document Title</title>
    <paragraph>Content of paragraph 1</paragraph>
    <section>
        <title>Header 2</title>
        <paragraph>First paragraph of section 1</paragraph>
        <paragraph>Second paragraph of section 1</paragraph>
        <section>
            <title>Header 3</title>
            <paragraph>First paragraph of subsection 1</paragraph>
        </section>
    </section>
    <section>
        <title>Header 2</title>
        <paragraph>First paragraph of section 2</paragraph>
    </section>
</document>
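
To pin down the nesting rule in the example above, here is a minimal sketch of the same logic written in Python rather than XSL. It is only an illustration of the intended grouping, not an XSL solution. It assumes the input is well-formed XML (XHTML-style markup), that only h1, h2, h3 and p appear directly under body, and that every section gets a <title> taken from its heading. The function name nest_sections is hypothetical.

```python
import xml.etree.ElementTree as ET


def nest_sections(html_text):
    """Turn flat h1/h2/h3/p markup into nested document/section XML.

    Assumes well-formed input with a single <body> child of <html>.
    """
    body = ET.fromstring(html_text).find("body")
    document = ET.Element("document")
    # Stack of (heading level, container element). The document itself
    # acts as level 1, so h2 sections attach to it directly.
    stack = [(1, document)]
    for node in body:
        if node.tag == "h1":
            ET.SubElement(document, "title").text = node.text
        elif node.tag in ("h2", "h3"):
            level = int(node.tag[1])
            # Close every open section at the same or a deeper level.
            while stack[-1][0] >= level:
                stack.pop()
            section = ET.SubElement(stack[-1][1], "section")
            ET.SubElement(section, "title").text = node.text
            stack.append((level, section))
        elif node.tag == "p":
            # A paragraph belongs to the innermost open section.
            ET.SubElement(stack[-1][1], "paragraph").text = node.text
    return document
```

Calling ET.tostring(nest_sections(source), encoding="unicode") on the sample input would serialize the nested result. The stack makes the rule explicit: a heading closes any open section of equal or deeper level, then opens a new section under whatever remains open.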
 
Thanks,
PZN


		

Received on Wednesday, 20 October 2004 12:28:50 UTC