
TIDY XML output problem.

From: John Schroeder <tesoroytresor@hotmail.com>
Date: Thu, 15 Feb 2007 17:35:05 -0600
Message-ID: <BAY108-F287881F23A91F0DC119B2BAB960@phx.gbl>
To: html-tidy@w3.org

TIDY mailing list users:

I'm new to TIDY and I'm having a problem getting well-formed XML output from 
TIDY.

Input file:  XML with HTML embedded in the data.
Desired Output:  Well-formed XML that includes embedded HTML code.
Issue:  Well-formed XML isn't being generated. HTML tags like <p> and <BR> 
are not being closed by TIDY.

Simple input file example with XML and HTML tags (testXML.xml):
<?xml version="1.0" standalone="yes"?>
<myxmltag>
  <p>Here is an unclosed paragraph tag.
  <xmltag1>TAG1</xmltag1>
  <xmltag2>TAG2</xmltag2>
</myxmltag>

Executing "tidy -o newfile.xml -xml -asxml testXML.xml" returns the error:
     line 6 column 1 - Error: unexpected </myxmltag> in <p>
     This document has errors that must be fixed before using HTML Tidy to
     generate a tidied up version.

I expected it to close the <p> paragraph tag. I'd also like it to fix up 
unclosed <BR> HTML tags, as well as clean up other tags if needed. Am I just 
not specifying an option correctly? I'm assuming the -asxml option is the 
same as using output-xml: yes.

Thanks for your assistance.

John

Straight XML example file:
<?xml version="1.0" standalone="yes"?>
<myxmltag>
  <xmltag1>TAG1</xmltag1>
  <xmltag2>Unclosed xmltag2
  <xmltag3>TAG3</xmltag3>
</myxmltag>

Trying to process the above file that has an unclosed <xmltag2> tag returns 
the following error: line 6 column 1 - Error: unexpected </myxmltag> in 
<xmltag2>
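
For reference, here is a minimal Python sketch (independent of TIDY) that runs both sample files through a standard XML parser. It suggests that neither file is well-formed XML as written, which is consistent with the errors above when the input is declared to be XML. The function name well_formed and the constant names are made up for illustration; only the sample documents come from this message.

```python
import xml.etree.ElementTree as ET


def well_formed(text):
    """Return (True, None) if text parses as XML, else (False, parser message)."""
    try:
        ET.fromstring(text)
        return True, None
    except ET.ParseError as err:
        return False, str(err)


# testXML.xml from this message: XML with an unclosed HTML <p> tag.
MIXED = """<?xml version="1.0" standalone="yes"?>
<myxmltag>
  <p>Here is an unclosed paragraph tag.
  <xmltag1>TAG1</xmltag1>
  <xmltag2>TAG2</xmltag2>
</myxmltag>
"""

# The straight XML example: <xmltag2> is never closed.
STRAIGHT = """<?xml version="1.0" standalone="yes"?>
<myxmltag>
  <xmltag1>TAG1</xmltag1>
  <xmltag2>Unclosed xmltag2
  <xmltag3>TAG3</xmltag3>
</myxmltag>
"""

# Hand-repaired variant of MIXED, with the paragraph closed explicitly.
MIXED_CLOSED = MIXED.replace("paragraph tag.", "paragraph tag.</p>")

if __name__ == "__main__":
    for name, doc in [("mixed", MIXED), ("straight", STRAIGHT),
                      ("mixed, <p> closed", MIXED_CLOSED)]:
        ok, message = well_formed(doc)
        print(name, "->", "well-formed" if ok else "not well-formed: " + message)
```

A generic XML parser has no way of knowing that <p> or <BR> may be left open in HTML, so it reports a mismatched tag at the first closing tag that does not match the innermost open element. Closing the tags by hand, as in MIXED_CLOSED, is enough for the parser to accept the file.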

Received on Sunday, 18 February 2007 14:58:18 GMT
