- From: Randy Waki <rwaki@flipdog.com>
- Date: Mon, 14 Aug 2000 16:27:58 -0600
- To: <html-tidy@w3.org>
Nayan Hajratwala wrote:
>
> I am using JTidy to take an HTML file and output it as XML.
>
> It seems to work fine until I try to run the resulting file through
> Sun's JAXP parser. In HTML documents that contain '&copy;' for the
> copyright symbol, JAXP says: 'Reference to undefined entity "&copy;"'.
>
> Is '&copy;' not a valid XML entry? If so, is there a way to ensure that
> JTidy does not output this?
>
> I also initially had the same problem with '&nbsp;', but calling
> Tidy.setQuoteNbsp(false) fixed that.

Try calling Tidy.setNumEntities(true). That tells Tidy to restrict itself to
the 5 entities guaranteed to be defined in XML (&lt; &gt; &quot; &apos; and
&amp;). All the rest are output as numeric escapes, so &copy; becomes &#169;.
You could then probably drop the call to setQuoteNbsp().

--Randy
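
To see why the numeric form helps, here is a minimal sketch that feeds three small fragments to the JAXP parser bundled with the JDK. It does not call JTidy at all. The class name EntityCheck, the helper parses(), and the sample strings are made up for illustration, and the fragments are assumed to carry no DTD, as with the XML output described above.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, not part of JTidy or JAXP.
public class EntityCheck {

    /** Returns true if the given string parses as well-formed XML. */
    public static boolean parses(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (Exception e) {
            // Any parse failure (e.g. an undeclared entity) lands here.
            return false;
        }
    }

    public static void main(String[] args) {
        // Named HTML entity with no DTD to declare it.
        System.out.println("&copy;  -> " + parses("<p>&copy; 2000</p>"));
        // Same character as a numeric escape.
        System.out.println("&#169;  -> " + parses("<p>&#169; 2000</p>"));
        // The five entities XML predefines.
        System.out.println("xml 5   -> "
            + parses("<p>&lt;&gt;&quot;&apos;&amp;</p>"));
    }
}
```

The first fragment should be rejected, since nothing declares &copy;, while the numeric escape and the five predefined entities should parse without any declarations.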
Received on Monday, 14 August 2000 18:32:34 UTC