"Dervla O'Keeffe" <okeeffda@tcd.ie> wrote: > I am currently using Xerces DOM parser (org.apache.xerces.parsrs.DOMParser) to > try and parse HTML by using the transitional HTML 4.01 DTD as an external DTD > with an input xml (html) file, as per DTD on the W3 subsite at: > > http://www.w3.org/TR/REC-html40/loose.dtd > (I am aware that HTML is not XML, which is why I am using the DTD.) Since HTML is not XML, it is not appropriate to use an XML parser to parse the HTML 4 DTD. > I do not understand the following line of the DTD: > > <!ELEMENT (%fontstyle;|%phrase;) - - (%inline;)*> This syntax is allowed in SGML but not in XML, that's why Xerces complained. Regards, -- Masayasu Ishikawa / mimasa@w3.org W3C - World Wide Web ConsortiumReceived on Tuesday, 14 January 2003 09:21:28 GMT