- From: Eric Richardson <maxwell@telesoft.com>
- Date: Wed, 31 May 2000 07:51:16 -0700
- To: DOM <www-dom@w3.org>
Hi,

I posted this on comp.text.xml with no response, so I was wondering if anyone could help me on this list.

I recently switched to JAXP 1.0 from xml-tr2. I am using DTDs and asking for strict parsing, something like this:

    DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
    dbf.setValidating(true);
    DocumentBuilder db = dbf.newDocumentBuilder();
    Document doc = db.parse(uri);

My input file is human readable, with whitespace, like this:

    <?xml version="1.0" encoding="ISO-8859-1"?>
    <!DOCTYPE doc SYSTEM 'doc.dtd'>
    <doc>
      <element1>first</element1>
      <element2>second</element2>
    </doc>

My DTD is like this:

    <?xml version="1.0" encoding="ISO-8859-1"?>
    <!ELEMENT doc (element1, element2)>
    <!ELEMENT element1 (#PCDATA)>
    <!ELEMENT element2 (#PCDATA)>

I'm getting empty TEXT nodes between the elements. Although I haven't checked, I would guess they are line feeds.

I read the XML spec and it says that the parser should pass line feeds to the application, which I believe in this case should be the DOM. In the DOM, I would hope that no TEXT nodes would be created in between ELEMENTS unless I specified a mixed content model with PCDATA.

Is there something I don't understand about the encoding or something that would cause this? What do I need to do to avoid this?

Thanks,
Eric :-)
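
A minimal sketch that reproduces the situation described above and shows one possible way around it. Several things here are assumptions and not part of the original setup: the DTD is inlined as an internal subset so the example needs no doc.dtd file, the class name WhitespaceNodes and the helper countTextChildren are made up for illustration, and the call to setIgnoringElementContentWhitespace comes from later JAXP releases (as far as I know it is not available in JAXP 1.0). It only has an effect when the parser knows from the DTD that doc has element-only content.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class WhitespaceNodes {
    // Same document as in the post, with the DTD inlined (assumption)
    // so that the example does not depend on an external doc.dtd.
    static final String XML =
          "<?xml version=\"1.0\"?>\n"
        + "<!DOCTYPE doc [\n"
        + "  <!ELEMENT doc (element1, element2)>\n"
        + "  <!ELEMENT element1 (#PCDATA)>\n"
        + "  <!ELEMENT element2 (#PCDATA)>\n"
        + "]>\n"
        + "<doc>\n"
        + "  <element1>first</element1>\n"
        + "  <element2>second</element2>\n"
        + "</doc>\n";

    // Hypothetical helper: parses XML and counts the TEXT children of <doc>.
    static int countTextChildren(boolean ignoreWhitespace) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setValidating(true);
        // Not in JAXP 1.0 as far as I know; present in later JAXP versions.
        dbf.setIgnoringElementContentWhitespace(ignoreWhitespace);
        DocumentBuilder db = dbf.newDocumentBuilder();
        Document doc = db.parse(new InputSource(new StringReader(XML)));

        NodeList children = doc.getDocumentElement().getChildNodes();
        int count = 0;
        for (int i = 0; i < children.getLength(); i++) {
            if (children.item(i).getNodeType() == Node.TEXT_NODE) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("TEXT children, default:  " + countTextChildren(false));
        System.out.println("TEXT children, ignoring: " + countTextChildren(true));
    }
}
```

With a parser that lacks this setting, the fallback would be to skip or remove whitespace-only TEXT children while walking the tree.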
Received on Wednesday, 31 May 2000 10:55:02 UTC