Re: Using xerces with Unicode from Joseph Kesselman on 2002-02-19 (www-dom@w3.org from January to March 2002)

From: Joseph Kesselman <keshlam@us.ibm.com>
Date: Tue, 19 Feb 2002 10:57:05 -0500
To: Aniruddha Shevade <ashevade@actuate.com>
Cc: www-dom@w3.org
Message-ID: <OFDD0B0987.AA14F367-ON85256B65.0057127B@pok.ibm.com>

For information about specific implementations, contact the folks
providing/supporting that implementation. Information about Xerces is
available from Apache's website and mailing lists; http://xml.apache.org is
a good starting point.

To briefly answer your question: Yes, Xerces is Unicode-aware, but it may or
may not support a specific encoding. Make sure your document starts with an
XML declaration that states which encoding is used in that file, so the
parser knows how to interpret it. If the encoding is an unusual one,
you may need to download and plug in IBM's encoding support package (I
think it's called ICU), which is distributed under a separate license.
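
As a rough illustration of the encoding declaration point, here is a minimal
Java sketch. It assumes you are parsing through the standard JAXP
DocumentBuilder API rather than calling Xerces classes directly; the class
name EncodingDeclDemo, the method rootText, and the sample document are
made up for the example. The document is stored as ISO-8859-1 bytes and
says so in its XML declaration, so the parser can decode the accented
character correctly.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

// Hypothetical demo class; not part of Xerces or of the original question.
class EncodingDeclDemo {

    // Parses raw bytes and returns the text of the root element.
    // The parser is handed bytes, not characters, so it has to work out
    // the encoding itself, and the XML declaration is what tells it.
    static String rootText(byte[] xmlBytes) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new ByteArrayInputStream(xmlBytes));
            return doc.getDocumentElement().getTextContent();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("could not parse document", e);
        }
    }

    public static void main(String[] args) {
        // Sample document (an assumption for illustration): the declaration
        // names the same encoding that is used to produce the bytes.
        String xml = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>"
                   + "<word>caf\u00e9</word>";
        byte[] latin1 = xml.getBytes(StandardCharsets.ISO_8859_1);

        // Prints whether the accented character survived the round trip.
        System.out.println(rootText(latin1).equals("caf\u00e9"));
    }
}
```

If the declaration were left off, the parser would have to fall back on the
XML defaults (UTF-8 or UTF-16) and would not read these bytes as intended.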

If you think you've done everything correctly and are still having trouble,
post a question to the Xerces support mailing lists, with some detail on
what you're doing, what you're seeing, and why you think it's wrong, and
someone in the Xerces community will investigate. If you're _sure_ it's a
Xerces bug rather than an error in your own code, file it in the Apache
Bugzilla system; that's the best way to make sure it won't be forgotten.

______________________________________
Joe Kesselman  / IBM Research

Received on Tuesday, 19 February 2002 10:57:46 UTC