- From: Jeff Halperin <jeff@basistech.com>
- Date: Thu, 27 Apr 2000 17:07:30 -0400
- To: "'www-international@w3.org'" <www-international@w3.org>
- Cc: Amy Muntz <Amy@basistech.com>
If you decide to investigate products to handle this issue, my company offers a Chinese Morphological Analyzer and Japanese Morphological Analyzer. Product information can be found at http://www.basistech.com/products/ .

>Can any one point me to books/RFCs/websites that explain the proper
>way to break words for building a full text search database when parsing
>HTML/XML in any of the following MBCS encodings:
>UTF-8
>GB2312
>Shift-JIS
>EUC-KR
>Big5
>Thanks,

Jeff

> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> Jeff Halperin            One Kendall Square    Tel: 617-252-5636
> Basis Technology Corp.   Cambridge, MA 02139   Fax: 617-252-9150
> jeff@basistech.com       U.S.A.                www.basistech.com
>
Received on Thursday, 27 April 2000 17:07:12 UTC