Re: Mandarin Pīnyīn is being misidentified as Vietnamese…

From: Michael[tm] Smith <mike@w3.org>
Date: Tue, 16 Aug 2016 18:49:08 +0900
To: Pinyin WW <pinyinww@gmail.com>
Cc: www-validator@w3.org
Message-ID: <20160816094908.txsjwglkmqldtx2t@sideshowbarker.net>
Pinyin WW <pinyinww@gmail.com>, 2016-08-11 09:28 -0700:
> Archived-At: <http://www.w3.org/mid/7C7EEB88-2F59-4315-A073-C26685FBE9D9@gmail.com>
> …as shown in this output:
> > Warning: This document appears to be written in Vietnamese but the html
> > start tag has lang="zh-Latn-pinyin". Consider using lang="vi" (or
> > variant) instead.

Thanks for reporting this.

To fix this, what I’d like is a good command-line tool or library for
converting a (very) large amount of Chinese written in Han characters
into Pinyin.

Can you recommend a particular tool or library for converting to Pinyin?

I don’t care much what language the tool or library is in, as long as it
produces good Pinyin from the source.

A quick search turns up https://pypi.python.org/pypi/pinyin. Are you
familiar with that? Can you recommend it? Is there something better?


Michael[tm] Smith https://people.w3.org/mike

Received on Tuesday, 16 August 2016 09:54:25 UTC
