Re: How to convert things like ã to utf-8?

Hi Peng,

Thank you for your inquiry...

Could you add it as an issue at
  https://github.com/htacg/tidy-html5/issues
where I am sure it will get more attention...

Also add the version of tidy used, and the
expected output... thanks...

Regards,
Geoff.

On 17/05/16 15:11, Peng Yu wrote:
> Hi,
>
> For the following XML, I want to convert things like &#x00E3; to UTF-8.
>
> http://ieeexplore.ieee.org/gateway/ipsSearch.jsp?sortfield=py&hc=1000&sortorder=desc&an=6706948
>
> But I still see things like &#x00E3; with the following command. Does
> anybody know what the correct command is to do the conversion? Thanks.
>
> ~$ curl "http://ieeexplore.ieee.org/gateway/ipsSearch.jsp?sortfield=py&hc=1000&sortorder=desc&an=6706948" > tmp1.xml
> ~$ tidy -q -xml --preserve-entities no --output-encoding utf8 tmp1.xml > tmp2.xml
> ~$ vim tmp1.xml
> ~$ grep Bilz tmp2.xml
> <![CDATA[Bilz&#x00E3;  Ara&#x00FA; jo;  Liang Zhao]]>
>
> --
> Regards,
> Peng
>
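
One thing worth noting about the grep output above: the references sit inside a CDATA section, and in XML the content of a CDATA section is literal text, so a parser does not expand &#x00E3; there. That may be why they come through unchanged, though this has not been confirmed for tidy. As a possible workaround, here is a minimal sketch in Python that post-processes the text and replaces numeric character references with the characters they name. The function name decode_ncrs is made up for illustration and is not part of tidy or any library. The sketch assumes the text is already a decoded string and that expanding references everywhere in the file, not only inside CDATA, is acceptable.

```python
import re

# Matches hexadecimal (&#x00E3;) and decimal (&#227;) numeric character references.
_NCR = re.compile(r"&#(?:[xX]([0-9A-Fa-f]+)|([0-9]+));")

# References to &, < and > are left alone so that expanding them
# cannot turn plain text into markup.
_KEEP = {0x26, 0x3C, 0x3E}


def decode_ncrs(text: str) -> str:
    """Replace numeric character references with the characters they name.

    Hypothetical helper for illustration; not part of tidy.
    """
    def _sub(match: "re.Match[str]") -> str:
        hex_digits, dec_digits = match.group(1), match.group(2)
        code = int(hex_digits, 16) if hex_digits else int(dec_digits)
        # Skip markup-significant characters, surrogates, and anything
        # beyond the Unicode range; the reference is kept as written.
        if code in _KEEP or 0xD800 <= code <= 0xDFFF or code > 0x10FFFF:
            return match.group(0)
        return chr(code)

    return _NCR.sub(_sub, text)
```

To apply it to tmp2.xml, one could read the file with open("tmp2.xml", encoding="utf-8"), pass the contents through decode_ncrs, and write the result back out as UTF-8. Because the substitution is purely textual, it is worth checking the result is still well-formed before relying on it.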

Received on Wednesday, 18 May 2016 15:57:14 UTC