- From: Lee <ler762@gmail.com>
- Date: Fri, 8 Feb 2019 06:45:37 -0500
- To: Jacob Renhald <jacobrenhald@outlook.com>
- Cc: "html-tidy@w3.org" <html-tidy@w3.org>
On 2/7/19, Jacob Renhald <jacobrenhald@outlook.com> wrote:
> Been trying to tidy up the html strings on one of my websites, running the
> code through linux. For some reason I can't seem to make it work.
>
> I've run:
> sudo apt-get install tidy
>
> To "tidy" it up I go:
>
> curl localhost address | tidy -iq (please note I have all articles stored as
> a xhtml file).
>
> From my understanding the -q is for quiet input while the "i" is for indents
> and it fixes the main issue.
>
> I'm trying to tidy up all the htmls on this subpage:
> https://www.kredittkortinfo.no/artikler/, which is a big mess.
>
> Problem I'm running into is that the UTF8 gets translated into the ascii-USA
> version and I can no longer read the text file....I must be doing something
> wrong.
Going by the man page, 'tidy -iq -utf8' should work:
  $ man tidy
  Character encodings
    -utf8  use UTF-8 for both input and output
but it didn't work for me with LANG=C.
Just out of curiosity - what output does 'locale' give you?
This does work for me:
  export LANG=nb_NO.utf8    # or en_US.utf8, or even C.utf8
  tidy -iq test.html
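If you want to script that check, something like the following might do. It is only a rough sketch: the function names (effective_ctype, names_utf8, tidy_utf8) are made up for illustration, the codeset test is a plain string match on the locale name rather than a query of the system, and the fallback to C.utf8 assumes that locale exists on your machine (it did on mine, but that varies by distribution).

```shell
#!/bin/sh
# Print the locale name that governs character handling, using the
# POSIX precedence: LC_ALL first, then LC_CTYPE, then LANG.
effective_ctype() {
    if [ -n "${LC_ALL:-}" ]; then
        printf '%s\n' "$LC_ALL"
    elif [ -n "${LC_CTYPE:-}" ]; then
        printf '%s\n' "$LC_CTYPE"
    else
        printf '%s\n' "${LANG:-C}"
    fi
}

# Succeed if a locale name carries a UTF-8 codeset suffix such as
# ".utf8" or ".UTF-8". This only inspects the string passed in.
names_utf8() {
    case "$1" in
        *.[Uu][Tt][Ff]8*|*.[Uu][Tt][Ff]-8*) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical wrapper: run tidy as-is when the locale already names
# UTF-8, otherwise override the locale for this one command only.
# Assumption: C.utf8 is installed on the system.
tidy_utf8() {
    if names_utf8 "$(effective_ctype)"; then
        tidy -iq "$1"
    else
        LC_ALL=C.utf8 tidy -iq "$1"
    fi
}
```

After sourcing that, 'tidy_utf8 test.html' should behave like the export-then-tidy sequence above without changing LANG for the rest of your shell session.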
Regards,
Lee
Received on Friday, 8 February 2019 11:46:03 UTC