Trying to tidy up HTML on website

I've been trying to tidy up the HTML on one of my websites by running it
through tidy on Linux. For some reason I can't seem to make it work.

I've run:
sudo apt-get install tidy

To tidy it up I run:

curl localhost address | tidy -iq

(Please note that all articles are stored as XHTML files.)

From my understanding, -q is quiet mode (it suppresses nonessential
output) while -i indents the output, and that fixes the main issue.

I'm trying to tidy up all the HTML on these sites:
https://www.fornye.no & https://www.alarmsystem.no

The problem I'm running into is that the UTF-8 text gets converted to
US-ASCII and I can no longer read the resulting file. I must be doing
something wrong.

It looks like 'tidy -iq -utf8' should work:
$ man tidy
    Character encodings
        -utf8  use UTF-8 for both input and output

but it didn't work for me with LANG=C.

Just out of curiosity - what output does 'locale' give you?

This does work for me:
export LANG=nb_NO.utf8   (or en_US.utf8 or even C.utf8)
tidy -iq test.html
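
As a rough sketch of the locale check, the snippet below reads the current
character map with `locale charmap` and reports whether it is UTF-8. The
helper name is_utf8_charmap is made up for illustration and is not part of
tidy, and the list of accepted spellings of "UTF-8" is an assumption.

```shell
# Hypothetical helper (not part of tidy): succeeds when the given
# charmap name denotes UTF-8. The accepted spellings are an assumption.
is_utf8_charmap() {
    case "$1" in
        UTF-8|utf-8|UTF8|utf8) return 0 ;;
        *) return 1 ;;
    esac
}

# `locale charmap` prints the character encoding of the current locale,
# e.g. "UTF-8", or "ANSI_X3.4-1968" (US-ASCII) in the C locale on glibc.
charmap=$(locale charmap 2>/dev/null) || charmap=""

if is_utf8_charmap "$charmap"; then
    echo "charmap is $charmap: the locale is UTF-8"
else
    echo "charmap is '${charmap:-unknown}': try 'export LANG=C.utf8' before running tidy"
fi
```

If the second message is printed, exporting one of the UTF-8 locales
mentioned above and running tidy again would be the next thing to try.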

Regards,
Josef

Received on Monday, 23 May 2022 06:56:35 UTC