Re: Tidy piping in perl

On Tue, 25 May 1999, Bjoern Hoehrmann wrote:

> --snip--
>    If the file handle came from a piped open `close()' will
>    additionally return FALSE if one of the other system
>    calls involved fails or if the program exits with non-
>    zero status. (If the only problem was that the program
>    exited non-zero `$!' will be set to `0'.) Also, closing
>    a pipe waits for the process executing on the pipe to
>    complete, in case you want to look at the output of the
>    pipe afterwards. Closing a pipe explicitly also puts the
>    exit status value of the command into `$?'.
> --snip--
> 
> So replacing the old close command with something like
> 
> --snip--
> if (!close(TIDY)) {
>   # close() is false if a system call failed or tidy exited non-zero
>   my $exitcode = $? >> 8;
>   if ($exitcode == 1) {
>     print STDERR "tidy issued warning messages\n";
>   } elsif ($exitcode == 2) {
>     print STDERR "tidy issued error messages\n";
>   } else {
>     die "tidy exited with code: $exitcode\n";
>   }
> } else {
>   print STDERR "tidy detected no errors\n";
> }
> --snip--
> 
> Works fine.

Great. Tidy returns a non-zero exit status when it detects
problems with the markup. The following is from main():

    /* return status can be used by scripts */

    if (totalerrors > 0)
        return 2;

    if (totalwarnings > 0)
        return 1;

    /* 0 signifies all is ok */
    return 0;
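
For callers written in C rather than Perl, the same three codes can
be read back from a pipe with popen() and pclose(). The following is
only a rough sketch under stated assumptions, not code from Tidy: the
helper names are hypothetical, and the message strings are borrowed
from the Perl snippet quoted above.

```c
#define _POSIX_C_SOURCE 200809L  /* for popen()/pclose() in strict modes */
#include <stdio.h>
#include <sys/wait.h>

/* Map the exit codes returned by Tidy's main() to a message.
   Codes other than 0, 1 and 2 are not described in this thread. */
static const char *tidy_exit_message(int code)
{
    switch (code) {
    case 0:  return "tidy detected no errors";
    case 1:  return "tidy issued warning messages";
    case 2:  return "tidy issued error messages";
    default: return "unexpected exit code";
    }
}

/* Extract the exit code from a wait status such as the value
   returned by pclose(). Returns -1 if pclose() failed or the
   child did not exit normally (e.g. it was killed by a signal). */
static int exit_code_from_status(int status)
{
    if (status == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}

/* Hypothetical helper: run a command through a pipe, discard its
   standard output, and return its exit code (or -1 on failure). */
static int run_and_get_exit_code(const char *command)
{
    char buf[4096];
    FILE *pipe = popen(command, "r");

    if (pipe == NULL)
        return -1;

    /* Drain the output so the child is never blocked on a full pipe. */
    while (fread(buf, 1, sizeof buf, pipe) > 0)
        ;

    /* Like Perl's close() on a piped handle, pclose() waits for the
       child to finish and hands back its wait status. */
    return exit_code_from_status(pclose(pipe));
}
```

A caller could then write something like
run_and_get_exit_code("tidy page.html"), where page.html is a
made-up file name, and pass the result to tidy_exit_message().
WEXITSTATUS() plays the role of the `$? >> 8` in the Perl version.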

> IE5 supports an option to add tools to the popup menu that can,
> e.g., open a new window and load a specific URL. I made a tool
> that uses Jigsaw and the Validator to check the page in just two
> clicks. Another one should be Tidy.

Would you like to share your know-how on the latter?

> Another question: I read that entities in the <head> are not
> converted back to e.g. "ö" or whatever by search engines and the
> like. The content of a meta tag is CDATA, so entities are
> allowed. IIRC Tidy converts those characters to entities. If I'm
> right about the search engines, that's not very useful; an
> option to turn off the modification of header fields might help.

I will try to find out more about this.

> Another thing is the specified charset. IIRC, when I use the
> correct charset it becomes optional to write 'ö' or '&ouml;', so
> this replacement isn't necessary (if I'm right that Tidy does
> this replacement).

You can choose whether you want Tidy to emit character entities
or raw characters for Latin-1, either via the command line or via
the config file.

> The last thing is the use of this mailing list. Is it correct to
> reply to the author and put a CC: to html-tidy@w3.org?

Yes - that's fine. The html-tidy list is archived and the archive
is accessible via the web.

Regards,

-- Dave Raggett <dsr@w3.org> http://www.w3.org/People/Raggett
phone: +44 122 578 2984 (or 2521) +44 385 320 444 (gsm mobile)
World Wide Web Consortium (on assignment from HP Labs)

Received on Thursday, 3 June 1999 06:59:45 UTC