Re: Problems converting Latin1 to HTML

Abigail (abigail@tungsten.gn.iaf.nl)
Wed, 20 Dec 1995 06:02:50 +0100 (MET)


From: Abigail <abigail@tungsten.gn.iaf.nl>
Message-Id: <199512200502.GAA03454@tungsten.gn.iaf.nl>
Subject: Re: Problems converting Latin1 to HTML
To: philipp@res.enst.fr (Philippe-Andre Prindeville)
Date: Wed, 20 Dec 1995 06:02:50 +0100 (MET)
Cc: www-html@w3.org
In-Reply-To: <9512200548.ZM16694@jones.res.enst.fr> from "Philippe-Andre Prindeville" at Dec 20, 95 05:48:42 am

You, Philippe-Andre Prindeville wrote:
++ 
++ Hi.
++ 
++         I'm using perl 4.0pl36 (on an HP-UX 9.01 system) and perl 5.000
++ on a SunOS 4.1.3_U1 system, and I'm trying to convert accented (French)
++ text to HTML via:
++ 
++ 	$line =~ s/[&<>\200-\377]/sprintf("&#%d;", unpack("C", $1))/ge;
++ 
++ thinking this would convert all high-bit set characters to their
++ decimal equivalent as "&#nn;" but this isn't turning out as
++ expected.
++ 
++ 	I'm wondering about this.  Probably something stupid, but....
++ Anyone have a quick fix?

The problem is that $1 is never set: your pattern has no capturing
parentheses, so there is nothing for $1 to refer to. Use either:
$line =~ s/([&<>\200-\377])/sprintf("&#%d;", unpack("C", $1))/ge;
or:
$line =~ s/[&<>\200-\377]/sprintf("&#%d;", unpack("C", $&))/ge;

For a single character, ord returns the same value as unpack("C", ...),
so this should work too:
$line =~ s/[&<>\200-\377]/sprintf("&#%d;", ord ($&))/ge;
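
To check the fix outside your script, here is a minimal sketch that wraps
the capturing form in a sub and runs it on one sample string. The name
latin1_to_html and the sample text are made up for illustration, and it
assumes the input is plain Latin1, one byte per character.

```perl
use strict;
use warnings;

# Hypothetical helper (the name is not from the original script).
# Replaces &, <, > and every high-bit character with a decimal
# character reference such as &#233;
sub latin1_to_html {
    my ($line) = @_;
    # The parentheses capture the matched character into $1.
    $line =~ s/([&<>\200-\377])/sprintf("&#%d;", ord($1))/ge;
    return $line;
}

# \351 is Latin1 e-acute (decimal 233).
my $sample = "caf\351 <b>";
print latin1_to_html($sample), "\n";   # prints: caf&#233; &#60;b&#62;
```

I used the capturing form here rather than $&, since using $& anywhere
in a program has traditionally slowed down every other regex match in it.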



Abigail