
FW: Creating a PDF file with UTF-8 encoding through Servlet

From: Bruno Girin <Bruno.Girin@cambista.com>
Date: Tue, 24 May 2005 12:33:33 +0100
Message-ID: <2D25E0735620544FB83ABA9B07D6AC73093FAB@phaal.uk.cambista.net>
To: <www-international@w3.org>
Sorry, sent this message to Khurram only, not the list.

-----Original Message-----
From: Bruno Girin
Sent: Tue 5/24/2005 11:39 AM
To: Khurram Ilyas
Subject: RE: Creating a PDF file with UTF-8 encoding through Servlet
Addison, that's the whole point of Sourav's question: a PDF file is a binary file that contains text data. As a consequence, you need to specify the encoding of the text data so that the computer reading the PDF can interpret the binary stream and translate it into the correct characters to display.

To achieve this, you need 3 things:
1. The servlet needs to encode the binary stream using an encoding that can represent every character used in the document. If it is Japanese, the best encoding is probably UTF-8.
2. The servlet needs to specify that same encoding in the content type.
3. The PDF file presumably needs to contain encoding data so that the file can be re-read by a PDF viewer independently of the download.
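
To illustrate why points 1 and 2 have to agree, here is a minimal sketch, assuming plain character data rather than real PDF output. The class and method names (EncodingMismatchSketch, roundTrip) are made up for this example and are not part of any servlet or PDFlib API. It encodes a Japanese string with one charset and decodes it with another, which is what happens when the reader assumes a different encoding from the one the writer used.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
class EncodingMismatchSketch {

    // Encode text to bytes with one charset, then decode the bytes with another.
    static String roundTrip(String text, Charset writerCharset, Charset readerCharset) {
        byte[] bytes = text.getBytes(writerCharset);
        return new String(bytes, readerCharset);
    }

    public static void main(String[] args) {
        // Three kanji, written as Unicode escapes so the source file encoding does not matter.
        String japanese = "\u65E5\u672C\u8A9E";

        // Writer and reader agree: the text survives unchanged.
        String same = roundTrip(japanese, StandardCharsets.UTF_8, StandardCharsets.UTF_8);
        System.out.println(same.equals(japanese));

        // Reader assumes ISO-8859-1: each UTF-8 byte is read as a separate character.
        String mismatched = roundTrip(japanese, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1);
        System.out.println(mismatched.equals(japanese));
    }
}
```

The first comparison is true and the second is false, since the nine UTF-8 bytes are read back as nine unrelated Latin-1 characters.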

To do 1, you need to wrap the output stream in an OutputStreamWriter that specifies the encoding, such as:
Writer wout = new OutputStreamWriter(out, "UTF-8"); // out being the output stream obtained in Sourav's step 2
Then you call wout.write() and the other Writer methods.
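
As a self-contained illustration of the wrapping step, here is a minimal sketch that uses a ByteArrayOutputStream in place of the servlet's output stream, which is an assumption made only so the example runs outside a servlet container. The class and method names (WriterEncodingSketch, writeAsUtf8) are hypothetical. It shows what the Writer does to character data; it does not settle whether a Writer suits PDFlib, which in Sourav's step 3 writes straight to the OutputStream.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
class WriterEncodingSketch {

    // Write text through an OutputStreamWriter and return the bytes it produced.
    static byte[] writeAsUtf8(String text) {
        // Stand-in for the stream a servlet would get from response.getOutputStream().
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Writer wout = new OutputStreamWriter(out, StandardCharsets.UTF_8)) {
            wout.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Closing the Writer flushed its buffer, so all bytes are in 'out' now.
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // Three kanji, written as Unicode escapes so the source file encoding does not matter.
        byte[] bytes = writeAsUtf8("\u65E5\u672C\u8A9E");
        System.out.println(bytes.length + " bytes"); // prints "9 bytes"
    }
}
```

Each of the three characters takes three bytes in UTF-8, hence nine bytes for a three-character string. Using the StandardCharsets constant instead of the "UTF-8" string avoids the checked UnsupportedEncodingException.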

To do 2, you just specify the encoding as part of the content type:
response.setContentType("application/pdf; charset=utf-8");

3 depends on the API you're using to create your PDF file. I don't know PDFlib, so I can't tell you what the call is.

Good luck with this.

Bruno Girin
Chief Technical Architect
Cambista Technologies Ltd

-----Original Message-----
From: www-international-request@w3.org on behalf of Khurram Ilyas
Sent: Fri 5/20/2005 11:04 PM
To: addison.phillips@quest.com; SOURAVM@infosys.com; www-international@w3.org
Subject: RE: Creating a PDF file with UTF-8 encoding through Servlet
Instead of 





Best Regards, 
Khurram Ilyas 

>From: "Addison Phillips" <addison.phillips@quest.com>
>To: "souravm" <SOURAVM@infosys.com>,<www-international@w3.org>
>Subject: RE: Creating a PDF file with UTF-8 encoding through Servlet
>Date: Fri, 20 May 2005 09:14:10 -0700
>PDF files are binary, not text, objects.
>Addison P. Phillips
>Globalization Architect, Quest Software
>Chair, W3C Internationalization Core Working Group
>Internationalization is not a feature.
>It is an architecture.
> > -----Original Message-----
> > From: www-international-request@w3.org [mailto:www-international-
> > request@w3.org] On Behalf Of souravm
> > Sent: 20 May 2005 6:13
> > To: www-international@w3.org
> > Subject: Creating a PDF file with UTF-8 encoding through Servlet
> >
> >
> > Hi All,
> >
> > I need to create and return a PDF file from a servlet as the response to
> > an HTTP request (typical download functionality).
> >
> > Now for this purpose I'm -
> >
> > 1. First setting the following fields in the response object -
> > response.setContentType("application/pdf");
> > response.setHeader("Pragma", "");
> > response.setHeader("Cache-Control", "");
> > response.setDateHeader("Expires", 0);
> >
> > 2. After that I'm creating an OutputStream object from the response object.
> >
> > 3. Using that OutputStream object I'm writing the content of the PDF file
> > (using the PDFlib APIs), with PDFDocument.open(OutputStream) to create the
> > document object.
> >
> > 4. After writing the content of the PDF I'm closing the PDF file
> > (PDFDocument.close()).
> >
> > In this context, I'd like to know: don't I need to specify the encoding
> > of the PDF document through the setContentType API? Say I'm creating a
> > PDF file with Japanese content and I want the file to be encoded as
> > Shift_JIS.
> >
> > Any pointer/information on this would be highly appreciated.
> >
> > Regards,
> > Sourav
> >
> >

Received on Tuesday, 24 May 2005 11:33:16 UTC
