- From: Addison Phillips <addison.phillips@quest.com>
- Date: Tue, 24 May 2005 08:01:28 -0700
- To: "Bruno Girin" <Bruno.Girin@cambista.com>, <www-international@w3.org>
No, that's not right. The PDF file is a binary file. The text *INSIDE* the file (i.e. the text being encoded by the PDF library) has an encoding. But PDF files themselves do not have or need a charset parameter. Putting a charset parameter on a Content-Type of "application/*" is just silly.

Your browser does not read the text in a PDF. It calls the Acrobat plug-in, which reads the Acrobat file.

Addison

Addison P. Phillips
Globalization Architect, Quest Software
Chair, W3C Internationalization Core Working Group

Internationalization is not a feature.
It is an architecture.

> -----Original Message-----
> From: www-international-request@w3.org [mailto:www-international-request@w3.org] On Behalf Of Bruno Girin
> Sent: 2005-05-24 4:34
> To: www-international@w3.org
> Subject: FW: Creating a PDF file with UTF-8 encoding through Servlet
>
> Sorry, sent this message to Khurram only, not the list.
>
> -----Original Message-----
> From: Bruno Girin
> Sent: Tue 5/24/2005 11:39 AM
> To: Khurram Ilyas
> Subject: RE: Creating a PDF file with UTF-8 encoding through Servlet
>
> Addison, that's the whole point of Sourav's question: a PDF file is a
> binary file that contains text data. As a consequence, you need to specify
> the encoding of the text data so that the computer that will read the PDF
> can properly read the binary stream and translate it into the correct
> characters to display.
>
> To achieve this, you need 3 things:
> 1. the servlet needs to encode the binary stream using an encoding that is
>    able to encode the totality of the character set used in the document.
>    If it is Japanese, the best encoding is probably UTF-8.
> 2. the servlet needs to specify that same encoding in the content type
> 3. the PDF file presumably needs to contain encoding data so that the file
>    can be re-read by a PDF viewer independently of the download
>
> To do 1, you need to enclose the output stream in an OutputStreamWriter
> that specifies the encoding, such as:
>     Writer wout = new OutputStreamWriter(out, "UTF-8"); // out being the output stream obtained in Sourav's step 2
> then you call wout.write() and other Writer methods
>
> To do 2, you just specify the encoding as part of the content type:
>     response.setContentType("application/pdf; charset=utf-8");
>
> 3 is dependent on the API you're using to create your PDF file. I don't
> know PDFlib so can't tell you what the call is.
>
> Good luck with this.
>
> Bruno Girin
> Chief Technical Architect
> Cambista Technologies Ltd
>
> -----Original Message-----
> From: www-international-request@w3.org on behalf of Khurram Ilyas
> Sent: Fri 5/20/2005 11:04 PM
> To: addison.phillips@quest.com; SOURAVM@infosys.com; www-international@w3.org
> Subject: RE: Creating a PDF file with UTF-8 encoding through Servlet
>
> Instead of
>
>     response.setContentType("application/pdf");
>
> try
>
>     response.setContentType("application/download");
>
> Best Regards,
> Khurram Ilyas
>
> > From: "Addison Phillips" <addison.phillips@quest.com>
> > To: "souravm" <SOURAVM@infosys.com>, <www-international@w3.org>
> > Subject: RE: Creating a PDF file with UTF-8 encoding through Servlet
> > Date: Fri, 20 May 2005 09:14:10 -0700
> >
> > PDF files are binary, not text, objects.
> >
> > Addison
> >
> > Addison P. Phillips
> > Globalization Architect, Quest Software
> > Chair, W3C Internationalization Core Working Group
> >
> > Internationalization is not a feature.
> > It is an architecture.
> >
> > > -----Original Message-----
> > > From: www-international-request@w3.org [mailto:www-international-request@w3.org] On Behalf Of souravm
> > > Sent: 2005-05-20 6:13
> > > To: www-international@w3.org
> > > Subject: Creating a PDF file with UTF-8 encoding through Servlet
> > >
> > > Hi All,
> > >
> > > I need to create and return a PDF file from a Servlet as a response to
> > > an HTTP request (typical download functionality).
> > >
> > > Now for this purpose I'm -
> > >
> > > 1. First setting the following fields in the response object -
> > >    response.setContentType("application/pdf");
> > >    response.setHeader("Pragma", "");
> > >    response.setHeader("Cache-Control", "");
> > >    response.setDateHeader("Expires", 0);
> > >
> > > 2. After that I'm creating an OutputStream object from the response
> > >    object.
> > >
> > > 3. Using that OutputStream object I'm writing the content of the PDF
> > >    file (using APIs of PDFlib). Using PDFDocument.open(OutputStream) to
> > >    create the document object.
> > >
> > > 4. After writing the content of the PDF I'm closing the PDF file
> > >    (PDFDocument.close()).
> > >
> > > In this context, I'd like to know, don't I need to specify the encoding
> > > of the PDF document through the setContentType API? Say, I'm creating a
> > > PDF file with Japanese content and I want the encoding of the file to
> > > be Shift_JIS.
> > >
> > > Any pointer/information on this would be highly appreciated.
> > >
> > > Regards,
> > > Sourav
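
As a rough illustration of the point at the top of this message (a PDF is a binary object, so its bytes should go to the response's OutputStream untouched and the Content-Type should stay "application/pdf" with no charset), the sketch below compares writing bytes directly with pushing them through an OutputStreamWriter. It is only a sketch under stated assumptions: the class name PdfBytesDemo, its helper methods, and the sample bytes are hypothetical stand-ins, not PDFlib output, and a ByteArrayOutputStream stands in for the stream a servlet would get from response.getOutputStream().

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class; none of these names come from the thread or from PDFlib.
class PdfBytesDemo {

    // Stand-in for what a PDF library would emit: an ASCII header line
    // followed by a comment line containing bytes above 0x7F.
    static byte[] samplePdfBytes() {
        return new byte[] {
            '%', 'P', 'D', 'F', '-', '1', '.', '4', '\n',
            '%', (byte) 0xE2, (byte) 0xE3, (byte) 0xCF, (byte) 0xD3, '\n'
        };
    }

    // Binary path: the bytes reach the stream unchanged.
    static byte[] writeRaw(byte[] pdf) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(pdf);
        return out.toByteArray();
    }

    // Text path: each byte is treated as a character and re-encoded as UTF-8,
    // so every byte above 0x7F is expanded into a two-byte sequence.
    static byte[] writeThroughWriter(byte[] pdf) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        Writer writer = new OutputStreamWriter(out, StandardCharsets.UTF_8);
        for (byte b : pdf) {
            writer.write(b & 0xFF);
        }
        writer.flush();
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] pdf = samplePdfBytes();
        byte[] raw = writeRaw(pdf);
        byte[] viaWriter = writeThroughWriter(pdf);

        System.out.println("raw copy identical:    " + Arrays.equals(pdf, raw));
        System.out.println("writer copy identical: " + Arrays.equals(pdf, viaWriter));
        System.out.println("lengths: " + pdf.length + " vs " + viaWriter.length);
    }
}
```

With these sample bytes the raw copy matches the input byte for byte, while the copy that went through the Writer is longer and no longer matches. The encoding of the Japanese text itself is a matter for the PDF library when it builds the document, not for the HTTP header or the stream wrapper.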
Received on Tuesday, 24 May 2005 15:01:32 UTC