- From: Paul Libbrecht <paul@activemath.org>
- Date: Fri, 04 Jul 2003 14:28:57 +0200
- To: www-math@w3.org
-------- Original Message --------
Subject: Re: xsl transformation and special characters
Date: Fri, 04 Jul 2003 14:25:00 +0200
From: Paul Libbrecht <paul@activemath.org>
To: Morten Andersen <mortena@mip.sdu.dk>
References: <5.2.1.1.0.20030630150751.01c89c18@mailhost.mip.sdu.dk>

Morten,

Your bug seems similar to one we have encountered. It goes as follows:

The HTML specifications do NOT give a way to specify the text encoding
of the data being sent to the script/servlet/... The reason: the
enctype attribute of the form element is meant to contain this stupid
default value, application/x-www-form-urlencoded. This "encoding" is
just a low-level way of transmitting bytestreams and says that the byte
2E should become %2E and...
However, it is not clear which text encoding should be used to prepare
this bytestream.

The current practice we have observed is that the browser sends the
text using the same encoding as the page it originates from (some
browsers even have a setting for this).

What you are seeing seems to be a UTF-8 translation of the given
letters. You're now left with decoding this stream.

On top of this comes a bug in Tomcat that we have faced: Tomcat has
also skipped over that missing bit of specification (which is needed
from all other sides anyway), so request.getParameter(paramName)
returns a String made of characters whose low bytes are the bytes of
the x-www-form-urlencoded data, rather than the decoded text.

We thus had to pipe the results of request.getParameter() through a
UTF-8 java.io.InputStreamReader, and we are now able to accept Russian,
math, and just about anything in Unicode 3 in our forms (this is in the
ActiveMath project).

Hope that helps.

Paul

Morten Andersen wrote:
> Dear mathml experts
>
> I'm trying to develop an application where text and mathematics can be
> edited online using a browser. I've run into a few problems doing that:
>
> I've made an XHTML page where the end user should be able to edit a
> text containing special letters like the Danish æ, ø and å. This page
> is rendered just fine, but something happens as the text is sent from
> the textarea to the server that translates the letters:
>
>     * æ > æ
>     * ø > gø
>
> This does not happen in an HTML page. The problem first occurred when
> I started using XML. Here is part of the XML page:
>
> <?xml version="1.0" ?>
> <?xml-stylesheet type="text/xsl" href="pmathml.xsl"?>
> <!--
> pref:renderer="techexplorer-plugin"
> pref:renderer="techexplorer"
> pref:renderer="css"
> pref:renderer="mathplayer"
> pref:renderer="mathplayer-dl"
> -->
> <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="da">
> .....
>
> <TEXTAREA CLASS="editor">
> æøå
> </TEXTAREA>
> ...
>
> When I submit the form with the textarea, the letters æ, ø and å are
> translated, but only in IE6 and Opera, not in Netscape or Mozilla.
>
> It seems that the XSL transformations are causing the problems with
> the special characters. So I tried a version like the one below:
>
> <?xml version="1.0"?>
> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1 plus MathML 2.0//EN"
> "http://www.w3.org/Math/DTD/xhtml-math11-f.dtd">
>
> <html xmlns="http://www.w3.org/1999/xhtml"
> xmlns:m="http://www.w3.org/1998/Math/MathML">
> <head>
> <OBJECT ID="MathPlayer"
> CLASSID="clsid:32F66A20-7614-11D4-BD11-00104BD3F987"></OBJECT>
> <?IMPORT NAMESPACE="m" IMPLEMENTATION="#MathPlayer"?>
>
> This solves the problem with special-character input, but it causes
> the following problems:
>
>     * I have to write the MathML like <m:math>... instead of
>       <math xmlns="..." >; this is a problem if I want to make it
>       possible to copy MathML from programs like Mathematica.
>     * I can't use the WEB-EQ input control without writing a class-id
>       statement in the header.
>     * I'm not sure that I can use both MathPlayer and the input
>       control on the same page.
>
> To summarize:
>
> 1) If I use XSL transformations, the user input from HTML forms in
> Internet Explorer and Opera has special characters like æ, ø and
> mathematical symbols like the integral symbol translated before they
> reach the server.
> 2) If I use the <HTML doctype way, then I have to write the MathML
> like <m:math>... instead of <math xmlns="..." >; this is a problem if
> I want to make it possible to copy MathML from programs like
> Mathematica, plus I can't use the WEB-EQ input control without writing
> a class-id statement in the header.
>
> I think the solution could involve adding some extra XSL files in the
> XSL approach, or translating the MathML on the server depending on
> whether it must be edited or viewed.
>
> How can this be solved?
>
> Thanks
>
> Morten Andersen
> Denmark
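
A note on the decoding step described in Paul's reply above: the following is a minimal sketch, not the ActiveMath code. It assumes the container has turned each raw byte of the submitted form data into one character (so every character is in the range 0-255), and that the browser actually sent UTF-8. The reply mentions a UTF-8 java.io.InputStreamReader; this sketch swaps in the String/getBytes route, which should have the same effect. The class and method names (FormParameterDecoder, redecodeAsUtf8) are hypothetical.

```java
import java.nio.charset.StandardCharsets;

public class FormParameterDecoder {

    /**
     * Re-interprets a String whose characters each hold one raw byte
     * (as if the bytes had been decoded as ISO-8859-1) as UTF-8 text.
     * Assumption: the browser submitted the form data encoded as UTF-8.
     */
    static String redecodeAsUtf8(String misdecoded) {
        if (misdecoded == null) {
            return null; // parameter was absent from the request
        }
        // ISO-8859-1 maps chars 0-255 back to the same byte values,
        // so this recovers the bytes the browser originally sent.
        byte[] rawBytes = misdecoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(rawBytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Simulate what a byte-per-char container would hand back for "æøå".
        String sent = "\u00e6\u00f8\u00e5";
        String fromContainer = new String(
                sent.getBytes(StandardCharsets.UTF_8),
                StandardCharsets.ISO_8859_1);

        String repaired = redecodeAsUtf8(fromContainer);
        System.out.println(repaired.equals(sent)); // true if the round trip worked
    }
}
```

In a servlet this would wrap the value returned by request.getParameter(paramName). If the page (and hence the submitted data) uses some encoding other than UTF-8, the second charset would have to change accordingly.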
Received on Friday, 4 July 2003 08:32:09 UTC