- From: Tex Texin <tex@i18nguy.com>
- Date: Sat, 31 May 2003 03:13:28 -0400
- To: Martin Duerst <duerst@w3.org>
- CC: public-i18n-geo@w3.org
Hi,

Good comments. Although I have some disagreements, I liked your analysis and think the points are worth discussing.

0) Question wording. Yes, we agreed to change the wording to something close to what you suggested.

1) Length. I think people may have a general idea of what controls are but may not know the specifics, especially the specifics of the ranges and the ranges in Unicode. We could break the piece into multiple questions, but I wonder how appropriate these backgrounders are for an i18n Q&A list, especially this early on. We could move more of the background explanation below the question and answer. I think you are right that the question and answer should be succinct and at the top of the page, but I don't see a problem with additional clarifying and supporting information being available on the page after the main point is discussed. If there is a strong objection to the background info, I would be happy to move it to a page on my web site; GEO can have the short version and optionally link to my page for more info.

2) Relevance. I understand your questioning the topic; I would have done the same. It came about because I was in fact asked the question last week. Controls are not only used for manipulating devices. They have other uses. An application development environment I am familiar with does a lot of value-list processing. Depending on the nature of the data, the list separator is changed. E.g., if it's a list of European decimals, they would not want commas as a separator. To avoid conflicts between the list values and separators, the general routines use 0x01, 0x02, etc. as separators. So they have lots of data in databases using these values. (Yes, they could have adopted escape mechanisms instead.) They ran into problems writing the data to XML. Some software liked it, others didn't. When they looked into the errors due to control codes not being allowed, they needed advice.
Which are the disallowed characters, and what are the workarounds? Hence the article. I believe there may be a lot of data using controls, and as with this group, people may not have time to develop better solutions than writing the data out as NCRs.

3) So because of 2, I claim that if XML is for data interchange, support for interchange of controls is needed. I can agree the needs are exotic. You can argue that the data should instead be cleaned up, but that is impractical in some cases. In any event, it is worthwhile to let people know what is and is not doable in *ML. I don't mind giving more emphasis to cleaning up the data. I also don't mind emphasizing that control codes are to be avoided, and are bad for scalability and on the web. I would disagree with saying never use controls in XML. I presume support for controls as NCRs was added because some needs for it were identified.

4) Separate rows for NL. I agree.

5) Encoding. I believe what we said is that if the data is in fact binary, encoding is an option. Essentially, if it is binary, it is not an i18n issue.

Richard, if you want to finish the changes you were going to make, you can address Martin's comments or pass it back to me and I'll address them.

tex

Martin Duerst wrote:
>
> Hello Tex,
>
> Some more comments on your Q&A.
>
> Overall, I think that the answer is much too long. It not only
> answers 'How do ... support control codes', but also 'what are
> control codes', and so on. But the question assumes a basic
> knowledge of control codes. People who don't know these
> are not even interested in reading the answer.
>
> Also, I guess the real question is not how HTML and XML
> support control codes, but "How can I represent control
> codes in HTML or XML".
>
> The basic message also should be improved. (X)HTML is a
> textual format used to represent text. There is absolutely
> no need to use control codes in (X)HTML.
> If anybody thinks
> otherwise, they didn't understand (X)HTML. I don't remember
> having been asked about control codes in (X)HTML at all.
> This should be clearly reflected in the answer.
>
> XML in general is used both for text and for data. So
> there may be some interesting use cases for control
> codes in XML. The typical example would be an XML
> format for control code sequences for terminals
> (i.e. an XML version of a unix termcap file).
>
> Apart from such rather exotic examples, the main reason
> that there are control codes in data usually is one of
> the following (most probably in the following order):
>
> - Pure garbage. The right thing is to clean up your data.
>
> - Old ways of representing data (starting with using Backspace
>   to get accented versions of characters). The right thing
>   is to convert your data, i.e. by doing the correct transcoding
>   or by adding markup.
>
> In the table, I suggest having separate rows for
> CR/LF/TAB and for NEL (which is special in XML 1.1).
>
> The page says: "An alternative is to encode the data. For example,
> encode the data as base64 or as hexadecimal values, to ensure only
> supported characters are used in the markup language text."
>
> I'm very surprised to see this on an i18n-related page.
> What this will do is throw out of the window
> any and all i18n features that XML has. So from an i18n
> viewpoint, we should not recommend it; we should indeed
> clearly recommend against it.
>
> Hope this helps.
>
> Regards, Martin.
>
> At 12:49 03/05/28 -0400, Tex Texin wrote:
> >I am not sure why, but the geo list isn't distributing (my?) mail since last
> >night.
> >
> >Here is the controls page for q&a today.
> >I may be a little late to the meeting.
> >
> >http://www.i18nguy.com/test/controls.htm
> >
> >sorry, I don't have everyone's email. (maybe that's a good thing. ;-) )
> >
> >tex
> >
> >--
> >-------------------------------------------------------------
> >Tex Texin   cell: +1 781 789 1898   mailto:Tex@XenCraft.com
> >Xen Master                          http://www.i18nGuy.com
> >
> >XenCraft                            http://www.XenCraft.com
> >Making e-Business Work Around the World
> >-------------------------------------------------------------

--
-------------------------------------------------------------
Tex Texin   cell: +1 781 789 1898   mailto:Tex@XenCraft.com
Xen Master                          http://www.i18nGuy.com

XenCraft                            http://www.XenCraft.com
Making e-Business Work Around the World
-------------------------------------------------------------
Received on Saturday, 31 May 2003 03:14:03 UTC
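
The question raised in point 2 (which characters are disallowed, and what the workarounds are) can be illustrated with a small sketch. It is not taken from the Q&A page under discussion; the function names are hypothetical, and the ranges follow my reading of the Char production in XML 1.0 and the Char and RestrictedChar productions in XML 1.1. It assumes the output document declares version="1.1", since XML 1.0 does not accept C0 controls such as 0x01 even as NCRs. The last helper shows the base64 option from point 5 for data that is genuinely binary.

```python
import base64

def allowed_in_xml10(cp: int) -> bool:
    """True if the code point matches the XML 1.0 Char production."""
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )

def allowed_in_xml11(cp: int) -> bool:
    """True if the code point matches the XML 1.1 Char production."""
    return (
        0x1 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )

def restricted_in_xml11(cp: int) -> bool:
    """True for XML 1.1 RestrictedChar: legal only as a character reference."""
    return (
        0x1 <= cp <= 0x8
        or cp in (0xB, 0xC)
        or 0xE <= cp <= 0x1F
        or 0x7F <= cp <= 0x84
        or 0x86 <= cp <= 0x9F
    )

# Markup-significant characters that always need escaping in content.
_MARKUP = {"&": "&amp;", "<": "&lt;", ">": "&gt;"}

def escape_for_xml11(text: str) -> str:
    """Escape text for element content in an XML 1.1 document.

    Restricted controls are written as hexadecimal NCRs. Literal CR and NEL
    pass through unchanged here, so a parser's line-end normalization may
    still alter them; this sketch does not try to preserve them.
    """
    out = []
    for ch in text:
        cp = ord(ch)
        if not allowed_in_xml11(cp):
            # U+0000, surrogates, U+FFFE and U+FFFF have no XML 1.1 form at all.
            raise ValueError(f"U+{cp:04X} cannot be represented in XML 1.1")
        if ch in _MARKUP:
            out.append(_MARKUP[ch])
        elif restricted_in_xml11(cp):
            out.append(f"&#x{cp:X};")
        else:
            out.append(ch)
    return "".join(out)

def encode_binary(data: bytes) -> str:
    """Base64-encode data that is binary rather than text."""
    return base64.b64encode(data).decode("ascii")

if __name__ == "__main__":
    # A value list of European decimals joined with 0x01 as the separator.
    values = "\x01".join(["1,5", "2,75"])
    print(escape_for_xml11(values))
    print(encode_binary(b"\x00\x01\x02"))
```

For data that must stay in XML 1.0, the same check (`allowed_in_xml10`) can at least report which characters will be rejected, leaving the choice between cleaning up the data, adding markup, or encoding it.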