- From: Richard Ishida <ishida@w3.org>
- Date: Mon, 2 Jun 2003 14:15:14 +0100
- To: "'Tex Texin'" <tex@i18nguy.com>, "'Martin Duerst'" <duerst@w3.org>
- Cc: <public-i18n-geo@w3.org>
Hi Tex, I'd like to publish this today. I just called you but no answer. Can you give me a call, and we can quickly decide the way forward? +44 1753 480 292 Cheers, RI ============ Richard Ishida W3C tel: +44 1753 480 292 http://www.w3.org/International/ http://www.w3.org/People/Ishida/ > -----Original Message----- > From: public-i18n-geo-request@w3.org > [mailto:public-i18n-geo-request@w3.org] On Behalf Of Tex Texin > Sent: 02 June 2003 00:04 > To: Martin Duerst > Cc: public-i18n-geo@w3.org > Subject: Re: controls for GEO discusssion > > > > Martin, > good comments and I generally agree. > We should add an example using xml separators to make that > option clear. > > You didn't say never use controls in XML, but you came close.. > > "There is absolutely no need to use control codes in (X)HTML. > If anybody thinks otherwise, they didn't understand (X)HTML." > > But I made the leap to never. > > I think the one thing I disagree with (somewhat) is the links > vs. notes. I think in some cases the notes are needed to > clarify what is relevant in the links. In other cases, links > alone are fine. > > > Anyway, I think we can fix it to make everyone happy. Richard > wanted to post it last Friday. Richard you can either post it > with your final changes and then I will make further changes > for Martin's comments and give it back to you for subsequent > update, or you can turn it back to me and I will try to fix > further. Or you can fix it altogether if you want. > > > tex > > > Martin Duerst wrote: > > > > Hello Tex, > > > > At 03:13 03/05/31 -0400, Tex Texin wrote: > > > > >Hi, Good comments, although I have some disagreements, I > liked your > > >analysis and think the points worth discussing. > > > > > >0) Question wording. Yes, we agreed to change the wording to > > >something close to what you suggested. > > > > Very good. > > > > >1) Length. I think people may have a general idea of what controls > > >are but may not know the specifics, and especially the > specifics of > > >the ranges and the ranges in Unicode. > > > > Well, those who don't know what they are are probably not > interested > > in using them. And we should avoid giving the impression > that they are > > something important to know. > > > > >We could break the piece into multiple questions, but I > wonder about > > >how appropriate these backgrounders are for an i18n qa list... > > >Especially this early on. We could move more of the background > > >explanation below the question and answer. > > > > Yes, I think this is the best thing to do. > > > > >I think you are right the question and answer > > >should be succint, and at the top of the page but I don't see a > > >problem with additional clarifying and supporting > information being > > >available on the page, after the main point is discussed. > > > > > >If there is a strong objection to the background info, I would be > > >happy to move it to a page on my web site, GEO can have the short > > >version and GEO can optionally link to my page for more info. > > > > > >2) Relevance- I understand your questioning the topic, I > would have > > >done the same. It came about because in fact I was asked > the question > > >last week. Controls are not only used for manipulating > devices. They > > >have other uses. An application development environment I > am familiar > > >with does a lot of value-list processing. Depending on the > nature of > > >the data, the list separator is changed. e.g. if it's a list of > > >european decimals they would not want commas as a > separator. To avoid > > >conflicts between the list values and separators, in general > > >routines, they use 0x01, 0x02, etc. as separators. So they > have lots > > >of data in databases using these values. (Yes, they could have > > >instead adopted escape mechanisms instead.) > > > > I think this is a good, practical example. Whatever they do > in their > > database is not really our problem. > > > > >They ran into problems writing the data to xml. Some > software liked > > >it, others didn't. > > > > The software that tolerated it was faulty. > > > > >When they looked into the errors due to control codes not being > > >allowed they needed advice. Which are the disallowed > characters, and > > >what are the workarounds? Hence the article. > > > > Very good. For the example above, the best thing to do is > > to use XML for the separators. E.g. > value1<sep/>value2<sep/>value3... > > > > That's what XML is for. > > > > >I believe there may be a lot of data using controls, and > as with this > > >group, people may not have time to develop better solutions other > > >than writing the data out as NCRs. > > > > If they want to use XML, they should at least try to use it > the right > > way. And we should help them understand what the right way is. They > > can always decide to do something else on their own. > > > > >3) So because of 2, I claim if XML is for data > interchange, support > > >for interchange of controls is needed. > > > > Your example doesn't show that. XML has a perfect way of exchanging > > structured data. > > > > Also, in some way, the use case above looks like they just needed > > *any* character. Maybe converting it to a PUA character would be > > another solution (but I don't like that, either). > > > > >I can agree the needs are exotic. > > >You can argue that the data should instead be cleaned up, > but that is > > >impractical in some cases. In any event, it is worthwhile to let > > >people know what is and is not doable in *ML. > > >I don't mind giving more emphasis to cleaning up the data. > > > > Yes, I think we should do that. > > > > >I also don't mind emphasizing that control codes are to be > avoided, > > >and are bad for scalability and on the web. > > > > Yes, very good. > > > > >I would disagree with saying never use controls in XML. > > > > I haven't said that. > > > > >I would presume the reason support for controls as NCRs > was added, is > > >because some needs were identified for supporting controls. > > > > > > > > >4) separate rows for NL. I agree. > > > > > >5) encoding. I believe what we said, is that if the data > is in fact > > >binary, encoding is an option. Essentially, if it is binary, it is > > >not an i18n issue. > > > > The paragraph that mentions base64 does not say anything about the > > data being binary, and the problems for i18n if the data is > textual. > > If you have discussed that, and it's going to be updated, > that's good. > > > > Some more points: > > > > - For XML 1.1, clearly say that it is not yet a Recommendation. > > - If possible, don't use notes. In most cases, they can be > > replaced with a direct link. > > - In note 4, change "For example, eacute is the Character > Entity Reference" > > to "For example, é is the Character Entity Reference" > > > > Regards, Martin. > > > > >Richard, if you want to finish the changes you were going to make, > > >you can address Martin's comments or pass it back to me and I'll > > >address them. tex > > > > > > > > >Martin Duerst wrote: > > > > > > > > Hello Tex, > > > > > > > > Some more comments on your Q&A. > > > > > > > > Overall, I think that the answer is much too long. It not only > > > > answers 'How do ... support control codes', but also 'what are > > > > control codes', and so on. But the question assumes a basic > > > > knowledge of control codes. Peolpe who don't know these are not > > > > even interested in reading the answer. > > > > > > > > Also, I guess the real question is not how HTML and XML support > > > > control codes, but "How can I represent control codes > in HTML or > > > > XML". > > > > > > > > The basic message also should be improved. (X)HTML is a textual > > > > format used to represent text. There is absolutely no > need to use > > > > control codes in (X)HTML. If anybody thinks otherwise, > they didn't > > > > understand (X)HTML. I don't remember having been asked about > > > > control codes in (X)HTML at all. This should be clearly > reflected > > > > in the answer. > > > > > > > > XML in general is used both for text and for data. So > there may be > > > > some interesting use cases for control codes in XML. > The typical > > > > example would be an XML format for control code sequences for > > > > terminals (i.e. an XML version of a unix termcap file). > > > > > > > > Apart from such rather exotic examples, the main reason > that there > > > > are control codes in data usually is one of the following (most > > > > probably in the following order): > > > > > > > > - Pure garbage. The right thing is to clean up your data. > > > > > > > > - Old ways of representing data (starting with using Backspace > > > > to get accented versions of characters). The right thing > > > > is to convert your data, i.e. by doing the correct > transcoding > > > > or by adding markup. > > > > > > > > In the table, I suggest to have separate rows for CR/LF/TAB and > > > > for NEL (which is special in XML 1.1). > > > > > > > > The page says: "An alternative is to encode the data. > For example, > > > > encode the data as base64 or as hexadecimal values, to > ensure only > > > > supported characters are used in the markup language text." > > > > > > > > I'm very surprised to see this on an i18n-related page. > What this > > > > will do is that it will throw out of the window any and > all i18n > > > > features that XML has. So from an i18n viewpoint, we should not > > > > recommend it, we should indeed clearly recommend against it. > > > > > > > > Hope this helps. > > > > > > > > Regards, Martin. > > > > > > > > At 12:49 03/05/28 -0400, Tex Texin wrote: > > > > >I am not sure why, but the geo list isn't distributing > (my?) mail > > > since lst > > > > >night. > > > > > > > > > >Here is the controls page for q&a today. > > > > >I may be a little late to the meeting. > > > > > > > > > >http://www.i18nguy.com/test/controls.htm > > > > > > > > > >sorry, I don't have everyone's email. (maybe that's a > good thing. > > > > >;-) ) > > > > > > > > > >tex > > > -- > ------------------------------------------------------------- > Tex Texin cell: +1 781 789 1898 mailto:Tex@XenCraft.com > Xen Master http://www.i18nGuy.com > > XenCraft http://www.XenCraft.com > Making e-Business Work Around the World > ------------------------------------------------------------- >
Received on Monday, 2 June 2003 09:15:51 UTC