- From: Richard Ishida <ishida@w3.org>
- Date: Fri, 7 Jul 2006 12:36:23 +0100
- To: "'Martin Duerst'" <duerst@it.aoyama.ac.jp>, "'GEO'" <public-i18n-geo@w3.org>
Hi Martin,

Thanks for your (very detailed) comments. I've been working through them...

> From: Martin Duerst [mailto:duerst@it.aoyama.ac.jp]
> Sent: 03 July 2006 11:49
...
> The document should have a list of best practices right at the top.
> (see e.g. the WebArch doc)

I'll consider that.

>
> Changing the indent (front material up to TOC is indented ca.
> 2cm, starting with 1 Intro, the indent changes to ca. 4cm)
> doesn't make sense. Same with the additional indent for the
> references (which would make sense if the indent were applied
> to the reference text but not to the labels).

Yes, that was just a quick fix. I'll look at it again shortly. I was
wondering whether we could get away from the full width text, since that
makes documents harder to read, but I'm not sure how to do it. I plan to
check out some other recent specs, like MWBP, to see if they have any
good ideas. XXXX

>
> In 3.1:
> "Metadata about the language of the intended audience is
> about the document as a whole."
>
> The final clause sounds a bit unprofessional. Also, the
> repetition of the 'about' should be avoided. What about: "The
> language of the intended audience is metadata on the document
> as a whole."

Actually it is an intentional echo construct intended to draw a parallel
between the two subclauses. It doesn't sound unprofessional to me.

>
> BP1:
> "Always declare the default text-processing language of the
> page, using attributes on the html tag, unless the intended
> audience speaks multiple languages."
>
> The first comma is confusing. The comma seems to indicate
> that it's just e.g. one way of doing things, but then this
> shouldn't be part of the actual BP text.
>

Yes. Fixed.

> further down:
> "the intended audience is expected to read content in more
> than one language (eg. a multilingual blog, or a page aimed
> at more than one language community)": This is confusing.
> For a page aimed at more than one language community, isn't the
> reason this page contains more than one language just that
> the audience is NOT expected to read content in more than one
> language? I.e. for the Canadian example, English and French
> are there not because everybody reads English and French, but
> just because some people don't read both languages.

I think you are constraining your thinking too much to just one of the
scenarios mentioned above. A multilingual blog in two languages is
typically aimed at an audience that speaks both languages, and switches
between languages depending on the preference of the writer. A page with
parallel content is a somewhat different scenario, although the audience
of the document itself is still a multilingual community.

>
> "it may make more sense to declare the default
> text-processing language lower down in the document than in
> the html tag."
> 'lower down' is extremely colloquial, and does not make clear
> that this is lower down in the document hierarchy, and not
> lower down in the document flow. Also, 'html tag' is
> colloquial. There are two html tags, a start tag and an end tag.

This isn't a specification, so I'm happy to be a little colloquial, as
long as the meaning is not impaired. I suspect that people reading this
will be aware that putting language information in the end tag probably
won't be a good idea; we don't need to spell that out for them ;-)

In fact, having just heard a great deal of feedback from designers about
WCAG 2.0, and having reviewed it myself, I'm keen to avoid sounding too
'speccy', and find myself wondering whether I should try to deformalise
some more of the text. (But I won't.)

>
> "Best Practise 2: html declarations for multilingual docs"
> Practise -> Practice; docs->documents; 'html declarations'-> expand

Changed.

>
> BP1, 2, and 6 all deal with putting something on the HTML tag.
> While I agree it's important because in the frequent
> monolingual case, it's the only thing people have to do, it
> still somehow feels like overkill.
>
> "Best Practise 4: Should I use the lang or xml:lang attribute?"
> Some BP titles directly give the BP. This is best; ideally,
> all should be like that. BP2 just gives the topic, so it
> might be improved. Having a BP as a question is really confusing.
> BP7 again is a problem; it's a subclause, easily the worst
> grammatical entity to go into a title.

Hmm. This is not easy. See a separate mail to follow.

>
> Example 15/16 use a ">"/">" that isn't needed for this
> example and probably will confuse a few people.

I replaced it with an image, although I'm not sure that's much better:
it moves away from being a real example, and adds lots of markup to the
example...

>
> BP8: the <code> text is too small, the style sheet has to be
> fixed

That's a browser issue. There is no special sizing applied to the <code>
text, and mine looks the same size as the normal text (and certainly
wider). XXXX

> This BP should also say that the HTTP header may be
> preferable because on some servers (Apache in particular),
> language negotiation is done by looking at the headers in a
> HTTP HEAD subrequest.

I don't understand why the HTTP header may be preferable in the case of
language negotiated content. The language negotiation process does seem
to have the side-effect of sending Content-Language information with the
HTTP header, but I don't see why that relates to advice to use HTTP
headers for declaring language. The negotiation is based on information
that comes from the browser, not the document.

I did, however, add the following para:

"Sometimes a server has been set up to automatically serve a
language-specific version of a resource based on the user's browser
settings (content negotiation). In this case, your server is likely to
send language information in the Content-Language header."
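As a rough illustration of the content negotiation scenario that paragraph describes, here is a minimal Python sketch. It assumes a hypothetical server holding English and French versions of a resource. The function names (parse_accept_language, negotiate, response_headers) are invented for illustration, and the matching is a simplified prefix match, not what Apache or any other particular server actually does.

```python
def parse_accept_language(header):
    """Return the language ranges in an Accept-Language header, best first."""
    ranges = []
    for item in header.split(","):
        item = item.strip()
        if not item:
            continue
        tag, _, params = item.partition(";")
        params = params.strip()
        q = 1.0  # a range without an explicit q value counts as 1.0
        if params.startswith("q="):
            try:
                q = float(params[2:])
            except ValueError:
                q = 0.0  # treat an unparsable q value as "not acceptable"
        ranges.append((tag.strip().lower(), q))
    # sorted() is stable, so ranges with equal q keep their header order
    ordered = sorted(ranges, key=lambda r: -r[1])
    return [tag for tag, q in ordered if q > 0]


def negotiate(header, available, default):
    """Pick the first available language matching the browser's preferences."""
    for rng in parse_accept_language(header):
        for tag in available:
            low = tag.lower()
            # simplified matching: exact tag, wildcard, or range as a prefix
            if rng == "*" or low == rng or low.startswith(rng + "-"):
                return tag
    return default


def response_headers(header, available=("en", "fr"), default="en"):
    """Headers a hypothetical server might send with the chosen version."""
    lang = negotiate(header, available, default)
    return {"Content-Language": lang, "Vary": "Accept-Language"}
```

The point of the sketch is only that the choice is driven by the Accept-Language header from the browser, while Content-Language is a by-product describing the version that was served.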
>
> BP10: "Dividing parallel text at the highest possible level,
> can simplify..."
> Comma seems unnecessary/counterproductive.

Fixed.

>
> Best Practise 11: Use RFC3066bis or its successor: There is
> absolutely no mention of BCP 47, but this may be very helpful
> in this case. BCP 47 is the number the IETF uses to denote
> "RFC3066(bis) or its successor".
>
> Best Practise 12: Use short language codes: These are
> language tags, not language codes (language codes are those
> things in ISO 639-x).

Fixed.

>
> "Although RFC 3066bis introduces script tags, as RFC 3066bis
> co-author, Addison Phillips, writes, "For virtually any
> content that does not use a script tag today, it remains the
> best practise not to use one in the future"."
> This is a nice quote, but doesn't look appropriate in the
> context of this document.

I don't see why.

>
> "In the past, there was often some confusion about which ISO
> language code to choose, since there often 2-letter and
> 3-letter alternatives for the same language (and sometimes
> two 3-letter alternatives). This question is now moot because
> you should only use language tags specified in the
> IANALanguage Subtag Registry, and only one subtag exists per
> language in that registry (the shortest one)."
> This implies that this question was open in RFC 3066. This is wrong.
> RFC 3066 made it very clear that if there was a two-letter
> code, that had to be used, and there were two-letter codes
> for all languages that had two three-letter codes. Pointing
> to the subtag registry (after adding a space, probably best
> by including IANA into the link) is a good idea, but it
> shouldn't result in creating confusion where there was none.

Reworded.

>
> Best Practise 13: Use Hans and Hant codes It would be better
> if this practice was worded more generally, e.g.
> "Use script codes to distinguish language variants that
> differ by script, rather than using a country where this
> variant is prevalent."

No.
Definitely don't want to do this, since I don't think we need to
encourage people to use script codes in general. On the other hand,
general use of zh-Hant and zh-Hans is a very good idea, needs to be
widely known, and is currently in need of more visibility.

>
> [Ed. note: This best practise has also been rewritten to
> reflect changes in RFC 306bis.] RFC 306bis -> 3066bis
>
> BP14: Yet another way to title a BP: pros and cons. Adding
> just one word (Consider) creates something that says what it
> means out of something that doesn't sound like a best practice at all.
>
> Example 22 uses the actual attribute value, since these
> two-letter codes are typically recognizable by speakers of
> the language.
> I strongly doubt this. Have you done some research? Did you
> ask people on the street, or is there some data you are
> basing this on?

I think most people will, yes. Have you evidence to the contrary? I have
softened the wording, nonetheless.
Received on Friday, 7 July 2006 11:36:34 UTC