Minutes 2005-07-13: GEO telecon from Richard Ishida on 2005-07-19 (public-i18n-geo@w3.org from July 2005)

From: Richard Ishida <ishida@w3.org>
Date: Tue, 19 Jul 2005 12:06:10 +0100
To: "GEO" <public-i18n-geo@w3.org>
Message-Id: <20050719110609.226414F127@homer.w3.org>
Minutes 2005-07-13: GEO telecon, at 17:00 UTC/GMT, 10:00 Seattle, 13:00 Boston, 18:00 London, 19:00 Paris, 03:00 Melbourne



ATTENDEES 

Deborah Cawkwell (BBC) 
David R Clarke (University of Sheffield) 
Richard Ishida (W3C, Chair) 




APOLOGIES 

Molly Holzschlag (No affiliation) 
Russ Rolfe (Microsoft) 



ADDITIONAL ITEMS 

Good news - AC (National State Library of Victoria) to re-join 

Questionnaire (http://www.w3.org/International/questions/test.html) 
'Who are you' question added & agreed by meeting 
Agreed to add 'eg' 

Felix will no longer be attending GEO telecons.  He will be focusing more on ITS and Core work.



ACTIONS 

RI What are NCRs & entities?, make editing changes listed in Discussion 
RI What are NCRs & entities?, after edits, etc, agreed to send out for wide review 

ALL  What should I consider wrt moving to UTF-8?, comment 
DC  What should I consider wrt moving to UTF-8?, do research & make editing changes listed in Discussion 
DC  What should I consider wrt moving to UTF-8?, (aim to) send out for wide review 

MEETINGS 

NEXT WEEK 

What should I consider wrt moving to UTF-8? 
>> Send for wide review? 
FAQ Initial draft RI Changing page encoding 
>> Initial comments 
FAQ Wide review RI Using <select> to Link to Localized Content (former wiki) 
>> Publish? 
FAQ Initial draft AP xml:lang in XML document schemas 
>> send for wide review 



REVIEW OF GEO WORK ITEMS 

IN THE PIPELINE 

AC 2 items - should come back online with AC 
Meta FAQ - RI needs more time 
Getting Started - MH to start working on that very shortly 

REVIEW & COMMENTS 

RI working on tutorials 
RI working on re-writing XSLT in a form which could be shared by other i18n authors (now 3) 
RI working on improving accessibility of site 
Techniques - Language declaration - now in wide review 
FAQ - xml:lang - Addison working on comments from next week; will prob attend next week 
XSLT for multilingual output after DC's other FAQ (Unicode) 
Changing page encoding scheduled for next week 



DISCUSSIONS 

FAQ: WHAT ARE NCRS & ENTITIES? 

Considered terminology: 'escape' (used currently) vs 'escape sequences' 
The latter may be an older term 
Character model uses 'character escape' 
ACTION RI to research whether 'escape' or 'character escape' is better
ACTION RI add link within FAQ to character model 

Discussion of use of abbreviated forms, eg "doesn't" as opposed to "does not" 
Desire to retain less formal style of the former 

ACTION RI delete "animal" - doesn't add anything & confusing. 

ACTION RI Character entity: point to a list of them within the text even though they are not recommended, but can be used & useful to see the list.

ACTION RI after edits, etc, agreed to send out for wide review 
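
For readers of these minutes, the two kinds of escape discussed above can be illustrated with a minimal Python sketch. The sample character (é) and the variable names are illustrative assumptions and are not taken from the FAQ text.

```python
import html

# A numeric character reference (NCR) identifies a character by its
# Unicode code point, written in decimal or hexadecimal.
decimal_ncr = "&#233;"
hex_ncr = "&#xE9;"

# A character entity reference uses a predefined name instead of a number.
entity = "&eacute;"

# All three escapes resolve to the same character, U+00E9.
resolved = [html.unescape(s) for s in (decimal_ncr, hex_ncr, entity)]

# Going the other way: replace anything ASCII cannot represent with an NCR.
escaped = "café".encode("ascii", errors="xmlcharrefreplace").decode("ascii")
```

The last line shows one common reason to use escapes at all: the page encoding (here ASCII) cannot represent the character directly.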

FAQ: UPGRADING FROM LANGUAGE-SPECIFIC LEGACY ENCODING TO UNICODE ENCODING 

Felix commented that it was technically uneven; could he (or anyone else) point out where? 
ACTION ALL 

"[MD 22 mar] Maybe mention here that some mobile phones don't yet support UTF-8 (but some do, although with a limited range of characters)." Not sure what to do with this. What advice? 

DCR's point: as people upgrade their phones, the requirement for any legacy encodings will decline. Also, different mobiles support different encodings (for the same language).

FAQ is about what you should "consider", so the answer may not need to be so specific; it need not give the whole answer. A developer with a large mobile phone audience should investigate. Suggested text: "many do support..." It would be interesting to know which.

Move from background section to "how well is Unicode supported for my end users?" in browser support section 

Browser support - check earliest editions that supported Unicode 
2 ways to address: research or cite modern versions without number 
ACTION DC edit browser support 

" No byte order problems as UTF-8 is 8-bit.[[RI I'd leave out the ' as UTF-8 is 8-bit']]" Done 

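The byte order point can be illustrated with a small Python sketch, offered as an aid to readers and not as FAQ text. The sample character is an assumption chosen for illustration.

```python
text = "é"  # U+00E9, sample character chosen for illustration

# UTF-16 uses 16-bit code units, so the two bytes of each unit can be
# serialized in either order (little-endian or big-endian).
utf16_le = text.encode("utf-16-le")
utf16_be = text.encode("utf-16-be")

# UTF-8 is defined directly as a sequence of bytes, so there is only
# one possible serialization.
utf8 = text.encode("utf-8")

byte_order_matters_for_utf16 = utf16_le != utf16_be
```

Here the two UTF-16 serializations contain the same bytes in opposite order, while UTF-8 has a single form.
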
"Some languages currently require a font download; these languages include Pashto, Hindi, Urdu, Bengali. [[RI I'm not clear about this. On what platform - all? - think so but I'll check ]]"  for Windows 98 & below 

Standard installation of an operating system includes suitable fonts for the language selected by the user. Fonts not included in a standard installation can usually be added via menu options; they can also be downloaded. Some languages currently require a font download for Windows 98 and below; these languages include Pashto, Hindi, Urdu, Bengali.

Wrt: Fonts not included in a standard installation can usually be added via menu options; they can also be downloaded. 
I think you want to say: Fonts not available in a standard installation can often be downloaded from free sites by users, and you can point to those sites from your pages. Not desirable (RI) to embed fonts in pages because the technology for that is proprietary and browser-specific.

ACTION DC to investigate font download possibilities 

"Multilingual text rendering engines are built into operating system and browser installation. [[RI we should explain that this is typically needed for 'complex scripts' such as ...]]" Checking  ...Hindi, Urdu, Persian, etc. (anything which has characters that change appearance based on their context)  DONE

ACTION DC edit multilingual text rendering engines: add Arabic & link to RI additional info on this - http://people.w3.org/rishida/scripts/tutorial/all.html#Slide0430

ACTION DC In Background section, 2nd para, starting "numerous.... -> "& you are maybe not sure...." 
In Answer, which Unicode encoding for web pages? add that UTF-16 is often used in the back-end 
Unicode is the Document Character Set for HTML and XML (link to http://www.w3.org/International/questions/qa-doc-charset)

ACTION DC In suitable fonts, correction script display "requires Unicode support at the application or operating system level and availability...." (para 1 in Suitable fonts)

ACTION DC re CSS, sounds like saying don't use named fonts. Example would make point clearly 
font display problems 
"Legacy code pages (eg.  ISO-8859-1/windows-1252): an operating system or browser either has a font installed for that encoding or it doesn't, therefore either the page displays correctly or no characters display (question marks)."

ACTION DC What I don’t need to worry about - strengthen point about markup which is large part of page remaining single byte

See RFC for UTF-8 re why not particularly heavier 
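
The page weight point can be checked with a rough Python sketch. The HTML fragment is a hypothetical example, not taken from the FAQ; it assumes ASCII markup wrapped around Hindi text.

```python
# Hypothetical fragment: ASCII markup around Devanagari text.
page = '<p lang="hi">नमस्ते</p>'

# Split the fragment into ASCII characters (the markup) and the rest (the text).
markup_chars = [c for c in page if ord(c) < 128]
text_chars = [c for c in page if ord(c) >= 128]

# In UTF-8 every ASCII character still occupies exactly one byte.
markup_bytes = len("".join(markup_chars).encode("utf-8"))

# Devanagari characters occupy three bytes each in UTF-8.
text_bytes = len("".join(text_chars).encode("utf-8"))
```

On a real page the markup share is usually much larger than in this fragment, which is why the overall increase tends to be modest.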

ACTION DC Reduce sub-headings, where only one below a heading: 
only one heading 'Page weight' 
same with following section - Don't forget -> Character encoding declaration 

ACTION DC Add information re combining data, eg, in use of server-side includes where encodings must match 
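
A minimal Python sketch of why encodings must match when data is combined. The page and include contents are hypothetical, and plain byte concatenation stands in for what a server-side include does.

```python
# Hypothetical main page, saved as UTF-8.
page_part = "Résumé: ".encode("utf-8")

# Hypothetical included file, still saved in a legacy encoding.
included_part = "Café".encode("iso-8859-1")

# A server-side include simply splices the bytes together.
combined = page_part + included_part

# The result is not valid UTF-8, so a strict decoder rejects it.
try:
    combined.decode("utf-8")
    mismatch_detected = False
except UnicodeDecodeError:
    mismatch_detected = True

# Converting the include to UTF-8 before combining avoids the problem.
fixed = page_part + included_part.decode("iso-8859-1").encode("utf-8")
fixed_text = fixed.decode("utf-8")
```

A browser would not raise an error in this situation; it would typically display a replacement character or garbled text where the mismatched bytes occur.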

ACTION DC Last bit too terse: "You should read character encoding..." 

ACTION DC (aim to) send out for wide review next week 


Received on Tuesday, 19 July 2005 11:06:14 UTC