[minutes] Internationalization telecon 2014-08-21

http://www.w3.org/2014/08/21-i18n-minutes.html




Text version follows:


Internationalization Working Group Teleconference

21 Aug 2014

    [2]Agenda

       [2] 
https://lists.w3.org/Archives/Member/member-i18n-core/2014Aug/0042.html

    See also: [3]IRC log

       [3] http://www.w3.org/2014/08/21-i18n-irc

Attendees

    Present
           Addison, Richard, JcK, Leandro, Mati, Koji, PLH_(guest)

    Regrets
           Felix

    Chair
           Addison Phillips

    Scribe
           Addison Phillips

Contents

      * [4]Topics
          1. [5]Agenda
          2. [6]Action Items
          3. [7]Info Share
          4. [8]RADAR
          5. [9]Encoding LC
          6. [10]AOB?
      * [11]Summary of Action Items
      __________________________________________________________

Agenda

Action Items

    [12]http://www.w3.org/International/track/actions/open

      [12] http://www.w3.org/International/track/actions/open

    close action-332

    <trackbot> Closed action-332.

Info Share

    leandro: coming to TPAC

    richard: new community group
    ... looking for members
    ... character description group

    <r12a> [13]http://www.w3.org/community/cdl/

      [13] http://www.w3.org/community/cdl/

    richard: describing characters based on strokes

    JcK: ietf vs unicode issue about precombined marks suspended but
    not gone away

    <JcK> Especially if it is to be an action item, that should be
    more like "new apparently precombined characters without
    decompositions" or words to that effect. A different version
    would be whether Unicode "characters

    <JcK> are language and phonetically independent within a script
    (as described in Section 2.2 of the Standard) or the degree to
    which language, phonetic use, and other usage issues that
    cannot be detected by a reader with no context are significant.

RADAR

    [14]https://www.w3.org/International/wiki/Review_radar#Scheduled_Last_Call_reviews

      [14] 
https://www.w3.org/International/wiki/Review_radar#Scheduled_Last_Call_reviews

Encoding LC

    <r12a>
    [15]http://www.w3.org/International/docs/encoding/encoding-doc.html#issue-26154

      [15] 
http://www.w3.org/International/docs/encoding/encoding-doc.html#issue-26154

    <r12a>
    [16]http://www.w3.org/International/docs/encoding/encoding-doc.html

      [16] http://www.w3.org/International/docs/encoding/encoding-doc.html

    (addison summarizes)

    richard: want to reach CR by 12 september
    ... what is outstanding?
    ... working on clarifying, etc.
    ... going through disposition of comments
    ... will help
    ... two sections
    ... first is the LC issues
    ... two raised in bugzilla, rest on winter list
    ... second section is bugzilla items
    ... mostly old
    ... anne not willing to work on the larger items
    ... he sees this as on-going project without specific deadlines
    ... don't know if we can defer

    addison: don't know how we could correct later

    richard: unicode violation text
    ... ken whistler, chief editor, said "none of these are
    violations"
    ... went further and suggested changes
    ... anne will change
    ... another thing causing worry
    ... things are not as clear cut on list of encodings and
    indexes
    ... what characters in what codepoints, etc.
    ... thought it came from survey
    ... of all browsers
    ... but seems to be based on older version of opera (12?)

    <r12a>
    [17]http://www.w3.org/International/tests/repository/encoding/indexes/results-indexes.en.php

      [17] 
http://www.w3.org/International/tests/repository/encoding/indexes/results-indexes.en.php

    addison: did discuss with Anne, many errors are of the same
    type

    richard: above link is from Martin's tests which I adapted
    ... for example 1253

    [18]http://www.w3.org/International/tests/repository/run?manifest=encoding/indexes&test=windows-1253_test

      [18] 
http://www.w3.org/International/tests/repository/run?manifest=encoding/indexes&test=windows-1253_test

    <r12a> windows 1253

    <r12a> Firefox

    <r12a> 0081 -> FFFD

    <r12a> 0088 -> FFFD

    <r12a> 008A -> FFFD

    <r12a> 008C -> FFFD

    <r12a> 008D -> FFFD

    <r12a> 008E -> FFFD

    <r12a> 008F -> FFFD

    <r12a> 0090 -> FFFD

    <r12a> 0098 -> FFFD

    <r12a> 009A -> FFFD

    <r12a> 009C -> FFFD

    <r12a> 009D -> FFFD

    <r12a> 009E -> FFFD

    <r12a> 009F -> FFFD

    <r12a> Chrome

    <r12a> 00AA -> ª instead of FFFD

    <r12a> IE

    <r12a> 00AA -> ª instead of FFFD

    <r12a> 00D2 -> F8FA instead of FFFD

    <r12a> 00FF -> F8FB instead of FFFD

    <r12a> windows 874

    <r12a> Firefox

    <r12a> 23 mismatches in rows 8 and 9

    <r12a> Chrome

    <r12a> 8 mismatches in rows D and F

    <r12a> IE

    <r12a> same as Chrome

    <r12a> ibm866

    <r12a> Firefox

    <r12a> 001A -> 007F

    addison: need to say what to do and get consensus on moving to
    it or allow variations

    <r12a> 001C -> 001A

    <r12a> 007F -> 001C

    <r12a> Chrome

    <r12a> replacement characters after 0080

    <r12a> IE

    <r12a> same as firefox
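
    The per-byte comparisons pasted above can be reproduced in outline
    against any decoder. The sketch below is an illustration only: it
    uses Python's built-in "cp1253" codec as a stand-in for a decoder
    under test (an assumption; it is neither one of the browsers
    discussed nor the Encoding specification's index), and the helper
    name undefined_bytes is hypothetical. It lists the byte values that
    the codec decodes to U+FFFD.

```python
# Illustration only: Python's "cp1253" codec stands in for a decoder under
# test. It is not a browser and not the Encoding specification's index.

def undefined_bytes(codec):
    """Return the byte values (0x00-0xFF) that `codec` decodes to U+FFFD."""
    missing = []
    for value in range(0x100):
        # errors="replace" substitutes U+FFFD for bytes the codec cannot map.
        decoded = bytes([value]).decode(codec, errors="replace")
        if decoded == "\ufffd":
            missing.append(value)
    return missing


if __name__ == "__main__":
    # Print the unmapped bytes in the same hex style as the results above.
    print(" ".join("{:04X}".format(b) for b in undefined_bytes("cp1253")))
```

    Running the same loop against two decoders and comparing the lists
    gives the kind of mismatch report shown in the test results.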

    richard: bit of a shock
    ... calls into question label choices

    addison: need to survey the browsers?

    JcK: claims of some of the wilder changes are troubling

    addison: 1252 thing seems well established
    ... multibyte mappings require deeper insight into what is
    actually happening
    ... any actions?

    richard: plh?
    ... one option, not the best, would be to deeply fork

    addison: hard to fork because it forms a system

    richard: looked for references
    ... mainly section 6
    ... utf-8
    ... a few things
    ... if just going to "normal" CR
    ... could then go back to LC if needed, if tests found issues

    plh: could work, but can't change things ref by html5

    richard: mainly indexes and mainly details
    ... check what css and html actually need

    addison: more references now than just the big 2

    richard: css cares only about determine encoding, I think
    ... define "indexes and such" as the "registry part"

    addison: if only the registry part is subject to correction,
    won't cause problems to advance on CR and then correct later

    plh: not a REC, so would work to fix if testing came back
    negative
    ... did we address enough of issues to move to CR

    richard: probably yes

    addison: may know we have some flaws (bugzilla bugs on mbcs)

    plh: put notes in CR version to call attention to that
    ... on specific encodings

    richard: don't know if IE will change as well?

    plh: best way to get attention is to get CR

    richard: obsolete word removal
    ... provided option, but anne didn't like

    addison: 26514 looks like it could be deferred

    plh: would agree that it could defer

    addison: violation statement

    "is not relevant"

    <r12a> "For the purposes of specifications using this
    specification, that registry is obsolete"

    <r12a> User agents actually use a subset of the IANA Character
    Sets registry in a particular way. This specification documents
    this to establish interoperability on the Open Web Platform.
    Specifications and applications using this specification must
    restrict themselves to the encodings as documented in this
    specification.

    <r12a> Anne's preference: For the purposes of specifications
    using this specification, that registry is no longer relevant.

    addison: could fork and replace or bin that paragraph

    JcK: reuse of labels with different meanings is more
    problematic
    ... hard to tell what an implementation is doing when it sees a
    given label
    ... which standard it follows

    richard: already a real problem

    plh: not looking for convergence... looking for everyone to use
    UTF-8

    <plh> "Authors must use the utf-8 encoding and must use the
    ASCII case-insensitive "utf-8" label to identify it. "

    addison: mention actual goal (UTF-8) in preface?

    <r12a> New protocols and formats, as well as existing formats
    deployed in new contexts, must use the utf-8 encoding
    exclusively. If these protocols and formats need to expose the
    encoding's name or label, they must expose it as "utf-8".
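
    The "ASCII case-insensitive" match on the "utf-8" label quoted
    above can be sketched as follows. This is a minimal illustration,
    not text from the specification: the function name is_utf8_label is
    hypothetical, and the check covers only the single label that
    authors are asked to use, not other labels a decoder may accept.

```python
import string

# Map only A-Z to a-z, so that non-ASCII characters are never case-folded.
_ASCII_LOWER = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)


def is_utf8_label(label):
    """Return True if `label` is "utf-8", compared ASCII case-insensitively."""
    return label.translate(_ASCII_LOWER) == "utf-8"


if __name__ == "__main__":
    for candidate in ("utf-8", "UTF-8", "Utf-8", "utf8", "latin1"):
        print(candidate, is_utf8_label(candidate))
```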

    richard: close remaining
    ... press on with tests
    ... get anne to do edits
    ... and work on false statement
    ... move towards CR

    JcK: refocus the preface on moving to UTF-8 and the rest of
    this is legacy compatibility

    richard: "is no longer relevant"

    <scribe> ACTION: richard: kick off additional discussion with
    anne about preface wording [recorded in
    [19]http://www.w3.org/2014/08/21-i18n-minutes.html#action01]

    <trackbot> Created ACTION-333 - Kick off additional discussion
    with anne about preface wording [on Richard Ishida - due
    2014-08-28].

    JcK: will suggest new text in the next hour

    richard: will look into CR

AOB?

    <r12a> oh, one thing i forgot to mention in the infoshare - we
    published an updated WD of Predefined Counter Styles today !

Summary of Action Items

    [NEW] ACTION: richard: kick off additional discussion with anne
    about preface wording [recorded in
    [20]http://www.w3.org/2014/08/21-i18n-minutes.html#action01]

    [End of minutes]

Received on Wednesday, 27 August 2014 17:41:07 UTC