Re: Checklists for reviews from Mark Davis on 2007-04-25 (public-i18n-its@w3.org from April to June 2007)

From: Mark Davis <mark.davis@icu-project.org>
Date: Wed, 25 Apr 2007 09:39:03 -0700
To: "Richard Ishida" <ishida@w3.org>
Cc: public-i18n-core@w3.org, public-i18n-its@w3.org
Message-ID: <30b660a20704250939y30ad2c48ifa494c202b000fca@mail.gmail.com>
That sounds like a good idea. Some comments below. I know that these are
just off the top of your head, so I'm not trying to be nit-picky at this
point, since the main point is whether the overall idea is good, but just
giving off-the-top-of-my-head responses.

On 4/25/07, Richard Ishida <ishida@w3.org> wrote:
>
>
> A comment made during yesterday's Core telecon set me thinking, as I was
> reading the EMMA specification today.
>
> Often we are making similar comments on specifications because we need to
> ensure that people consider the same issues each time.  Charmod{1} was
> written in an attempt to allow spec developers in other WGs to know what we
> would be looking for before we reviewed their spec - at least in the area of
> characters and encodings - before we reached the LastCall phase.
>
> I've long wanted to better document the types of thing spec writers should
> look for, so that we can reduce the burden of spec reviews and build things
> in to specs rather than retrofit them.
>
> I'd like to encourage the group to try to draw out general best practices
> while reviewing documents and compile them somewhere.  Note that the ITS XML
> best practices document has the potential to effect some of this, although
> they are focused specifically on the audiences of XML schema developers and
> content authors, rather than W3C spec developers, so they have a slightly
> different remit.  There are other obvious areas where this would be helpful
> too, such as time zone and duration handling, etc.
>
> These best practices can be used not only to educate spec developers, but
> also as checklists during reviews.
>
> To kick things off I'm thinking of doing two things:
>
> [1] listing relevant conformance criteria from charmod fundamentals in a
> form like http://www.w3.org/International/techniques/charset linked to
> from a techniques index entry for spec developers considering character
> encoding (where each bp links to the charmod doc for further information)
>
> [2] making a beginning on other lists such as follows
>
> I'm thinking of making these lists very informal to begin with.  We can
> then formalise them further by producing Notes at some later date when time
> allows.
>
> What do you think?
>
>
>
> Here are some quick thoughts I pulled from the top of my head for best
> practices related to language declaration and bidi.  (These may also be of
> use to the ITS effort.)
>

General comment. A lot of these are written as if the document is a
"document" as the term is normally used by lay people. A key, and perhaps
predominant, use of XML is general structured data, which may or may not
represent a document; it may just be communicating some data, more like a
fragment of a database. Example:

<location>
  <latitude>37.529</latitude>
  <longitude>-25.97</longitude>
</location>

It should *not* be a requirement to have language or bidi stuff anywhere in
this structure.


> =================
> Language declaration


See general comment. There are certainly circumstances where this is not at
all applicable.

> it must be possible to declare the default language of text on the root
> element
>
> it must be possible to declare changes in language at any point in the
> document (this may necessitate the use of a span-like element)


This is a bit too strong. If an element doesn't contain linguistic data, you
don't need to be able to declare a language. Maybe something like:

It must be possible to declare changes in language over any range of text,
or other linguistic content (e.g. a binary image containing visible text, or
binary audio containing spoken language), in the document.

> language values must conform to BCP 47


This should be the first item.

> xml:lang should be used to declare the language of text in XML formats
> whenever possible
>
> language information that does not describe the text in the document
> should not use xml:lang


Why is this necessarily the case? Not arguing, just wondering.

> it must be possible to use xml:lang="", or a similar construct if xml:lang
> is not used, to indicate that language information is undefined for a range
> of text within a document


(Need to define 'similar construct': does that mean or include
xml:lang="und"?)

> absence of a language declaration on the root of a document should mean that
> language information is undefined for the document in question


Use "not supplied" instead of undefined.
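
To make the three cases above concrete (a declared language, an explicit
xml:lang="" reset, and no declaration at all), here is a minimal Python
sketch using the standard library's xml.etree.ElementTree. The function name
effective_languages and the sample markup are illustrative assumptions, not
taken from any spec; None is used to stand for "not supplied".

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the predefined xml: prefix under this namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def effective_languages(root):
    """Map each element to the language that applies to it.

    None  -> no xml:lang on the element or any ancestor ("not supplied")
    ""    -> language explicitly reset with xml:lang=""
    other -> the declared or inherited language tag
    """
    result = {}

    def walk(elem, inherited):
        # An element's own xml:lang wins; otherwise it inherits from its parent.
        lang = elem.get(XML_LANG, inherited)
        result[elem] = lang
        for child in elem:
            walk(child, lang)

    walk(root, None)
    return result


doc = ET.fromstring(
    '<doc xml:lang="en"><p>Hello <span xml:lang="fr">monde</span></p>'
    '<code xml:lang="">x = 1</code></doc>'
)
print({e.tag: lang for e, lang in effective_languages(doc).items()})
# {'doc': 'en', 'p': 'en', 'span': 'fr', 'code': ''}
```

Run against a pure data fragment such as the <location> example, every
element maps to None, which matches the point that such structures need no
language information at all.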

> Bidi
>
> it should be possible to indicate that the default directionality of a
> document is right to left in the root element


For documents that are expected to be displayed, this is true. For general
structured data, no.

> the default directionality for a document should be left-to-right


should be set to be? (that is, the specification for the structure of the
document should say that the default is LTR)
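
As a rough illustration of a spec-defined default, the Python sketch below
resolves direction for every element, falling back to "ltr" when nothing is
declared. The attribute name dir, the constant DEFAULT_DIRECTION and the
function name are hypothetical choices for this example only.

```python
import xml.etree.ElementTree as ET

# Assumption: the format's specification states that the default is LTR.
DEFAULT_DIRECTION = "ltr"


def effective_directions(root, attr="dir"):
    """Map each element to its direction, inheriting from the parent
    and using the spec-defined default at the root."""
    result = {}
    stack = [(root, DEFAULT_DIRECTION)]
    while stack:
        elem, inherited = stack.pop()
        # A declared direction overrides whatever was inherited.
        direction = elem.get(attr, inherited)
        result[elem] = direction
        stack.extend((child, direction) for child in elem)
    return result


doc = ET.fromstring('<doc><p dir="rtl"><q/></p><note/></doc>')
dirs = {e.tag: d for e, d in effective_directions(doc).items()}
print(dirs["doc"], dirs["p"], dirs["q"], dirs["note"])
# ltr rtl rtl ltr
```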

> it should be possible to indicate changes in directional context for any
> range of text in the document (this may necessitate the use of a span-like
> element), and allowing for ltr, rtl, lro, rlo
>
> it should be possible to use &rlo; and &lro; to represent Unicode
> characters ... and ... respectively
>
> the Unicode characters RLE, LRE, RLO, LRO and PDF should not be used or
> required - markup should be available
>
> =================
>
> RI
>
>
>
> {1} http://www.w3.org/TR/charmod/
> ============
> Richard Ishida
> Internationalization Lead
> W3C (World Wide Web Consortium)
>
> http://www.w3.org/People/Ishida/
> http://www.w3.org/International/
> http://people.w3.org/rishida/blog/
> http://www.flickr.com/photos/ishida/
>
>
>
>


-- 
Mark
Received on Wednesday, 25 April 2007 16:39:10 UTC