RE: call for comments: Character Model Part II: Normalization--comments on punctuation/grammar from CE Whitehead on 2008-07-09 (www-international@w3.org from July to September 2008)

From: CE Whitehead <cewcathar@hotmail.com>
Date: Wed, 9 Jul 2008 19:12:58 -0400
To: "Phillips, Addison" <addison@amazon.com>, "www-international@w3.org" <www-international@w3.org>
CC: "public-i18n-core@w3.org" <public-i18n-core@w3.org>
Message-ID: <BLU109-W3565D8D39172915136A1EDB3960@phx.gbl>
Below are my remaining comments on the document, "Character Model for the World Wide Web" (http://www.w3.org/TR/charmod/)!
* * *
USE of word "SPECIFICATION":  I guess it's o.k. in most cases
 
ABSTRACT; Par 1--"Specification" is the right word.
 
1.1; Par 2
Using "specification" to refer to this document is needed both times.
 
1.1; Par 3
Again, I think "specification" is the needed term.
 
1.1; Par 4
 
"The character model described in this specification provides authors of specifications . . . "
> "The character model described in this document provides authors of specifications . . . "
 
COMMENT:  I prefer "document" here--it's up to you!
 
* * *
 
Elsewhere--you can leave as is or replace with "document" as you like:
 
1.2; par 2
1.2; second to last par
2; par 3
2; last par
 
SECTION 1.3 
Insert at the end of this section: >  
 
"Terminology related to the ways characters are perceived (phonemes, graphemes, collation units) is discussed in section 3, below.
"Terminology related to the ways characters are encoded (character repertoires, code points, character encoding forms, character encoding schemes, transcodings, character escapes), is discussed in section 4, below."
COMMENT:  This section really just deals with Notation, not Terminology; but you're right; it should deal with both!   I inserted two cross-references to where you do deal with terminology!

I'd also love a note explaining the term "text", as I've seen "text" (in linguistics and in research, not computing) refer to written text, spoken text (not written in any way, such as tape recordings), visual text (including paintings, photos), and archaeological text (artifacts). Halliday and his followers use "text" to refer to both written and spoken language. You mean it to refer to written or visually represented linguistic text that is part of a computer page.
 
* * *
SECTION 3; Title > "Perceptions of Characters:  Phonemes, Graphemes, and Collation Units"
COMMENT: My change makes the title a bit more specific; it's up to you.
 * * *
 
SECTION 4; Title  
COMMENT:  I'd sort of like a more specific title for this section as well; again, it's up to you.
 
* * *
 
SECTION 4.1; Title >  "Character encoding terminology"
 
COMMENT:  Again, I'd like a more specific title, to guide the reader; but again, I'll leave it up to you.
* * *
SECTION 4.2; Title> "Transcoding:  Translating between different encodings"
 
COMMENT: Again, I'd like a more specific title. 
 
* * *
 SECTION 6.1
 
COMMENT:  I note that the first section on the "Byte String" has the Example set off, and the word "EXAMPLE" boldfaced in all upper-case letters.  Is there any reason not to set off the remaining examples (for the "Code Unit String" and the "Character String")?
 
* * *
 
Again, SECTION 6.1; also regarding the "Byte String" "EXAMPLE" paragraph:
 
"This is a counter-example, illustrating one reason why considering strings as byte strings may be problematic." 
 
>  "The following example illustrates one reason why considering strings as byte strings may be problematic."
 
COMMENT:  I don't think you need to say "counter-example"; that just adds unnecessary words.
 
* * *
 
SECTION 6.2; paragraph 1, last sentence or two - paragraph 2 
 
?? REORDER ??
 
"The requirements for string indexing are discussed in Requirements for String Identity Matching [CharReq], section 4. The two main questions that arise are: "What is the unit of counting?" and 'Do we start counting at 0 or 1?'.
"The example in the previous section, 6.1 String concepts, shows a string viewed as a character string, code unit string and byte string, respectively, each of which involves different units for indexing."

 

>
 
"The requirements for string indexing are discussed in Requirements for String Identity Matching [CharReq], section 4. 
 
"The example in the previous section, 6.1 String concepts, shows a string viewed as a character string, code unit string and byte string, respectively, each of which involves different units for indexing.
 
"The two main questions that arise are: "What is the unit of counting?" and "Do we start counting at 0 or 1?"."
 
COMMENT:  I'd reorder this slightly; I think that you should summarize 6.1 before introducing the two main questions; that way you go from the questions to answering them!
 
* * * MORE PROOFREADING * * *
 
SECTION 4.1; numbered item 3, 1st sentence
 
"which encodes the abstract integers of a coded character set (CCS) into sequences of the code units of the base datatype"
 
> "which encodes the abstract integers of a coded character set (CCS) into sequences of code units of the base datatype"
 
COMMENT:  "the" is not needed here and is confusing, because you've not mentioned the code units of the base datatype before.
* * *
 
SECTION 4.41; paragraph 3; 1st NOTE
 
"The IETF Charset Policy [RFC 2277] specifies that on the Internet "Protocols MUST be able to use the UTF-8 charset".
 
> "The IETF Charset Policy [RFC 2277] specifies that, on the Internet, "Protocols MUST be able to use the UTF-8 charset".
 
COMMENT:  I inserted commas after "that" and "Internet"--the commas make this a bit clearer!
 
* * *
 
SECTION 4.5
 
". . . SHOULD NOT use codepoints in the private use area"
 
> " . . . SHOULD NOT use codepoints designated for private use"
 
COMMENT:  The original is confusing and perhaps ambiguously worded.
 
* * *
That is it for my comments!
 
Sincerely,
 
C. E. Whitehead
cewcathar@hotmail.com
> From: cewcathar@hotmail.com
>
> Dear Internationalization:
>
> I want to comment on the draft that Addison suggested we comment on;
> "Character Model for the World Wide Web" (http://www.w3.org/TR/charmod/)
> (hope it's not too late; I was busy studying for the CCNA so sorry I could not risk that exam--it cost $125--by doing anything else)
>
> 1.1, par 1, 2nd sentence
>
> "One basic prerequisite to achieve this goal is to be able to transmit and process the characters . . . "
>
> > "One basic prerequisite to achieving this goal is being able to transmit and process the characters . . . "
>
> COMMENT:
>
> the phrase is
> "prerequisite to DO-ING [something]"
> you need to use an -ing verb here.
>
> * * *
> 1.2, par 7
>
> "While these developments strengthen the requirement that Unicode be the basis of a character model . . . "
>
> > "While these developments advance the need to use Unicode as the basis of a character model . . . "
>
> ??
>
> or ??
>
> > "While these developments advance the need to have a character model . . . based on Unicode . . . "
>
> COMMENT:
> I like the latter sentence.
> I think that the phrase, "strengthen the requirement", is awkward, not quite the right word choice; "advance the need" is better and has two less syllables! The overall sentence sounds a little awkward as well, so I rewrote it.
>
> * * *
>
> 1.2, par 8
>
> "It should be noted that such aspects also exist in various encodings . . . "
>
> > "It should be noted that such aspects also exist in various other encodings . . ."
>
> COMMENT:
> I inserted "other" before "encodings" because you used "also" you should use other;
> alternately:
>
> "It should be noted that such aspects exist in various encodings"
> (without also)
>
> * * *
>
> 1.2, last par, 2nd sentence
>
> "The policies adopted by the IETF for on the use of character sets . . . "
>
> > "The policies adopted by the IETF for the use of character sets . . . "
>
> COMMENT:
>
> you've got an extra preposition there; you only need one!
>
> * * *
> 1.3
>
> "Terminology and Notation"
>
> ?? How does this short section really address terminology?? You address terminology all over the place but not quite here. {I'll have some more comments on this, with suggestions for a cross-reference, shortly!}
>
> * * *
>
> 1.3, 2nd par
>
> "Text has been used for examples to allow them to be cut and pasted by the reader."
>
> > "The examples are all text, so that they can be cut and pasted by the reader."
>
> COMMENT:
>
> Your sentence was awkward, not quite English--so I tried to rewrite it; hope my rewrite helps.
>
> {I'll comment shortly on a few cross-references I'd like to see and also on whether to use "specification" or "recommendation" to refer to this document inside the document--I note that you also use "specification" to talk about documents that will be created by the reader, making your use of the term a bit confusing.}
>
> --C. E. Whitehead
> cewcathar@hotmail.com
>
> > From: addison@amazon.com
> > Date: Wed, 18 Jun 2008 06:53:36 -0700
> >
> > All,
> >
> > One of the important things the W3C Internationalization Activity has been working on for many years is the "Character Model for the World Wide Web" (better known as "CharMod"). In 2005, the then Internationalization Core Working Group published Part I of this work [1] as a Recommendation. Two other parts, one on Normalization [2] and one on Resource Identifiers [3] remain at lower levels of maturity.
> >
> > The Normalization document, in particular, is important. In fact, it is the primary reason for CharMod work in the first place. "CharMod-Norm" deals with Unicode normalization in W3C technologies. Originally, when this work was started well over a decade ago, this working group supported mandating "early uniform normalization". In 2004 the working group decided that such a position was no longer tenable. The current draft for CharMod-Norm represents this change.
> >
> > The Internationalization Core WG would like to complete this work under our current charter, publishing the document as a Working Group Note. Because Notes have a lower level of scrutiny than Recommendation track documents do, we would like to invite the community (that's YOU) to comment on the working draft and participate in the finalization of this important work.
> >
> > Please submit comments on the draft and its changes here on the interest group list (www-international@). You may also send comments to www-i18n-comments@w3.org.
> >
> > [1] http://www.w3.org/TR/charmod/
> > [2] http://www.w3.org/TR/charmod-norm/
> > [3] http://www.w3.org/TR/charmod-resid/
> >
> > Best Regards,
> >
> > Addison
> >
> > Addison Phillips
> > Chair, W3C Internationalization Core WG
> >
> > Internationalization is not a feature.
> > It is an architecture.
Received on Wednesday, 9 July 2008 23:13:42 UTC