Re: [CSS3 Text] punctuation-trim

On Mon, 05 Feb 2007 18:04:02 +1300
fantasai <fantasai.lists@inkedblade.net> wrote:

> > The punctuation list in my proposal was not complete. The following is a
> > revised version. 
> > 
> >       (Japanese)
> >         Fullwidth opening punctuations:   「『(‘“〔[{〈《【〝〖〘〚⦅«
> >         Fullwidth closing punctuations:    」』)’”〕]}〉》】〟〗〙〛⦆»。.、,
> >         Fullwidth middle dot punctuations: ・
> 
> I noticed you left out the colon and semicolon this time. Was that intentional?
> (Also, I assume the character between the filled and open closing brackets is
> the double prime? It looks like kanji on my screen for some reason...)

I did write the colon and semicolon. Did they disappear?
In any case, here is a clearer version of the list, with some corrections:

Fullwidth opening punctuations:
  U+FF08  FULLWIDTH LEFT PARENTHESIS                    (
  U+2018  LEFT SINGLE QUOTATION MARK                    ‘
  U+201C  LEFT DOUBLE QUOTATION MARK                    “
  U+FF3B  FULLWIDTH LEFT SQUARE BRACKET                 [
  U+FF5B  FULLWIDTH LEFT CURLY BRACKET                  {
  U+3008  LEFT ANGLE BRACKET                            〈
  U+300A  LEFT DOUBLE ANGLE BRACKET                     《
  U+300C  LEFT CORNER BRACKET                           「
  U+300E  LEFT WHITE CORNER BRACKET                     『
  U+3010  LEFT BLACK LENTICULAR BRACKET                 【
  U+3014  LEFT TORTOISE SHELL BRACKET                   〔
  U+3016  LEFT WHITE LENTICULAR BRACKET                 〖
  U+3018  LEFT WHITE TORTOISE SHELL BRACKET             〘
  U+301A  LEFT WHITE SQUARE BRACKET                     〚
  U+FF5F  FULLWIDTH LEFT WHITE PARENTHESIS              ⦅
  U+301D  REVERSED DOUBLE PRIME QUOTATION MARK          〝

Fullwidth closing punctuations:
  U+FF09  FULLWIDTH RIGHT PARENTHESIS                   )
  U+2019  RIGHT SINGLE QUOTATION MARK                   ’
  U+201D  RIGHT DOUBLE QUOTATION MARK                   ”
  U+FF3D  FULLWIDTH RIGHT SQUARE BRACKET                ]
  U+FF5D  FULLWIDTH RIGHT CURLY BRACKET                 }
  U+3009  RIGHT ANGLE BRACKET                           〉
  U+300B  RIGHT DOUBLE ANGLE BRACKET                    》
  U+300D  RIGHT CORNER BRACKET                          」
  U+300F  RIGHT WHITE CORNER BRACKET                    』
  U+3011  RIGHT BLACK LENTICULAR BRACKET                】
  U+3015  RIGHT TORTOISE SHELL BRACKET                  〕
  U+3017  RIGHT WHITE LENTICULAR BRACKET                〗
  U+3019  RIGHT WHITE TORTOISE SHELL BRACKET            〙
  U+301B  RIGHT WHITE SQUARE BRACKET                    〛
  U+FF60  FULLWIDTH RIGHT WHITE PARENTHESIS             ⦆
  U+301E  DOUBLE PRIME QUOTATION MARK                   〞
  U+301F  LOW DOUBLE PRIME QUOTATION MARK               〟
  U+3001  IDEOGRAPHIC COMMA                             、
  U+FF0C  FULLWIDTH COMMA                               ,
  U+3002  IDEOGRAPHIC FULL STOP                         。
  U+FF0E  FULLWIDTH FULL STOP                           .

Fullwidth middle dot punctuations:
  U+30FB  KATAKANA MIDDLE DOT                           ・
  U+FF1A  FULLWIDTH COLON                               :
  U+FF1B  FULLWIDTH SEMICOLON                           ;

NOTE:
- This list is based on JIS X 4051:2004, except for U+301A, U+301B and
  U+301E, which are not in the JIS standard but are occasionally used in
  Japanese text and are included in certain Japanese fonts.
- JIS X 4051:2004 identifies characters by JIS X 0213 code, not Unicode.
  The list above shows the corresponding Unicode code points.
- In JIS X 4051:2004, the full stop characters (U+3002 '。', U+FF0E '.')
  are distinguished from the other closing punctuations. I merged them
  for simplicity.
- I've removed the double angle quotation marks (U+00AB '«', U+00BB '»')
  because they are not treated as fullwidth characters.
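
As a rough illustration only (this is not taken from JIS X 4051 or from the
CSS3 Text draft), the three classes above could be held in lookup tables
like this. The names OPENING, CLOSING, MIDDLE_DOT and punctuation_class are
hypothetical, and Python is used just as a convenient notation:

```python
# Code points copied from the list above; the class labels are assumptions.
OPENING = frozenset(
    "\uFF08\u2018\u201C\uFF3B\uFF5B\u3008\u300A\u300C"
    "\u300E\u3010\u3014\u3016\u3018\u301A\uFF5F\u301D"
)
CLOSING = frozenset(
    "\uFF09\u2019\u201D\uFF3D\uFF5D\u3009\u300B\u300D"
    "\u300F\u3011\u3015\u3017\u3019\u301B\uFF60\u301E\u301F"
    "\u3001\uFF0C\u3002\uFF0E"  # commas and full stops merged into closing
)
MIDDLE_DOT = frozenset("\u30FB\uFF1A\uFF1B")


def punctuation_class(ch):
    """Return 'opening', 'closing', 'middle-dot', or None for one character."""
    if ch in OPENING:
        return "opening"
    if ch in CLOSING:
        return "closing"
    if ch in MIDDLE_DOT:
        return "middle-dot"
    return None  # not a candidate for trimming under this list
```

With these tables, punctuation_class("\u300C") gives "opening" and
punctuation_class("\u00AB") gives None, matching the removal of the double
angle quotation marks noted above.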

> > When punctuations typically only used for one language appear in another
> > language text, punctuation trimming is not expected.
> 
> Given that in web pages the language is often unmarked, and just generally
> to make mixed-language documents format more consistently, I think punctuation
> specific to one of these languages should appear in the corresponding list
> for the other languages as well. Do you feel that that would cause any
> significant problems?

I don't think that would cause a significant problem, but I'm not sure...

-- 
Shinyu Murakami
Antenna House XSL Formatter team
http://www.antennahouse.com

Received on Monday, 5 February 2007 13:30:26 UTC