Re: [PLS1.0] i18n comment: Name of grapheme element

Issue R103-24

Proposed Classification: Change to Existing Feature 

Resolution: Reject 

The observation that the element named 'grapheme' [1] almost always
involves a *sequence* of graphemes is quite true. However, the element
is not required to contain a *sequence* of graphemes; a single grapheme
(the smallest orthographic unit) is the minimum permissible content.
This is why the element is named 'grapheme' rather than
'graphemes'. The grapheme or sequence of graphemes given in the
'grapheme' element corresponds to the phoneme or sequence of phonemes
given in the 'phoneme' element. This is in accordance with the notion of
"grapheme-to-phoneme conversion" (or, in layman's terms, letter-to-sound
conversion). The name of the element 'grapheme' goes hand-in-hand with
the name of the element 'phoneme', which has been borrowed from SSML 1.0
[1] because it has a similar usage. 
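
To illustrate the pairing described above, here is a minimal sketch in
Python. It assumes the PLS 1.0 namespace and the 'lexicon', 'lexeme',
'grapheme' and 'phoneme' element names; the sample entries, their IPA
transcriptions and the helper name grapheme_phoneme_pairs are made up
for illustration and are not taken from the specification. One entry
holds a single grapheme, the other a sequence of graphemes.

```python
import xml.etree.ElementTree as ET

# Assumption: namespace of the PLS 1.0 lexicon format.
PLS_NS = "http://www.w3.org/2005/01/pronunciation-lexicon"

# Illustrative lexicon only; the transcriptions are hypothetical examples.
SAMPLE = """<lexicon version="1.0"
    xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
    alphabet="ipa" xml:lang="en">
  <lexeme>
    <grapheme>a</grapheme>
    <phoneme>eɪ</phoneme>
  </lexeme>
  <lexeme>
    <grapheme>judgment</grapheme>
    <phoneme>ˈdʒʌdʒmənt</phoneme>
  </lexeme>
</lexicon>"""


def grapheme_phoneme_pairs(xml_text):
    """Return (grapheme, phoneme) text pairs, one per lexeme that has both."""
    ns = {"pls": PLS_NS}
    root = ET.fromstring(xml_text)
    pairs = []
    for lexeme in root.findall("pls:lexeme", ns):
        grapheme = lexeme.findtext("pls:grapheme", namespaces=ns)
        phoneme = lexeme.findtext("pls:phoneme", namespaces=ns)
        if grapheme is not None and phoneme is not None:
            pairs.append((grapheme, phoneme))
    return pairs


if __name__ == "__main__":
    for grapheme, phoneme in grapheme_phoneme_pairs(SAMPLE):
        print(grapheme, "->", phoneme)
```

In both entries the content of 'grapheme' lines up with the content of
'phoneme', whether it is one orthographic unit or several.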

Future revisions of PLS may wish to define the pronunciation of
orthographic units larger than the grapheme, such as 'morpheme' or
'affix' (as is common in system-internal lexicons). Grapheme, morpheme,
affix, locution... are all terms that refer to orthographic units. A
generic term such as 'text' or 'phrase' for this element seems
inappropriate at this stage, given that it would probably have to be
changed to 'grapheme' in the future. 

It is thus our opinion that the current name 'grapheme' is the best name
for this element. 

Please indicate whether you are satisfied with the VBWG's resolution,
whether you think there has been a misunderstanding, or whether you wish
to register an objection. 


Paolo Baggia, editor PLS spec.

From: <> 
Date: Tue, 21 Mar 2006 17:48:47 +0000
Message-Id: <> 

Comment from the i18n review of:

Comment 24
Editorial/substantive: S
Owner: RI

Location in reviewed document:

In the glossary of terms you define 'grapheme' as "One of the set of the
smallest units of a written language, such as letters, ideograms, or
symbols, that distinguish one word from another; a representation of a
single orthographic element." but then you use it as an element name to
label content that almost always involves a *sequence* of graphemes.

Please find a better name for the element. How about 'text' or 'phrase'?

