phonetic markup


I've written a short proposal on a mechanism for presenting phonetic
information.  This has been a concern of the Japanese blind computer
users' community.  I hope it will be taken into consideration as we
discuss phonetic markup on the list and/or during upcoming meeting(s).

I would appreciate any comments, although I won't be able to respond
promptly, as I'm leaving for Austin in about 12 hours.


          Masafumi NAKANE, Keio Univ., Dept. of Environmental Information
E-Mail : / max@FreeBSD.ORG

This memo describes the necessity of phonetic markup in HTML from the
standpoint of accessibility.


There should be some means for web page authors to convey the
pronunciation of words to so-called self-voicing web browsers and to
users of visual user agents.


With some languages, there are occasions where it is impossible to
determine the pronunciation of words or phrases from the text alone.
Many of these cases can be resolved with proper context analysis,
while in the remaining cases no one but the author can determine the
pronunciation.  The former may be solved by technical improvements in
the near future, but good context analysis is difficult at present.
The latter will probably never be solved.  These facts make it
necessary to have a mechanism for web page authors to convey phonetic
information to users.


In printed Japanese, the language is represented using a mixture of
ideographic characters called kanji and phonographic characters called
kana.  In Japanese braille, only kana is used.  Thus, the braille
translation process requires conversion of kanji into phonetic form.
This is also true for speech output.

Most kanji text can be translated into kana without much difficulty
if the user agent, or perhaps an access agent, has a good dictionary.
However, there are many cases where it is impossible for readers to
determine the pronunciation of a certain combination of kanji
characters.  There are even characters whose pronunciation cannot be
determined from how and/or where they are used.  This is a common
case with proper nouns.

In order to convey the correct phonetic representation of kanji, a
mechanism to convey that information is mandatory.


An attribute for this purpose should be added to the SPAN element.
(I expect an appropriate attribute name to be chosen as we discuss.
In this document, however, I use PHONETIC for convenience.)
This attribute takes a character string as its value.  The string
should describe the pronunciation of the word inside the SPAN element.


I <SPAN phonetic="red">read</SPAN> the book. --- [1]

<SPAN lang="ja" phonetic="higashi">HIGASHI</SPAN> --- [2a]
<SPAN lang="ja" phonetic="azuma">HIGASHI</SPAN> --- [2b]

Example [1] is straightforward: the attribute adjusts the way a
self-voicing UA reads the text.

Examples [2a] and [2b] need some explanation.  Assume that
``HIGASHI'' is one kanji character.  This character has several
different pronunciations, and ``azuma'' and ``higashi'' are two of
them.  If this character is used as a person's last name, it is
impossible for anyone but the person who owns the name to determine
how the character should be read.  In these examples, assume that
lowercase characters represent kana and uppercase characters
represent kanji.


Cases like example [1] are simple.  A self-voicing UA should just use
the value of the attribute to adjust the speech output.
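
As a rough illustration of the simple case, here is a minimal sketch
of how a self-voicing UA might build the text it hands to a speech
synthesizer.  It assumes the placeholder attribute name PHONETIC from
this memo (which is not part of any HTML specification); the class
and function names are hypothetical, and only SPAN is handled.

```python
from html.parser import HTMLParser


class SpeechTextExtractor(HTMLParser):
    """Collect text for speech output, preferring PHONETIC values.

    PHONETIC is the placeholder attribute name used in this memo,
    not an attribute defined by HTML.
    """

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        # One entry per open SPAN: True if that SPAN carried PHONETIC.
        self.span_stack = []

    def _suppressed(self):
        # Inside a SPAN with PHONETIC, the written text is not spoken.
        return any(self.span_stack)

    def handle_starttag(self, tag, attrs):
        # HTMLParser reports tag and attribute names in lowercase.
        if tag != "span":
            return
        phonetic = dict(attrs).get("phonetic")
        if phonetic and not self._suppressed():
            self.parts.append(phonetic)
        self.span_stack.append(bool(phonetic))

    def handle_endtag(self, tag):
        if tag == "span" and self.span_stack:
            self.span_stack.pop()

    def handle_data(self, data):
        if not self._suppressed():
            self.parts.append(data)


def speech_text(html):
    """Return the string a speech engine would be asked to read."""
    extractor = SpeechTextExtractor()
    extractor.feed(html)
    extractor.close()
    return "".join(extractor.parts)


print(speech_text('I <SPAN phonetic="red">read</SPAN> the book.'))
```

For example [1] this yields "I red the book.", so the synthesizer
pronounces the past tense.  Text outside any SPAN with PHONETIC is
passed through unchanged.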

Care must be taken when processing languages like Japanese.
A UA with voice and/or braille output can use this information.
However, it is questionable how it should be treated in visual
browsers.  In printed Japanese material, kana representing the
pronunciation of kanji character(s) is put beside the kanji in a
smaller font when the pronunciation needs to be made clear.  The
simplest implementation would be to unconditionally present the value
of the PHONETIC attribute in a font of appropriate size if the LANG is
``ja'' (or another language that has the same convention).
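
The "simplest implementation" could be sketched as follows, under
several assumptions of mine: the reading is shown in parentheses
after the base text as a plain-text stand-in for small-font
annotation, only the LANG attribute on the SPAN itself is consulted
(no inheritance from enclosing elements), and all names are
hypothetical.

```python
from html.parser import HTMLParser

# Languages assumed to put the reading beside the base text.
# Only "ja" comes from the memo; the set is otherwise an assumption.
ANNOTATED_LANGS = {"ja"}


class VisualTextRenderer(HTMLParser):
    """Render text, appending the PHONETIC value for annotated languages."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        # Reading to show when each open SPAN closes, or None.
        self.pending = []

    def handle_starttag(self, tag, attrs):
        if tag != "span":
            return
        attr = dict(attrs)
        lang = (attr.get("lang") or "").lower()
        reading = attr.get("phonetic")
        show = bool(reading) and lang in ANNOTATED_LANGS
        self.pending.append(reading if show else None)

    def handle_endtag(self, tag):
        if tag != "span" or not self.pending:
            return
        reading = self.pending.pop()
        if reading is not None:
            # Parentheses stand in for the smaller font used in print.
            self.parts.append("(" + reading + ")")

    def handle_data(self, data):
        # The written text is always shown in a visual rendering.
        self.parts.append(data)


def render_visual(html):
    """Return a plain-text rendering with readings shown for "ja"."""
    renderer = VisualTextRenderer()
    renderer.feed(html)
    renderer.close()
    return "".join(renderer.parts)


print(render_visual('<SPAN lang="ja" phonetic="azuma">HIGASHI</SPAN>'))
```

With example [2b] this gives "HIGASHI(azuma)", while example [1],
which has no LANG of ``ja'', is rendered as ordinary text.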

However, this can lead to misuse or abuse of the attribute.  In
Japanese print, there is something called rubi, whose original purpose
was to give the phonetic representation of kanji.  Rubi is usually
written beside the corresponding kanji in a smaller font.  In spite of
this original purpose, kanji is sometimes put in the place of rubi, in
a font of the same size as that used for rubi.  This is common
practice in Japanese literature.  With this in mind, it is easy to
imagine that more than just a few people would use the PHONETIC
attribute for presentation purposes if the UA showed the content of
the attribute the way rubi is shown.


How is it possible to limit the characters to be used for this
attribute to the phonographic characters of the language inside the

What method, character set, etc. should be used to represent the
pronunciation?  Kana is probably best for Japanese, but what about
other languages?
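
For the Japanese case, restricting the value to kana could at least
be checked mechanically.  The following is only a sketch: it assumes
the value is available as Unicode text, accepts hiragana and katakana
letters plus the prolonged sound mark, and uses a hypothetical
function name.  It says nothing about how other languages should be
handled.

```python
def is_kana(text):
    """Return True if text is non-empty and consists only of kana.

    Accepted code points (an assumption about what "kana" should
    cover for this purpose):
      U+3041..U+3096  hiragana letters
      U+30A1..U+30FA  katakana letters
      U+30FC          katakana-hiragana prolonged sound mark
    """
    return bool(text) and all(
        "\u3041" <= ch <= "\u3096"
        or "\u30a1" <= ch <= "\u30fa"
        or ch == "\u30fc"
        for ch in text
    )


# "\u3042\u305a\u307e" is "azuma" written in hiragana.
print(is_kana("\u3042\u305a\u307e"))
# "\u6771" is the kanji written as HIGASHI in the examples above.
print(is_kana("\u6771"))
```

Note that the romanized values in examples [2a] and [2b] are
stand-ins for kana, so a check like this would reject them as
written; real markup would carry the kana itself.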

Is adding this attribute to the SPAN element enough?  Do any other
inline elements need it as well?

Is this the best way anyway?


The following Internet draft discusses a similar issue:

  Martin J. Duerst, University of Zurich
  Ruby in the Hypertext Markup Language
  <draft-duerst-ruby-01.txt>

Received on Friday, 7 November 1997 09:10:07 UTC