RE: submission of "draft-wilde-sms-uri-registration-00", "draft-wilde-sms-uri-12", and "draft-wilde-sms-service-12"

From: McDonald, Ira <imcdonald@sharplabs.com>
Date: Mon, 20 Mar 2006 09:37:37 -0800
Message-ID: <CFEE79A465B35C4385389BA5866BEDF00C7FC3@mailsrvnt02.enet.sharplabs.com>
To: "'Erik Wilde'" <net.dret@dret.net>, uri@w3.org
Cc: Tim Kindberg <timothy@hpl.hp.com>, uri-review@ietf.org, uri@w3.org, Bennett.Marks@nokia.com, Ted Hardie <hardie@qualcomm.com>, Antti Vähä-Sipilä <antti.vaha-sipila@nokia.com>, Basavaraj.Patil@nokia.com, Markus.Isomaki@nokia.com, alastair_angwin@UK.IBM.COM, Ileana.Leuca@cingular.com, martti.ala-rantala@nokia.com

Erik Wilde wrote on Monday 20 March 2006:
> > http://dret.net/netdret/docs/draft-wilde-sms-uri-12.txt says:
> >> Implementations MAY choose to silently discard (or convert)
> >> characters in the sms-body that are not supported by the SMS
> >> character set they are using to send the SMS message.
> > Depending on language this can be a very bad thing, as removing a
> > diacritical here or even dropping a letter there can change the
> > meaning of a message.  The user would not want this done silently,
> > so the implementations SHOULD if possible ask for confirmation.
> you are right that silently discarding or converting characters is
> not a good idea. i have changed the sentence to:
> "Implementations MAY choose to discard (or convert) characters in the
> sms-body that are not supported by the SMS character set they are
> using to send the SMS message. If they do discard or convert
> characters, they MUST notify the user."

The above still needs a little more improvement, I think. 

There are three possible things to do with an unsupported character:

(1) Discard - warning notice SHOULD be added to the message body;

(2) Convert - replace with some (near) match supported character;

(3) Substitute - replace with a fixed 'substitution character'
    such as '?' or the Unicode REPLACEMENT CHARACTER U+FFFD.
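
A minimal sketch of the three options in Python. It assumes a
hypothetical SUPPORTED set that stands in for the sender's real SMS
character set (it is not the GSM 03.38 table), and the function names
are illustrative only. Each function also reports whether it changed
the body, so that the caller can notify the user:

```python
import string
import unicodedata

# Hypothetical stand-in for the SMS character set in use; a real
# implementation would consult the full alphabet tables instead.
SUPPORTED = set(string.ascii_letters + string.digits + " .,!?'-" + "äöñüàèéùìò")


def discard(body):
    """(1) Discard: drop every unsupported character."""
    text = "".join(ch for ch in body if ch in SUPPORTED)
    return text, text != body


def convert(body, fallback="?"):
    """(2) Convert: use a near match (base letter), else the fallback."""
    out, changed = [], False
    for ch in body:
        if ch in SUPPORTED:
            out.append(ch)
            continue
        changed = True
        # Decompose and drop combining marks to find a base-letter match.
        base = "".join(c for c in unicodedata.normalize("NFD", ch)
                       if not unicodedata.combining(c))
        out.append(base if base and all(c in SUPPORTED for c in base) else fallback)
    return "".join(out), changed


def substitute(body, sub="?"):
    """(3) Substitute: replace every unsupported character with `sub`."""
    text = "".join(ch if ch in SUPPORTED else sub for ch in body)
    return text, text != body


if __name__ == "__main__":
    for handler in (discard, convert, substitute):
        text, changed = handler("gar\u00e7on \u20ac5")
        print(handler.__name__, repr(text), "notify user" if changed else "unchanged")
```

Stripping diacritics is only one possible notion of "near match"; as
noted earlier in the thread, it can change the meaning of a word, which
is why the changed flag is returned in every case.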

Also (for multibyte charsets), normalization of the character
stream is needed to detect and process 'whole' characters (e.g., 
a decomposed base character plus its diacritical marks) that may
be represented by more than one octet in the stream.
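
As a rough illustration of the normalization point, the following
Python sketch composes the body to NFC and then attaches any combining
marks that remain to their base character. The function name
whole_characters is hypothetical, and this is only an approximation of
proper grapheme segmentation, not a complete implementation:

```python
import unicodedata


def whole_characters(body):
    """Return the body as a list of 'whole' characters.

    NFC composes a base character plus marks into one code point where
    a precomposed form exists; leftover combining marks are grouped
    with the preceding base character.
    """
    clusters = []
    for ch in unicodedata.normalize("NFC", body):
        if clusters and unicodedata.combining(ch):
            clusters[-1] += ch  # keep the mark with its base character
        else:
            clusters.append(ch)
    return clusters


if __name__ == "__main__":
    decomposed = "e\u0301"  # 'e' followed by COMBINING ACUTE ACCENT
    print(len(decomposed), len(decomposed.encode("utf-8")))
    print(whole_characters(decomposed + "a"))
```

Discarding or substituting per code point on the decomposed stream
would treat the base letter and its accent as two separate characters;
working on the grouped units avoids that.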

- Ira

Ira McDonald (Musician / Software Architect)
Blue Roof Music / High North Inc
PO Box 221  Grand Marais, MI  49839
phone: +1-906-494-2434
email: imcdonald@sharplabs.com

Received on Monday, 20 March 2006 17:46:11 UTC