- From: Robert Waldstein <wald@library.ho.lucent.com>
- Date: Thu, 7 Jun 2001 08:44:42 -0400
- To: www-zig@w3.org
> In our next release of SiteSearch, we will be supporting the explicit
> negotiation of characterset. Specifically, we will allow the client to
> negotiate the use of UTF-8 in searches. With that comes the requirement
> that we convert the UTF-8 query into the correct characterset for the
> database being searched. We need a diagnostic when the query includes
> characters that do not translate into the target characterset. The addinfo
> field will contain the character (in the negotiated characterset) that could
> not be translated.
>
> We recommend a practice to other implementors of not ignoring illegal
> characters. Profiles are currently asking us to not ignore or misinterpret
> attributes and I suspect that they will eventually ask us to treat the
> user's query terms with as much respect.
Ralph, I agree with the general view of your message (even when my
implementations don't do it :-)), but I have a question on
> characters that do not translate into the target characterset.
Query by example:
- does a with umlaut translate to a?
- does a diphthong (ae) translate to ae (a followed by e)?
- does oneHalf (1 over 2) translate to 1/2?
- does a superscript 2 translate to a 2?
- does capital A translate to "a"? (guess we handle this with an
attribute, do we do the others?)
Guess I am asking who controls the translation - decides what does not
translate?
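
To make the question concrete, here is a rough Python sketch of two possible policies. It is not SiteSearch's behavior or anything defined by the standard; the names strict_translate, fold and EXPLICIT_FOLDS are made up for illustration, and the substitution table is only one choice a site might make. The strict policy reports the first character that has no encoding in the target characterset (what would go in addinfo); the lenient one folds characters using Unicode NFKD decomposition plus a hand-written table.

```python
import unicodedata

# Hypothetical site-chosen substitutions for characters that NFKD leaves
# untouched or maps to another non-ASCII character.
EXPLICIT_FOLDS = {
    "\u00e6": "ae",  # the ae ligature has no Unicode decomposition
    "\u00c6": "AE",
    "\u2044": "/",   # NFKD turns one-half into 1, FRACTION SLASH, 2
}


def strict_translate(query, target):
    """Encode query for the target characterset with no substitutions.

    Returns (encoded_bytes, None) on success, or (None, offending_char)
    so the caller can report the character in a diagnostic.
    """
    try:
        return query.encode(target), None
    except UnicodeEncodeError as err:
        # err.start indexes the first character that could not be encoded.
        return None, query[err.start]


def fold(query):
    """Lenient policy: compatibility-decompose, drop combining marks,
    then apply the explicit table. Case is deliberately left alone."""
    out = []
    for ch in unicodedata.normalize("NFKD", query):
        if unicodedata.combining(ch):
            continue  # e.g. the diaeresis split off from a with umlaut
        out.append(EXPLICIT_FOLDS.get(ch, ch))
    return "".join(out)


if __name__ == "__main__":
    term = "M\u00e4rz \u00bd x\u00b2"
    print(strict_translate(term, "ascii"))        # strict: reports a character
    print(strict_translate(fold(term), "ascii"))  # lenient: folds first
```

Under the strict policy every example in my list produces a diagnostic for an ASCII database; under the lenient one none of them does, and the ae and one-half cases only work because somebody wrote the table. So the outcome depends entirely on who owns that table.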
thanks,
Bob Waldstein wald@lucent.com
Received on Thursday, 7 June 2001 08:44:23 UTC