Re: Octet Strings and utf-8

"LeVan,Ralph" wrote:

> Somehow I must be deciding if the term is binary, because I am sending those
> terms to a search engine.  The search engine is not expecting binary data.

Suppose you have a search engine where binary data isn't applicable, you've
negotiated utf-8 (via character set negotiation), and you're using version 2, so
the client has no choice but to send the term as an octet string. In that case
you might argue that arbitrarily extending the negotiation to apply to
octet-string-tagged search terms is a reasonable and pragmatic thing to do.

Still, there is some winking going on, since the client could know only via
out-of-band agreement that your search engine doesn't expect binary.  Or it
could be that the search is on title, author, etc., where a binary term
wouldn't make sense.

Would someone care to suggest a reliable rule of thumb we can adopt, perhaps
based on access point?  For example: if we're searching on title, author,
subject, and so on, then an octet-string-tagged term is guaranteed to be text
and not binary.
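
As a rough illustration of what such a rule might look like, here is a minimal
sketch in Python.  It is only a sketch under stated assumptions, not anything
the standard specifies: the function name, its parameters, and the choice of
access points are all hypothetical, and the numbers are assumed to be the Bib-1
Use attribute values for Title (4), Subject heading (21), and Author (1003).

```python
from typing import Optional

# Assumption: Bib-1 Use attribute values for access points where a binary
# term would not make sense (4 = Title, 21 = Subject heading, 1003 = Author).
TEXT_USE_ATTRIBUTES = {4, 21, 1003}


def term_as_text(raw: bytes,
                 use_attribute: int,
                 negotiated_charset: Optional[str],
                 protocol_version: int) -> Optional[str]:
    """Return the term decoded as text, or None to treat it as binary.

    Hypothetical rule of thumb: an octet-string term is read as utf-8 only
    when version 2 is in use, utf-8 was negotiated, and the access point is
    one of the text-only ones listed above.
    """
    # The pragmatic extension discussed here only concerns version 2.
    if protocol_version != 2:
        return None
    # Without a negotiated utf-8 there is nothing to extend.
    if negotiated_charset is None or negotiated_charset.lower() != "utf-8":
        return None
    # Access points outside the list may legitimately carry binary terms.
    if use_attribute not in TEXT_USE_ATTRIBUTES:
        return None
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Not well-formed utf-8 after all, so fall back to binary.
        return None
```

A server could call term_as_text() on each octet-string term and pass the
result to the search engine only when it is not None.  The final decode check
is a safety net, since even a text-only access point cannot guarantee that the
client actually sent well-formed utf-8.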

--Ray

Received on Thursday, 14 March 2002 11:36:47 UTC