RE: Octet Strings and utf-8

From: LeVan,Ralph <levan@oclc.org>
Date: Thu, 14 Mar 2002 13:04:37 -0500
Message-ID: <E5431CF93E29F9478878F623E5B9CE9802DD9222@OA3-SERVER.oa.oclc.org>
To: "'Ray Denenberg'" <rden@loc.gov>, www-zig@w3.org
Let's ask the easier question.  Is anyone sending binary data as a general
Term?  If so, would you share the particulars with us?

Thanks!

Ralph

> -----Original Message-----
> From: Ray Denenberg [mailto:rden@loc.gov]
> Sent: Thursday, March 14, 2002 11:37 AM
> To: www-zig@w3.org
> Subject: Re: Octet Strings and utf-8
> 
> 
> "LeVan,Ralph" wrote:
> 
> > Somehow I must be deciding if the term is binary, because I am sending
> > those terms to a search engine.  The search engine is not expecting
> > binary data.
> 
> If you have a search engine where binary data isn't applicable, and you've
> negotiated utf-8 (via character set negotiation), and you're using version 2,
> so the client has no choice but to send a term via octet string, then you
> might argue that arbitrarily extending the negotiation to apply to
> octet-string-tagged search terms is a reasonable and pragmatic thing to do.
> 
> Still there is some winking going on, since the client could only know via
> out-of-band agreement that your search engine doesn't expect binary.  It
> could be that the search was on title, author, etc. so a binary term
> wouldn't make sense.
> 
> Would someone care to suggest some reliable rule of thumb we can adopt --
> perhaps based on access point, for example, that if we're searching on
> title, author, subject .... -- that an octet-string-tagged term is
> guaranteed to be text and not binary?
> 
> --Ray
> 
> 
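For illustration only, the access-point rule of thumb Ray asks about might look roughly like the sketch below. Nothing here is agreed practice or taken from any toolkit: the function name `interpret_term`, the `negotiated_utf8` flag, and the choice of Bib-1 Use attribute values (4 = title, 21 = subject heading, 1003 = author) as the "text-only" access points are all assumptions made for the example.

```python
# Hypothetical sketch of an access-point heuristic for octet-string terms.
# All names and the attribute list are assumptions, not part of any agreement.

# Bib-1 Use attribute values assumed here to imply a textual term.
TEXT_ACCESS_POINTS = {4: "title", 21: "subject", 1003: "author"}


def interpret_term(octets, use_attribute, negotiated_utf8):
    """Return ("text", str) or ("binary", bytes) for an octet-string-tagged term."""
    if negotiated_utf8 and use_attribute in TEXT_ACCESS_POINTS:
        try:
            # Extend the negotiated character set to the search term.
            return "text", octets.decode("utf-8")
        except UnicodeDecodeError:
            # Not valid UTF-8, so treat the term as opaque bytes after all.
            return "binary", octets
    # No negotiation, or an access point where binary might be meaningful.
    return "binary", octets
```

A server taking this approach would still be relying on the out-of-band assumption Ray describes: the client has no way to learn from the protocol itself which access points are on the "text" list.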
Received on Thursday, 14 March 2002 13:05:51 GMT