back to character encoding

Back to character encoding........

(I had some off-list discussions, some of which I
didn't realize were off-list until now; I see what
Joe was complaining about. In addition, different
messages have come from different mail accounts
of mine. I'm not sure which of what follows I've
said to the list and which privately. Hopefully
this message will re-sync.)

I propose that we resolve the immediate character
encoding issue as follows:

Use the existing character set negotiation
definition,
http://lcweb.loc.gov/z3950/agency/defns/charneg-3.html,
to negotiate utf-8 for name strings, which include
the search term.

See the (approved) ZIG Commentary "Negotiating
Unicode and UTF-8"
http://lcweb.loc.gov/z3950/agency/wisdom/unicode.html

"term" is a name string when characterString is
the CHOICE. See:
http://lcweb.loc.gov/z3950/agency/defns/namestr.html
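
To illustrate why the encoding of the term has to be
agreed on, here is a minimal sketch in Python. It is
not taken from any Z39.50 toolkit: the helper name
encode_term and the Latin-1 fallback are assumptions
made purely for illustration. It shows only that the
same term yields different bytes depending on
whether utf-8 was negotiated.

```python
def encode_term(term: str, utf8_negotiated: bool) -> bytes:
    """Return the bytes an origin might send for a characterString term.

    Hypothetical helper, for illustration only.
    """
    # Assumption: Latin-1 stands in for "whatever was used before
    # negotiation"; the real fallback depends on the implementations.
    encoding = "utf-8" if utf8_negotiated else "latin-1"
    return term.encode(encoding)


term = "Müller"
as_utf8 = encode_term(term, utf8_negotiated=True)
as_latin1 = encode_term(term, utf8_negotiated=False)

# The two encodings differ on the wire, so a target that guesses the
# wrong one will not recover the original term.
misread = as_utf8.decode("latin-1")
```

If the target decodes utf-8 bytes as Latin-1 (the
misread value above), the term no longer matches what
the user typed, which is the search-term problem that
negotiation is meant to remove.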

I no longer favor an option bit. I suggested
it because I thought it would solve much more
of the problem. We're not going to resolve
what an option bit would apply to with respect
to records; it seems we will agree only that it
applies to the term. There is no point in
defining an option bit solely for the search term.
It's a heavy-handed approach, and unnecessary,
since character set negotiation will solve the
encoding problem for search terms, which is the
immediate problem. Nor do I propose defining an
attribute (which was another suggestion),
since all we're interested in is utf-8.

If people want to solve the encoding problem for
records, we can continue that discussion
independently. But I'm not going to revisit it
unless someone claims it as a requirement.

--Ray

Received on Monday, 11 March 2002 12:34:25 UTC