W3C home > Mailing lists > Public > public-i18n-ws@w3.org > June 2004

[Fwd: 4.16 Transports]

From: A. Vine <andrea.vine@sun.com>
Date: Thu, 10 Jun 2004 16:00:44 -0700
To: I18n WSTF <public-i18n-ws@w3.org>
Message-id: <40C8E81C.2000609@sun.com>

-------- Original Message --------
Subject: 4.16 Transports
Resent-Date: Wed, 14 Apr 2004 14:33:05 -0400 (EDT)
Resent-From: public-i18n-ws@w3.org
Date: Wed, 14 Apr 2004 11:12:10 -0700
From: A. Vine <andrea.vine@Sun.COM>
Reply-To: andrea.vine@Sun.COM
Organization: Sun Microsystems
To: I18n WSTF <public-i18n-ws@w3.org>

Lots of modification should happen to this section, but I am not the right
person to make them all.
Someone at the F2F (Tex maybe?) had a lot more relevant information on FTP than
I do.  I'm happy to write it up if you give me the info or a pointer.

I am not very happy with my intro - comments, changes, rewrites welcome.

I think 4.16.4 "IRIs, URIs, and fun stuff" would be better written by Martin.  I
could take a shot at the "fun stuff" part ;-}

4.16 Transports

Web services may use a variety of transport technologies and protocols.  Many of
these have parameters defined for data identification.  These parameters are
necessary for proper processing of international data.  The specifics of several
transport protocols are discussed in this section.

4.16.1 HTTP Accept-Language
{Andrea's note: what about Content-Language?  And a blurb on Content-Type is
added below.  Maybe we should call this section "HTTP".}

The HyperText Transport Protocol (HTTP) is often used for Web service message
transport.  HTTP contains some header fields which are useful for identifying
sender preferences and capabilities.  One of those fields is Accept-Language.

Accept-Language takes one or more language identifiers in RFC3066 (or its
replacement) format as its parameters.  Each language identifier can have a
quality value which gives a relative priority.  Here is an example:

{Andrea's note:  the below should be set off in the example format, or indented
and in a different font, or something}
Accept-Language: zh-cn, fr-ch;q=0.8, fr;q=0.7

The above could be read as "Simplified Chinese is preferred, but Swiss French is
acceptable, as are other types of French."  There is more information about the
handling of Accept-Language in the HTTP 1.1 specification.

A Web service requester using HTTP can include an Accept-Language field to
indicate the languages preferred.  The provider can then take that information
and use it to return human-readable data in the appropriate language.

{Andrea's note: I put the below paragraph in because I think it bears mentioning}
The charset of the data can also be specified as a parameter of the Content-Type
header.  However, it is better to specify the SOAP document charset inside the
document itself, rather than to rely on the transport mechanism to be the sole
mechanism for identifying the charset.  If the charset specified in HTTP doesn't
agree with the charset inside the document, then the receiver must make a
decision on how to resolve the problem.

4.16.2 FTP

File Transfer Protocol (FTP) is a simple transport mechanism that can be used
for Web service documents.  The main international consideration in using FTP is
to specify the representation type as I (Image), allowing 8-bit values to pass
unchanged through the transfer.

File names, path names, and character encoding issues may intrude here.

4.16.3 SMTP

Simple Mail Transfer Protocol (SMTP) has no particular provisions for
international data.  SMTP itself is limited to 7-bit data, but can transport
8-bit data.  Its main restriction is an 8-bit gateway; that is, encodings such
as UTF-16 and UTF-32 may not be successfully transmitted and should be avoided. MIME Tags

Multipurpose Internet Mail Extensions (MIME) tags are necessary for a multipart
SOAP request, for example, a SOAP message with an attachment.  MIME contains a
number of headers which may be used for international data.

{somewhere we need to reference RFCs 2045-9, maybe as a MIME reference in the
reference section}

MIME can be useful for identifying the charset of attachments which do not
identify their own charset inside the attachment.  Examples of such attachments
are plain text documents which cannot contain a charset tag and legacy markup
documents which do not contain a charset tag by omission.  If the attachment
contains an internal charset tag, the MIME charset parameter should be omitted
to avoid an inadvertent mismatch.

MIME can also contain a Content-Language tag.  While it is better to indicate
the document language inside the document itself, sometimes it isn't possible.
For example, if there is an image attachment which contains embedded text, the
Content-Language header can provide the language id.

See the example in 4.5.2 Character Coding of Attachments.
{Andrea's note:  add the following line to the attachments example in 4.5.2 in
the attachment MIME headers after Content-Type -

Content-Language: fr

This avoids having to make a separate example which would be very similar.}

4.16.4 IRIs, URIs, and fun stuff
{Martin's text here :-) }

I have always wished that my computer would be as easy to use as my telephone. 
My wish has come true. I no longer know how to use my telephone.
-Bjarne Stroustrup, designer of C++ programming language (1950- )
Received on Thursday, 10 June 2004 19:14:33 UTC

This archive was generated by hypermail 2.3.1 : Tuesday, 6 January 2015 20:02:39 UTC