4.16 Transports

This section needs a lot of modification, but I am not the right 
person to make all of the changes.
Someone at the F2F (Tex maybe?) had a lot more relevant information on FTP than 
I do.  I'm happy to write it up if you give me the info or a pointer.

I am not very happy with my intro - comments, changes, rewrites welcome.

I think 4.16.4 "IRIs, URIs, and fun stuff" would be better written by Martin.  I 
could take a shot at the "fun stuff" part ;-}

--------------------------------------------------------
4.16 Transports

Web services may use a variety of transport technologies and protocols.  Many of 
these define parameters that identify the data being carried, such as its 
character encoding or language.  These parameters are necessary for proper 
processing of international data.  The specifics of several transport protocols 
are discussed in this section.

4.16.1 HTTP Accept-Language
{Andrea's note: what about Content-Language?  And a blurb on Content-Type is 
added below.  Maybe we should call this section "HTTP".}

The Hypertext Transfer Protocol (HTTP) is often used for Web service message
transport.  HTTP contains some header fields which are useful for identifying
sender preferences and capabilities.  One of those fields is Accept-Language.

Accept-Language takes one or more language identifiers in RFC 3066 (or its
replacement) format as its parameters.  Each language identifier can have a
quality value (q) between 0 and 1 which gives a relative priority; an
identifier with no quality value defaults to 1.  Here is an example:

{Andrea's note:  the below should be set off in the example format, or indented 
and in a different font, or something}
    Accept-Language: zh-cn, fr-ch;q=0.8, fr;q=0.7

The above could be read as "Chinese as used in China is preferred, but Swiss
French is acceptable, as is any other French."  There is more information about
the handling of Accept-Language in the HTTP/1.1 specification (RFC 2616).

A Web service requester using HTTP can include an Accept-Language field to
indicate the languages preferred.  The provider can then take that information
and use it to return human-readable data in the appropriate language.
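
As a rough illustration of how a provider might act on this field, here is a 
minimal Python sketch.  The function names, the list of available languages and 
the "en" fallback are all made up for the example, and the matching is a 
simplified prefix match rather than everything HTTP/1.1 allows.

```python
def parse_accept_language(header):
    """Return (language_tag, quality) pairs, highest quality first."""
    prefs = []
    for item in header.split(","):
        item = item.strip()
        if not item:
            continue
        tag, _, params = item.partition(";")
        quality = 1.0  # default when no q parameter is given
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0  # treat a malformed q value as "not acceptable"
        prefs.append((tag.strip().lower(), quality))
    # sorted() is stable, so entries with equal quality keep header order
    return sorted(prefs, key=lambda pair: pair[1], reverse=True)


def choose_language(header, available, default="en"):
    """Pick the first available language that matches a requested range.

    `available` and `default` are hypothetical provider settings.
    """
    for tag, quality in parse_accept_language(header):
        if quality <= 0:
            continue
        for candidate in available:
            cand = candidate.lower()
            # A range matches the same tag, or a longer tag it is a prefix of
            if tag == "*" or cand == tag or cand.startswith(tag + "-"):
                return candidate
    return default
```

With the header from the example above and a provider that only has "de", 
"fr-CA" and "en" text available, choose_language() would settle on "fr-CA", 
because the plain "fr" range matches any French.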

{Andrea's note: I put the below paragraph in because I think it bears mentioning}
The charset of the data can also be specified as a parameter of the Content-Type 
header.  However, it is better to specify the SOAP document charset inside the 
document itself, rather than to rely on the transport as the sole mechanism for 
identifying the charset.  If the charset specified in HTTP doesn't agree with 
the charset inside the document, then the receiver must decide how to resolve 
the conflict.
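
To make the mismatch case concrete, here is a small Python sketch of how a 
receiver might detect it.  The helper names are invented for this example, and 
it assumes the document is in an ASCII-compatible encoding so that the XML 
declaration can be read as bytes; it only detects the disagreement and does not 
say which side should win.

```python
import codecs
import re
from email.message import Message

# Encoding pseudo-attribute of an XML declaration, read as raw bytes
_XML_DECL = re.compile(
    rb'^<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)


def transport_charset(content_type):
    """Charset parameter of a Content-Type header value, or None."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()


def document_charset(body):
    """Encoding named in the document's own XML declaration, or None."""
    if body.startswith(b"\xef\xbb\xbf"):  # skip a UTF-8 byte order mark
        body = body[3:]
    match = _XML_DECL.match(body)
    return match.group(1).decode("ascii").lower() if match else None


def _canonical(name):
    """Map charset aliases (latin1, ISO-8859-1, ...) to one codec name."""
    try:
        return codecs.lookup(name).name
    except LookupError:
        return name.lower()


def charsets_agree(content_type, body):
    """False only when both sides name a charset and the names differ."""
    outer = transport_charset(content_type)
    inner = document_charset(body)
    if outer is None or inner is None:
        return True  # nothing to compare against
    return _canonical(outer) == _canonical(inner)
```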

4.16.2 FTP

File Transfer Protocol (FTP) is a simple transport mechanism that can be used
for Web service documents.  The main international consideration in using FTP is
to specify the representation type as I (Image), allowing 8-bit values to pass
unchanged through the transfer.
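
A minimal sketch of the Image-type transfer using Python's standard ftplib 
follows.  The function name, host name and file name are hypothetical; the 
point is only that storbinary() switches the connection to TYPE I before it 
sends the data, so the encoded bytes arrive unchanged.

```python
import io


def upload_document(ftp, remote_name, text, encoding="utf-8"):
    """Send `text` over an open ftplib.FTP connection as an Image-type transfer.

    `ftp` is expected to be a logged-in ftplib.FTP object; `remote_name`
    is a hypothetical target file name.
    """
    payload = text.encode(encoding)
    # storbinary() issues "TYPE I" before the STOR command, so no
    # line-ending or 7-bit conversion is applied to the payload.
    ftp.storbinary(f"STOR {remote_name}", io.BytesIO(payload))
    return payload


# Usage, assuming a reachable server (host and credentials are made up):
#
#   from ftplib import FTP
#   with FTP("ftp.example.org") as ftp:
#       ftp.login("user", "password")
#       upload_document(ftp, "request.xml", soap_text)
```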

File names and path names may themselves contain non-ASCII characters, so 
character encoding issues may intrude here as well.

4.16.3 SMTP

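Simple Mail Transfer Protocol (SMTP) has no particular provisions for 
international data.  SMTP itself is defined for 7-bit data; 8-bit data can be 
carried if it is given a MIME content transfer encoding such as base64 or if 
the servers involved support the 8BITMIME extension.  Its main restriction is 
that mail gateways handle at most 8-bit, byte-oriented text; that is, encodings 
such as UTF-16 and UTF-32 may not be successfully transmitted and should be 
avoided.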
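
As an illustration, here is a short Python sketch that packages a SOAP document 
for mail transport as UTF-8 with a base64 transfer encoding, plus a check that 
shows why UTF-16 is a poor fit.  The function names, the subject line and the 
addresses are invented for the example; nothing here talks to a mail server.

```python
from email.message import EmailMessage


def is_seven_bit_safe(data):
    """True if every byte is a non-NUL 7-bit value."""
    return all(0 < byte < 128 for byte in data)


def build_soap_mail(soap_xml, sender, recipient):
    """Wrap a SOAP document so the message on the wire is pure 7-bit."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = "SOAP request"  # hypothetical subject line
    # UTF-8 payload, base64 transfer encoding: safe for 7-bit gateways
    msg.set_content(soap_xml, subtype="xml", charset="utf-8", cte="base64")
    return msg
```

The serialized message from build_soap_mail() passes is_seven_bit_safe(), 
whereas the raw bytes of "<a/>".encode("utf-16") do not, since they contain 
both NUL bytes and a byte order mark above 0x7F.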

4.16.3.1 MIME Tags

Multipurpose Internet Mail Extensions (MIME) tags are necessary for a multipart 
SOAP request, for example, a SOAP message with an attachment.  MIME contains a 
number of headers which may be used for international data.

{somewhere we need to reference RFCs 2045-9, maybe as a MIME reference in the 
reference section}

MIME can be useful for identifying the charset of attachments which do not 
identify their own charset inside the attachment.  Examples of such attachments 
are plain text documents which cannot contain a charset tag and legacy markup 
documents which do not contain a charset tag by omission.  If the attachment 
contains an internal charset tag, the MIME charset parameter should be omitted 
to avoid an inadvertent mismatch.

A MIME part can also carry a Content-Language header.  While it is better to 
indicate the document language inside the document itself, sometimes it isn't 
possible.  For example, if there is an image attachment which contains embedded 
text, the Content-Language header can provide the language identifier.
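
Here is a minimal Python sketch of both MIME points: a plain text attachment 
whose charset is only known from the MIME charset parameter, and an image 
attachment labelled with Content-Language.  The function name, file names and 
the choice of ISO-8859-1 and "fr" are assumptions made up for the example.

```python
from email.message import EmailMessage


def build_multipart(soap_xml, note_text, image_bytes):
    """Build a multipart message: SOAP body plus two labelled attachments."""
    msg = EmailMessage()
    msg.set_content(soap_xml, subtype="xml", charset="utf-8")

    # Plain text cannot name its own charset, so MIME has to carry it.
    msg.add_attachment(note_text, subtype="plain",
                       charset="iso-8859-1", filename="note.txt")

    # An image cannot state the language of the text drawn inside it.
    msg.add_attachment(image_bytes, maintype="image", subtype="png",
                       filename="sign.png")
    image_part = msg.get_payload()[-1]  # the part that was just attached
    image_part["Content-Language"] = "fr"
    return msg
```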

See the example in 4.5.2 Character Coding of Attachments.
{Andrea's note:  add the following line to the attachments example in 4.5.2 in 
the attachment MIME headers after Content-Type -

Content-Language: fr

This avoids having to make a separate example which would be very similar.}

4.16.4 IRIs, URIs, and fun stuff
{Martin's text here :-) }

Received on Wednesday, 14 April 2004 14:33:04 UTC