- From: A. Vine <andrea.vine@Sun.COM>
- Date: Wed, 14 Apr 2004 11:12:10 -0700
- To: I18n WSTF <public-i18n-ws@w3.org>
Lots of modification should happen to this section, but I am not the right person to make them all. Someone at the F2F (Tex maybe?) had a lot more relevant information on FTP than I do. I'm happy to write it up if you give me the info or a pointer. I am not very happy with my intro - comments, changes, rewrites welcome. I think 4.16.4 "IRIs, URIs, and fun stuff" would be better written by Martin. I could take a shot at the "fun stuff" part ;-} -------------------------------------------------------- 4.16 Transports Web services may use a variety of transport technologies and protocols. Many of these have parameters defined for data identification. These parameters are necessary for proper processing of international data. The specifics of several transport protocols are discussed in this section. 4.16.1 HTTP Accept-Language {Andrea's note: what about Content-Language? And a blurb on Content-Type is added below. Maybe we should call this section "HTTP".} The HyperText Transport Protocol (HTTP) is often used for Web service message transport. HTTP contains some header fields which are useful for identifying sender preferences and capabilities. One of those fields is Accept-Language. Accept-Language takes one or more language identifiers in RFC3066 (or its replacement) format as its parameters. Each language identifier can have a quality value which gives a relative priority. Here is an example: {Andrea's note: the below should be set off in the example format, or indented and in a different font, or something} Accept-Language: zh-cn, fr-ch;q=0.8, fr;q=0.7 The above could be read as "Simplified Chinese is preferred, but Swiss French is acceptable, as are other types of French." There is more information about the handling of Accept-Language in the HTTP 1.1 specification. A Web service requester using HTTP can include an Accept-Language field to indicate the languages preferred. The provider can then take that information and use it to return human-readable data in the appropriate language. {Andrea's note: I put the below paragraph in because I think it bears mentioning} The charset of the data can also be specified as a parameter of the Content-Type header. However, it is better to specify the SOAP document charset inside the document itself, rather than to rely on the transport mechanism to be the sole mechanism for identifying the charset. If the charset specified in HTTP doesn't agree with the charset inside the document, then the receiver must make a decision on how to resolve the problem. 4.16.2 FTP File Transfer Protocol (FTP) is a simple transport mechanism that can be used for Web service documents. The main international consideration in using FTP is to specify the representation type as I (Image), allowing 8-bit values to pass unchanged through the transfer. File names, path names, and character encoding issues may intrude here. 4.16.3 SMTP Simple Mail Transfer Protocol (SMTP) has no particular provisions for international data. SMTP itself is limited to 7-bit data, but can transport 8-bit data. Its main restriction is an 8-bit gateway; that is, encodings such as UTF-16 and UTF-32 may not be successfully transmitted and should be avoided. 4.16.3.1 MIME Tags Multipurpose Internet Mail Extensions (MIME) tags are necessary for a multipart SOAP request, for example, a SOAP message with an attachment. MIME contains a number of headers which may be used for international data. {somewhere we need to reference RFCs 2045-9, maybe as a MIME reference in the reference section} MIME can be useful for identifying the charset of attachments which do not identify their own charset inside the attachment. Examples of such attachments are plain text documents which cannot contain a charset tag and legacy markup documents which do not contain a charset tag by omission. If the attachment contains an internal charset tag, the MIME charset parameter should be omitted to avoid an inadvertent mismatch. MIME can also contain a Content-Language tag. While it is better to indicate the document language inside the document itself, sometimes it isn't possible. For example, if there is an image attachment which contains embedded text, the Content-Language header can provide the language id. See the example in 4.5.2 Character Coding of Attachments. {Andrea's note: add the following line to the attachments example in 4.5.2 in the attachment MIME headers after Content-Type - Content-Language: fr This avoids having to make a separate example which would be very similar.} 4.16.4 IRIs, URIs, and fun stuff {Martin's text here :-) }
Received on Wednesday, 14 April 2004 14:33:04 UTC