
RE: encoding missing in xml declaration

From: Martin Gudgin <mgudgin@microsoft.com>
Date: Mon, 12 May 2003 01:59:18 -0700
Message-ID: <7C083876C492EB4BAAF6B3AE0732970E0B6DF3E6@red-msg-08.redmond.corp.microsoft.com>
To: "Aman Singh" <haramansingh@hotmail.com>, <xml-dist-app@w3.org>

Moving to xml-dist-app for discussion.

> -----Original Message-----
> From: xmlp-comments-request@w3.org 
> [mailto:xmlp-comments-request@w3.org] On Behalf Of Aman Singh
> Sent: 09 May 2003 18:33
> To: xmlp-comments@w3.org
> Subject: encoding missing in xml declaration
> 
> Dear Sir/Madame:
> 
> In the document SOAP Version 1.2 Part 0: Primer with status 
> of Proposed Recommendation, I noted the following issue.
> 
> In Example 4, the xml declaration is <?xml version='1.0' ?> 
> without any encoding attribute, therefore the value of 
> encoding defaults to utf-8.  

Correct.

> Within the same SOAP message, an element is found with French 
> characters.
> 
> <n:name xmlns:n="http://mycompany.example.com/employees">
>            Åke Jógvan Øyvind
> </n:name>

UTF-8 is able to encode all Unicode characters. 

According to The Unicode Standard Version 3.0, the character codes are as follows:

----------------------------------------
| Glyph | Hex Code | UTF-8 bit pattern |
----------------------------------------
|   Å   |  00C5    | 11000011 10000101 |
----------------------------------------
|   ó   |  00F3    | 11000011 10110011 |
----------------------------------------
|   Ø   |  00D8    | 11000011 10011000 |
----------------------------------------
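
For anyone who wants to check the bit patterns above, here is a minimal Python sketch. The helper name utf8_bits is my own, not anything from the Primer or the Unicode Standard; it simply prints each byte of a character's UTF-8 encoding in binary.

```python
def utf8_bits(ch: str) -> str:
    """Return the UTF-8 encoding of one character as space-separated bytes in binary."""
    return " ".join(f"{byte:08b}" for byte in ch.encode("utf-8"))


# The three non-ASCII characters from the name in Example 4.
for ch in "\u00c5\u00f3\u00d8":
    print(f"U+{ord(ch):04X}  {utf8_bits(ch)}")
```

Each of the three characters takes two bytes in UTF-8, which is why the table shows two octets per row.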

> 
> This is incorrect according to the XML 1.0 Recommendation 
> unless the characters are escaped with the values.  

I do not understand how you draw this conclusion. XML 1.0 only requires that characters be escaped if they cannot be encoded natively in the given encoding. As UTF-8 can encode all of Unicode, no escaping is needed.
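
To illustrate, here is a rough Python sketch using the standard library's xml.etree.ElementTree. It is not taken from the Primer: the two documents are cut-down stand-ins for the element in Example 4, and the variable names are mine. One document carries the characters directly as UTF-8 bytes with no encoding declaration, the other uses numeric character references.

```python
import xml.etree.ElementTree as ET

NS = "http://mycompany.example.com/employees"

# Characters written directly, serialized as UTF-8, no encoding declaration.
raw = (
    "<?xml version='1.0' ?>"
    f"<n:name xmlns:n='{NS}'>\u00c5ke J\u00f3gvan \u00d8yvind</n:name>"
).encode("utf-8")

# Same content with numeric character references, pure ASCII on the wire.
escaped = (
    "<?xml version='1.0' ?>"
    f"<n:name xmlns:n='{NS}'>&#197;ke J&#243;gvan &#216;yvind</n:name>"
).encode("ascii")

raw_text = ET.fromstring(raw).text
escaped_text = ET.fromstring(escaped).text

# Both forms parse, and both yield the same character data.
print(raw_text == escaped_text)
```

Both inputs are well-formed, and the parser reports identical text for the element, so the character references are an alternative spelling rather than a requirement.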

> According 
> to my knowledge, two things could be done at this point by 
> modifying Example 4's text:
> 
> 1.) Add an encoding attribute to the xml declaration <?xml 
> version='1.0' 
> encoding='ISO-8859-1' ?>
> 2.) Change the element to
> <n:name xmlns:n="http://mycompany.example.com/employees">
>           &#197;ke J&#243;gvan &#216;yvind </n:name>
> 
> making it a well formed xml document (due to assumption of 
> encoding="utf-8")

I think the example is fine as is.

Gudge
Received on Monday, 12 May 2003 04:59:28 GMT
