- From: Cameron McCormack <cam@mcc.id.au>
- Date: Fri, 3 Aug 2007 10:01:34 +1000
- To: public-webapi@w3.org
Hi Chris.

Chris Lilley:
> In the specification
> Language Bindings for DOM Specifications
>
> http://dev.w3.org/cvsweb/~checkout~/2006/webapi/Binding4DOM/Overview.html?rev=1.50&content-type=text/html;%20charset=utf-8#referencing
>
> there is no mention of a string type. Since IDL is deficient in this
> area and since there is a need for careful documentation of what
> exactly is intended (16bit units that are mostly characters, except
> when they are half of a surrogate pair) I suggest that you add a
> section that is completely compatible with Dom Level 3 Core definition
> of a string type:
>
> http://www.w3.org/TR/2004/REC-DOM-Level-3-Core-20040407/core.html#ID-C74D1578

Yeah, I’m not happy with how equivocal the document is about strings at the moment. DOM Core defines (optional) strings to be represented by boxed sequence<unsigned short> valuetypes, so this type has to be a special case in the language bindings. There are sections in the type mappings for both the ECMAScript and Java bindings (4.1.13 and 5.2.13) that deal with sequence<unsigned short>, but there is no equivalent in the IDL section. However, I did need to mention it in the descriptions for [NameGetter] and [NameSetter], so obviously it’s not sufficient to ignore it in the IDL section.

The special casing doesn’t sit right with me: what if you actually wanted a sequence of unsigned short values and not a string?

In the link you provide, it says:

  Note: As of August 2000, the OMG IDL specification ([OMG IDL])
  included a wstring type. However, that definition did not meet the
  interoperability criteria of the DOM API since it relied on
  negotiation to decide the width and encoding of a character.

Reading CORBA 3.0’s OMG IDL Syntax and Semantics chapter, I cannot see any mention of width/encoding negotiation. What it does say is:

  3.11.1.4 Wide Char Type

  OMG IDL defines a wchar data type that encodes wide characters from
  any character set. As with character data, an implementation is free
  to use any code set internally for encoding wide characters, though,
  again, conversion to another form may be required for transmission.
  The size of wchar is implementation-dependent.

  …

  3.11.3.3 Wstrings

  The wstring data type represents a sequence of wchar, except the
  wide character null. The type wstring is similar to that of type
  string, except that its element type is wchar instead of char. The
  actual length of a wstring is set at run-time and, if the bounded
  form is used, must be less than or equal to the bound.

I was going to suggest allowing the representation of wstring to be binding language specific, as long as it can represent all possible Unicode characters, but there are DOM methods that specifically operate on sequences of UTF-16 code units (e.g. CharacterData.insertData). I think it might be OK to further restrict wchars to be 16 bit values, and for wstrings to be encoded in UTF-16. What might be more troublesome is the explicit restriction that null characters cannot be included in wstrings.

Another possible solution would be to have an extended attribute that indicates that the given type is the string type, e.g.:

  [StringType] valuetype DOMString sequence<unsigned short>;

and an IDL fragment could be restricted to having only a single type annotated thus. Then [NameGetter] and [NameSetter] would have something better to hook on to. But I’m really not sure what the cleanest way out of the situation is.

> (unless there is some reason a string type is excluded, or unless all
> the requirements of DOM 3 Core are already included by reference and I
> didn't notice).

No, DOM 3 Core isn’t included by reference. I would imagine that DOM 3 Core second edition would explicitly reference the Bindings spec, or that the Bindings spec would guide readers on how to interpret already-published IDL such as that in DOM 3 Core.
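As an illustration of the two concerns raised above about a 16-bit string type (offsets that count 16-bit units rather than characters, and the need to carry null characters), here is a minimal Python sketch. It models a DOMString as a plain list of 16-bit values, in the spirit of sequence<unsigned short>. The helper names to_code_units, from_code_units and insert_data are hypothetical and are not taken from any binding or specification; insert_data only mimics the offset behaviour of CharacterData.insertData and raises a Python IndexError where the DOM would raise an exception.

```python
from typing import List


def to_code_units(text: str) -> List[int]:
    """Return the UTF-16 code units of `text` as 16-bit integers.

    Hypothetical helper: models a DOMString as sequence<unsigned short>.
    """
    data = text.encode("utf-16-le", errors="surrogatepass")
    return [int.from_bytes(data[i:i + 2], "little")
            for i in range(0, len(data), 2)]


def from_code_units(units: List[int]) -> str:
    """Decode a list of 16-bit code units back into a Python str."""
    data = b"".join(u.to_bytes(2, "little") for u in units)
    return data.decode("utf-16-le", errors="surrogatepass")


def insert_data(units: List[int], offset: int, arg: List[int]) -> List[int]:
    """Insert `arg` at a code-unit offset, in the manner of insertData.

    Assumption: an out-of-range offset is reported with IndexError here;
    this is only a stand-in for the DOM's own error reporting.
    """
    if offset < 0 or offset > len(units):
        raise IndexError("offset is outside the string")
    return units[:offset] + arg + units[offset:]


# U+1D11E lies outside the BMP, so it occupies two 16-bit units.
clef = to_code_units("a\U0001D11Eb")

# Offset 2 falls between the two halves of the surrogate pair, so the
# insertion separates them: offsets count units, not characters.
split = insert_data(clef, 2, to_code_units("X"))

# A null character is just another 16-bit value in this representation,
# which is what the wstring definition quoted above rules out.
with_null = to_code_units("a\x00b")
```

In this sketch clef holds four units for three characters, and with_null holds a zero in the middle, which a sequence of unsigned short can carry without any special treatment.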
-- Cameron McCormack, http://mcc.id.au/ xmpp:heycam@jabber.org ▪ ICQ 26955922 ▪ MSN cam@mcc.id.au
Received on Friday, 3 August 2007 00:01:44 UTC