
Re: Language Bindings for DOM Specifications and the DOMString type

From: Cameron McCormack <cam@mcc.id.au>
Date: Fri, 3 Aug 2007 10:01:34 +1000
To: public-webapi@w3.org
Message-ID: <20070803000134.GA26197@arc.mcc.id.au>

Hi Chris.

Chris Lilley:
> In the  specification
> Language Bindings for DOM Specifications
> 
> http://dev.w3.org/cvsweb/~checkout~/2006/webapi/Binding4DOM/Overview.html?rev=1.50&content-type=text/html;%20charset=utf-8#referencing
> 
> there is no mention of a string type. Since IDL is deficient in this
> area and since there is a need for careful documentation of what
> exactly is intended (16bit units that are mostly characters, except
> when they are half of a surrogate pair) I suggest that you add a
> section that is completely compatible with Dom Level 3 Core definition
> of a string type:
>
> http://www.w3.org/TR/2004/REC-DOM-Level-3-Core-20040407/core.html#ID-C74D1578

Yeah, I’m not happy with how equivocal the document is about strings at
the moment.  DOM Core defines (optional) strings to be represented by
boxed sequence<unsigned short> valuetypes, so this type has to be a
special case in the language bindings.  The type mappings for both the
ECMAScript and Java bindings have a section (4.1.13 and 5.2.13) that
deals with sequence<unsigned short>, but the IDL section has no
equivalent.  However I did need to mention it in the
descriptions for [NameGetter] and [NameSetter], so obviously it’s not
sufficient to ignore it in the IDL section.

The special casing doesn’t sit right with me: what if you actually
wanted a sequence of unsigned short values and not just a string?

In that link you provide, it says:

  Note: As of August 2000, the OMG IDL specification ([OMG IDL])
  included a wstring type. However, that definition did not meet the
  interoperability criteria of the DOM API since it relied on
  negotiation to decide the width and encoding of a character.

Reading CORBA 3.0’s OMG IDL Syntax and Semantics chapter, I cannot see
any mention of width/encoding negotiation.  What it does say is:

  3.11.1.4 Wide Char Type

  OMG IDL defines a wchar data type that encodes wide characters from
  any character set. As with character data, an implementation is free
  to use any code set internally for encoding wide characters, though,
  again, conversion to another form may be required for transmission.
  The size of wchar is implementation-dependent.

  …

  3.11.3.3 Wstrings

  The wstring data type represents a sequence of wchar, except the wide
  character null.  The type wstring is similar to that of type string,
  except that its element type is wchar instead of char. The actual
  length of a wstring is set at run-time and, if the bounded form is
  used, must be less than or equal to the bound.

I was going to suggest perhaps allowing wstring representations to be
binding language specific, as long as they can represent all possible
Unicode characters, but then there are DOM methods that specifically
operate on sequences of UTF-16 code units (e.g.
CharacterData.insertData).
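
As a rough illustration of why the 16-bit unit matters for such methods, here is a minimal Python sketch. It models a DOMString as a list of 16-bit integers and inserts data at an offset counted in those units. The helper names to_units and insert_data are hypothetical and come from no spec or binding.

```python
def to_units(text):
    """Encode text as a list of UTF-16 code units (16-bit integers)."""
    raw = text.encode("utf-16-le")
    return [int.from_bytes(raw[i:i + 2], "little") for i in range(0, len(raw), 2)]


def insert_data(data, offset, arg):
    """Insert arg at an offset counted in 16-bit units (hypothetical helper)."""
    if offset < 0 or offset > len(data):
        raise IndexError("offset is outside the data")
    return data[:offset] + arg + data[offset:]


text = "a\U0001D11Eb"        # U+1D11E lies outside the Basic Multilingual Plane
clef = to_units(text)        # three characters, but four 16-bit units

# Offset 2 falls between the two halves of the surrogate pair, so the
# result is a legal sequence of unsigned shorts but not well-formed UTF-16.
split = insert_data(clef, 2, to_units("x"))
```

A binding that stored strings in some other encoding would have to translate every such offset, which is the difficulty with leaving the representation up to each language.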

I think it might be OK to further restrict wchars to be 16 bit values,
and for wstrings to be encoded in UTF-16.  But what might be more
troublesome is the explicit restriction that null characters cannot be
included in wstrings.
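
The difference can be sketched in a few lines of Python, assuming wchar is restricted to 16 bits as suggested above. Both function names are hypothetical; they only express the two constraints side by side.

```python
def is_unsigned_short_sequence(values):
    """True if every element fits in an IDL unsigned short (0..65535)."""
    return all(isinstance(v, int) and 0 <= v <= 0xFFFF for v in values)


def is_restricted_wstring(values):
    """True if values fit 16-bit wchars and contain no wide null character."""
    return is_unsigned_short_sequence(values) and 0 not in values


with_null = [0x0061, 0x0000, 0x0062]   # "a", U+0000, "b"

# Acceptable as sequence<unsigned short>, rejected as a wstring.
as_sequence = is_unsigned_short_sequence(with_null)
as_wstring = is_restricted_wstring(with_null)
```

So any string value containing U+0000 that is expressible under the sequence<unsigned short> definition would have no wstring equivalent.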

Another possible solution would be to have an extended attribute that
indicates that the given type is the string type, e.g.:

  [StringType]
  valuetype DOMString sequence<unsigned short>;

and an IDL fragment could be restricted to having only a single type
annotated thus.  Then [NameGetter] and [NameSetter] would have something
better to hook on to.
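
A minimal sketch of how a binding generator might enforce that restriction, written in Python. The TypeDecl model and find_string_type are assumptions made for illustration; no existing tool is implied.

```python
from dataclasses import dataclass, field


@dataclass
class TypeDecl:
    """Hypothetical model of one valuetype declaration in an IDL fragment."""
    name: str
    definition: str
    extended_attributes: set = field(default_factory=set)


def find_string_type(decls):
    """Return the single [StringType] declaration, or None if there is none."""
    annotated = [d for d in decls if "StringType" in d.extended_attributes]
    if len(annotated) > 1:
        names = ", ".join(d.name for d in annotated)
        raise ValueError(f"more than one [StringType] type: {names}")
    return annotated[0] if annotated else None


fragment = [
    TypeDecl("DOMString", "sequence<unsigned short>", {"StringType"}),
    TypeDecl("ShortList", "sequence<unsigned short>"),  # hypothetical plain sequence
]
string_type = find_string_type(fragment)
```

Here ShortList has the same underlying definition as DOMString but is not treated as a string, because the lookup keys on the annotation and not on the type's structure.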

But I’m really not sure what the cleanest way out of the situation is.

> (unless there is some reason a string type is excluded, or unless all
> the requirements of DOM 3 Core are already included by reference and I
> didn't notice).

No, DOM 3 Core isn’t included by reference.  I would imagine that DOM 3
Core second edition would explicitly reference the Bindings spec, or
that the Bindings spec would guide readers on how to interpret
already-published IDL such as in DOM 3 Core.

-- 
Cameron McCormack, http://mcc.id.au/
	xmpp:heycam@jabber.org  ▪  ICQ 26955922  ▪  MSN cam@mcc.id.au
Received on Friday, 3 August 2007 00:01:44 GMT