[webidl] Numeric type reform strawperson (#33)

Inspired by [bug 26901](https://www.w3.org/Bugs/Public/show_bug.cgi?id=26901) and @marcoscaceres's #14, I somehow ended up writing a detailed proposal for reforming the numeric types.

# WebIDL Numeric Type Reform Strawperson

## Motivation

The existing WebIDL numeric types are problematic in a number of ways.

First, they create a misleading parallel with numeric types in other languages, where there is actually a proper numeric type system. This encourages people to use them as they would in other languages, creating non-JavaScript-idiomatic APIs. Most JavaScript APIs will want one of a few things: any numeric value at all; any finite numeric value (possibly restricted to integers); or possibly more complicated validation or clamping logic within a given range. WebIDL does provide some facilities for the latter, but only if your range is based on powers of two; ranges like 0 to 100, -180 to +180, or similar are not supported and require prose.

Second, they have the usual WebIDL problem of using the type system for two different purposes: coercions from JavaScript values, as in the cases of parameter lists and dictionaries, and documentation, as in the case of return values and constants. For the documentation cases, the proliferation of numeric types is simply confusing; declaring a return type as an `unsigned long` is meaningless given that it will be exposed as a normal JavaScript number (i.e. double-precision floating point value). It would be better to use a generic `number` type for those purposes, and create types that emphasize the potential coercion or validation strategies for use in those scenarios.

Finally, the spec is bloated with repetitive and spread-out text for performing the different numeric type coercions. The `[EnforceRange]` and `[Clamp]` extended attributes modify the steps so that the types come to have drastically different behaviors when they are applied.

My proposed solution is to try to give tools that are both more aligned with spec-authoring use cases and clearer about what they are accomplishing. The complete functionality of the current types is preserved, and in fact further use cases (such as custom ranges) are enabled.

The proposal consists of removing all existing numeric types, as well as the `[Clamp]` and `[EnforceRange]` extended attributes, and replacing them with:

- A new `number` type, along with an `[EnforceFinite]` extended attribute that can apply to it, for documentation cases and for the most generic floating-point number processing.
- Two new parametrized types, `IntEnforceWithin<x, y>` and `IntClampRound<x, y>`, which are expected to be broadly useful for defining integer inputs that must stay within a specific range.
- Two new parametrized types, `IntMod<x, y>` and `UintMod<x>`, which are expected to be less useful and are mainly kept to enable matching legacy semantics.
- A to-be-determined set of typedefs to make the common cases that appear across specs more convenient.

## Removal

We are trying to reform the following parts of WebIDL. So first, picture a universe without:

### All existing numeric types

- byte
- octet
- short
- unsigned short
- long
- unsigned long
- long long
- unsigned long long
- float
- unrestricted float
- double
- unrestricted double

### The numeric modifier extended attributes

- [EnforceRange]
- [Clamp]

## Additions

Now let's get that functionality back, but better/faster/stronger.

### `number` type and `[EnforceFinite]` extended attribute

The `number` type is the type that corresponds most closely to a JavaScript number. It is to be used:

- By all places that require no coercion, e.g. return types or constants
- For parameters, dictionary entries, etc. that do not need to enforce any particular behavior, but simply want to run `ToNumber`.

The `[EnforceFinite]` extended attribute is used to disallow **NaN**, **+∞**, or **−∞**.

The algorithm for converting an ES value _v_ to a WebIDL `number` is:

1. Let _n_ be ToNumber(_v_).
2. If the conversion to an IDL value is being performed in the presence of an `[EnforceFinite]` extended attribute, then
    1. If _n_ is **NaN**, **+∞**, or **−∞**, throw a **RangeError**.
3. Return _n_.

This can express the following old patterns:

```
double              →  [EnforceFinite] number
unrestricted double →  number
```
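
As a rough illustration, the `number` conversion could be sketched in TypeScript as follows. The function name `convertToNumber` and the `enforceFinite` flag are hypothetical stand-ins for the type and its extended attribute, and `Number(v)` is assumed to be a close enough approximation of the abstract ToNumber operation for this purpose.

```typescript
// Hypothetical helper, not part of any spec: converts an arbitrary
// JavaScript value to the proposed `number` type.
function convertToNumber(v: unknown, enforceFinite: boolean = false): number {
  // Step 1: Number(v) stands in for the abstract ToNumber operation.
  const n = Number(v);
  // Step 2: with [EnforceFinite], reject NaN and both infinities.
  if (enforceFinite && !isFinite(n)) {
    throw new RangeError("value is not a finite number");
  }
  // Step 3: otherwise the number is passed through unchanged.
  return n;
}
```

Under this reading, `convertToNumber(v)` plays the role of `unrestricted double` and `convertToNumber(v, true)` the role of `double`, except that the sketch follows the steps above in throwing a **RangeError**.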

TODO: where are the existing `float` and `unrestricted float` types used? How do we express those? What are their semantics anyway?

### More-useful parametrized coercion types

Several new parametrized types are introduced specifically for use in places that require coercions (parameter lists, dictionary entries, setters). They are expected to be broadly useful. Unlike the previous types, they are not restricted to a predefined set of upper and lower limits. After all, it seems arbitrary to imagine that [-2147483648, 2147483647] is a more useful range than [0, 100] or [0, 360].

#### `IntEnforceWithin<x, y>`

Similar to today's `[EnforceRange]`. The algorithm for converting an ES value _v_ to a WebIDL `IntEnforceWithin<x, y>` is:

1. Let _n_ be ToNumber(_v_).
2. If _n_ is **NaN**, **+∞**, or **−∞**, throw a **RangeError**.
3. Set _n_ to sign(_n_) * floor(abs(_n_)).
4. If _n_ < _x_ or _n_ > _y_, throw a **RangeError**.
5. Return _n_.

This can express the following old patterns:

```
[EnforceRange] byte               →  IntEnforceWithin<-128, 127>
[EnforceRange] octet              →  IntEnforceWithin<0, 255>
[EnforceRange] short              →  IntEnforceWithin<-32768, 32767>
[EnforceRange] unsigned short     →  IntEnforceWithin<0, 65535>
[EnforceRange] long               →  IntEnforceWithin<-2147483648, 2147483647>
[EnforceRange] unsigned long      →  IntEnforceWithin<0, 4294967295>
[EnforceRange] long long          →  IntEnforceWithin<-9007199254740991, 9007199254740991>
[EnforceRange] unsigned long long →  IntEnforceWithin<0, 9007199254740991>
```
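
The conversion steps for this type can be sketched in TypeScript roughly as below. The function name `intEnforceWithin` is hypothetical, the type parameters are passed as ordinary arguments, and `Number(v)` is assumed to approximate ToNumber.

```typescript
// Hypothetical helper: a sketch of the IntEnforceWithin<x, y> conversion.
function intEnforceWithin(v: unknown, x: number, y: number): number {
  // Step 1: Number(v) stands in for the abstract ToNumber operation.
  let n = Number(v);
  // Step 2: NaN and the infinities are rejected outright.
  if (!isFinite(n)) {
    throw new RangeError("value is not a finite number");
  }
  // Step 3: truncate toward zero, i.e. sign(n) * floor(abs(n)).
  n = n < 0 ? Math.ceil(n) : Math.floor(n);
  // Step 4: anything outside the inclusive range [x, y] is an error.
  if (n < x || n > y) {
    throw new RangeError("value is outside the range " + x + " to " + y);
  }
  // Step 5: return the truncated, in-range integer.
  return n;
}
```

For example, `intEnforceWithin(100.9, 0, 100)` truncates to 100 and is accepted, while `intEnforceWithin(101, 0, 100)` throws. A custom range such as 0 to 100 needs no more machinery than the `octet` range of 0 to 255.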

#### `IntClampRound<x, y>`

Similar to today's `[Clamp]`. The algorithm for converting an ES value _v_ to a WebIDL `IntClampRound<x, y>` is:

1. Let _n_ be ToNumber(_v_).
2. Set _n_ to min(max(_n_, _x_), _y_).
3. Round _n_ to the nearest integer, choosing the even integer if it lies halfway between two, and choosing +0 rather than −0.
4. Return _n_.

This can express the following old patterns:

```
[Clamp] byte               →  IntClampRound<-128, 127>
[Clamp] octet              →  IntClampRound<0, 255>
[Clamp] short              →  IntClampRound<-32768, 32767>
[Clamp] unsigned short     →  IntClampRound<0, 65535>
[Clamp] long               →  IntClampRound<-2147483648, 2147483647>
[Clamp] unsigned long      →  IntClampRound<0, 4294967295>
[Clamp] long long          →  IntClampRound<-9007199254740991, 9007199254740991>
[Clamp] unsigned long long →  IntClampRound<0, 9007199254740991>
```
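
A possible TypeScript sketch of the clamping conversion follows. The name `intClampRound` is hypothetical and `Number(v)` is assumed to approximate ToNumber. JavaScript's `Math.round` rounds halves upward, so the round-half-to-even step is written out by hand. The steps as listed do not say what happens to **NaN**, and this sketch simply lets it propagate.

```typescript
// Hypothetical helper: a sketch of the IntClampRound<x, y> conversion.
function intClampRound(v: unknown, x: number, y: number): number {
  // Step 1: Number(v) stands in for the abstract ToNumber operation.
  const n = Number(v);
  // Step 2: clamp into the inclusive range [x, y].
  const clamped = Math.min(Math.max(n, x), y);
  // Step 3: round to nearest, with ties going to the even integer.
  const lower = Math.floor(clamped);
  const diff = clamped - lower;
  let rounded: number;
  if (diff < 0.5) {
    rounded = lower;
  } else if (diff > 0.5) {
    rounded = lower + 1;
  } else {
    // Exactly halfway: pick whichever neighbour is even.
    rounded = lower % 2 === 0 ? lower : lower + 1;
  }
  // Adding +0 turns a -0 result into +0, as step 3 asks.
  return rounded + 0;
}
```

With the `octet` range, `intClampRound(300, 0, 255)` gives 255 and `intClampRound(-5, 0, 255)` gives 0, while `intClampRound(2.5, 0, 255)` and `intClampRound(3.5, 0, 255)` give 2 and 4 respectively.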

### Less-useful parametrized coercion types

Several parametrized coercion types are introduced simply to be able to maintain old semantics. They should probably not be used.

#### `IntMod<x, y>`

Used for emulating today's signed types. The algorithm for converting an ES value _v_ to a WebIDL `IntMod<x, y>` is:

1. Let _n_ be ToNumber(_v_).
2. If _n_ is **NaN**, **+0**, **−0**, **+∞**, or **−∞**, return **+0**.
3. Set _n_ to sign(_n_) * floor(abs(_n_)).
4. Set _n_ to _n_ modulo _x_.
5. If _n_ is ≥ _y_, set _n_ to _n_ - _x_.
6. Return _n_.

This can express the following old patterns:

```
byte      →  IntMod<256, 128>
short     →  IntMod<65536, 32768>
long      →  IntMod<4294967296, 2147483648>
long long →  IntMod<18446744073709552000, 9223372036854776000>
```
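
The wrapping behavior can be sketched in TypeScript as below. The name `intMod` is hypothetical, `Number(v)` is assumed to approximate ToNumber, and "modulo" is read as the mathematical remainder that is never negative, which differs from JavaScript's `%` operator for negative operands. The sketch is only reliable while the intermediate values stay within the safe-integer range, so it does not faithfully cover the `long long` case.

```typescript
// Hypothetical helper: a sketch of the IntMod<x, y> conversion.
function intMod(v: unknown, x: number, y: number): number {
  // Number(v) stands in for the abstract ToNumber operation.
  let n = Number(v);
  // NaN, zero of either sign, and the infinities all become +0.
  if (!isFinite(n) || n === 0) {
    return 0;
  }
  // Truncate toward zero, i.e. sign(n) * floor(abs(n)).
  n = n < 0 ? Math.ceil(n) : Math.floor(n);
  // Mathematical modulo: JavaScript's % keeps the sign of n,
  // so shift negative remainders into the range [0, x).
  n = ((n % x) + x) % x;
  // Values in the upper part of the range wrap around to negatives.
  if (n >= y) {
    n -= x;
  }
  return n;
}
```

For instance, `intMod(200, 256, 128)` wraps to -56 and `intMod(-129, 256, 128)` wraps to 127, matching how a `byte` behaves today.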

#### `UintMod<x>`

Used for emulating today's unsigned types. The algorithm for converting an ES value _v_ to a WebIDL `UintMod<x>` is:

1. Let _n_ be ToNumber(_v_).
2. If _n_ is **NaN**, **+0**, **−0**, **+∞**, or **−∞**, return **+0**.
3. Set _n_ to sign(_n_) * floor(abs(_n_)).
4. Return _n_ modulo _x_.

This can express the following old patterns:

```
octet              →  UintMod<256>
unsigned short     →  UintMod<65536>
unsigned long      →  UintMod<4294967296>
unsigned long long →  UintMod<18446744073709552000>
```

`UintMod<360>` might also be useful for any functions that want to process degrees.
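
A TypeScript sketch of the unsigned conversion, including the degrees case, might look like this. The name `uintMod` is hypothetical, `Number(v)` is assumed to approximate ToNumber, and "modulo" is again read as the mathematical remainder that is never negative. As with any double-based arithmetic, results are only exact while values stay within the safe-integer range.

```typescript
// Hypothetical helper: a sketch of the UintMod<x> conversion.
function uintMod(v: unknown, x: number): number {
  // Number(v) stands in for the abstract ToNumber operation.
  let n = Number(v);
  // NaN, zero of either sign, and the infinities all become +0.
  if (!isFinite(n) || n === 0) {
    return 0;
  }
  // Truncate toward zero, i.e. sign(n) * floor(abs(n)).
  n = n < 0 ? Math.ceil(n) : Math.floor(n);
  // Mathematical modulo: shift JavaScript's possibly negative
  // remainder into the range [0, x).
  return ((n % x) + x) % x;
}
```

Here `uintMod(-1, 256)` yields 255, as an `octet` does today, and `uintMod(-90, 360)` and `uintMod(450, 360)` normalize to 270 and 90 degrees.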

## Making these more convenient

### Typedefs

We should do an audit to find what range coercions are commonly used in web specs, and define typedefs for them. My _hope_ is that e.g. `long long` is used infrequently and mistakenly, so we wouldn't need a typedef of that sort; the spec could define its own, or use the awkward long name anyway. @bzbarsky [says](https://www.w3.org/Bugs/Public/show_bug.cgi?id=26901#c8) that at a cursory glance `byte` is unused. Etc.

We should almost certainly define a simple integer typedef. Unsure which is most idiomatic, but it should go up to 2<sup>53</sup> - 1. We would also likely want one for 0–255.

Names for these typedefs are undecided. They could use the existing names, although I am wary that these names give the mistaken impression of some correlation to a real type system.

### Allowing power notation

Although again I hope that people aren't using the random-power-of-two ranges very often in their parameter coercions, if they are, we could allow e.g. `UintMod<2^64>` or `UintMod<2**64>` to replace `UintMod<18446744073709552000>`.

### Shorter, less-precise names for the more-useful ones?

E.g. instead of `IntEnforceWithin<x, y>` we could do `IntWithin<x, y>` or even `Int<x, y>`. Instead of `IntClampRound<x, y>` we could do `IntClamp<x, y>`. Or we could get really cryptic with e.g. `Int!<x, y>` vs. `Int_<x, y>`.


---
Reply to this email directly or view it on GitHub:
https://github.com/heycam/webidl/issues/33

Received on Friday, 5 December 2014 19:04:16 UTC