Re: [heycam/webidl] Added a definition of {{StringContext}} extended attribute. (#841) from Krzysztof Kotowicz on 2020-02-14 (public-webapps-github@w3.org from February 2020)

From: Krzysztof Kotowicz <notifications@github.com>
Date: Fri, 14 Feb 2020 10:32:34 -0800
To: heycam/webidl <webidl@noreply.github.com>
Cc: Subscribed <subscribed@noreply.github.com>
Message-ID: <heycam/webidl/pull/841/c586413864@github.com>
> > The move to IDL type mapping is to avoid having type unions and branching on the types in each operation.

> I'm not quite following this, sorry. Can you explain further what you mean here?

To my understading, previous version required us to use: 

```
partial interface mixin HTMLIFrameElement : HTMLElement {
  [TrustedTypes=TrustedHTML] attribute (DOMString or TrustedHTML) srcdoc;
};
```
and then the setter for `srcdoc` would have to branch on the value type. Perhaps I was misunderstanding to which extent the extended attributes are allowed to interfere with the value type, and the type union there is actually unneccessary, and the extended attribute can, in the end, cause the setter to abort early or accept a DOMString.

> Now that I think I understand the setup (though I thought that before too!), it seems like the possible places to put this are:
> 
> 1. Inside DOMString/USVString conversions, and thread through the information those need, like which operation/attribute the conversion is for.  This could happen _before_ or _after_ conversion to an IDL value.  Looks like this PR does it "before".

Correct. IIUC doing this "after" https://heycam.github.io/webidl/#es-DOMString step 2 is undesirable, as we'd lose the JS type information. The logic for the sanitization of values is roughly:

```
if TTNotEnforced:
  return v.toString()

if isCorrectTrustedType(v, context):
  return v.toString();
elseif realm.defaultPolicy:
  v = realm.defaultPolicy.sanitize(v, context) // this might throw.
return v.toString() 
```

so the JS type information needs to be available at the time of the checks (in HTML or TT or CSP).

> 2. At the places where args to operations or setters are converted.  That would correspond to https://heycam.github.io/webidl/#es-overloads step 11.5 and step 15.5, and https://heycam.github.io/webidl/#dfn-attribute-setter step 6.
> 3. After conversion but before invoking the spec-described steps.  That would correspond to https://heycam.github.io/webidl/#dfn-create-operation-function step 2.1.8 (well, after 2.1.5 and before 2.1.8), and the https://heycam.github.io/webidl/#dfn-attribute-setter step 7 (or rather beween steps 6 and 7).
> 4. At the front of the spec-described steps, whether via extended attribute or manually.  Manually seems undesirable.

This seems roughly what the previous iteration did. The downside, apart from the type branching required (we need to preserve the JS type for the sanitization check), was exactly what you point out: 

> When other specs invoke the setter or operation directly, not via the JS reflection, option 4 will lead to the values they pass being sanitized


The fact that the invoking the setter from outside the JS bypasses the sanitizing is actually desirable. For example, that allows the HTML parser algorithm to create the DOM elements without needing to carve out the exception for it (we are guarding e.g. innerHTML setter, but when the string actually gets allowed to be parsed - or is parsed after navigation - arbitrary nodes with arbitrary attributes are allowed). In other words, Trusted Type try to prevent the JS code from interacting with risky APIs, not the browser algorithms.

> Now I have my own opinions of how these various options compare, but I'd love a summary of what you view as the tradeoffs and what led to choosing option 1 with the "before" behavior, especially if this is already written down somewhere. In terms of author-observable behavior, there are three different behaviors here: "option 1 with 'before'" is one, "option 1 with 'after'" and "option 2" are another, I think (in that they behave identically), and "options 3"/"option 4", which also behave identically.

The intended behavior is that the sanitization logic gets access to the JS type used, for the value to skip the sanitization ste). The simplest way I see it to hook that as early as possible, and convert to the IDL type later. 

 - Option 1 with "before" allows us to do that by reusing the DOMString / USVString type. 
 - Option 1 with "after" loses the type, IIUC.
 - Option 2 actually looks OK too. What is exactly the author-observable difference to option 1 + before? 
- Option 3 loses the type, similarly option 4. We can have the type information if we use a type union in the IDL definition of the attribute / operation. 

I don't think there's something fundamentally wrong with using type unions in IDL definitions for our purpose - the annotated type route was taken as a preferred option for simplicity (one place to change things, and upstream specs remain to only see DOMStrings). Perhaps @domenic can weigh in here.

> Note that I did read through the two issues listed in the original PR, but there is a _lot_ of back-and-forth in there and it wasn't clear to me exactly why we landed on this outcome.

Thanks a lot, and sorry for looping you in that late into the discussion. Some of the them were triggered internally, so it's indeed not clear from GitHub alone why certain approaches were taken. I also am definitely not an expert in IDL, and I learn as I go. Thank you for taking your time to review this.

> Just to clarify [...] in this case [`void foo([StringContext] (DOMString or Node) arg)`] the sanitizer should be invoked if the passed-in thing is not a Node, but not invoked if it is a Node?

Good point. I didn't consider this case, but indeed it would be fine for the sanitizer to be skipped *if* the extra type was included in the IDL. In the case of `[StringContext] DOMString arg` a Node should undergo JS->IDL `DOMString` conversion, with sanitization invoked.


-- 
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/heycam/webidl/pull/841#issuecomment-586413864
Received on Friday, 14 February 2020 18:32:47 UTC