Re: support for non-ASCII in strings, was: signatures vs sf-date from Julian Reschke on 2022-12-03 (ietf-http-wg@w3.org from October to December 2022)

From: Julian Reschke <julian.reschke@gmx.de>
Date: Sat, 3 Dec 2022 09:37:39 +0100
To: ietf-http-wg@w3.org
Message-ID: <870c907b-6972-5a8d-9ee7-2591cd2c7306@gmx.de>

On 03.12.2022 09:08, Mark Nottingham wrote:
> ...
>> If you're serious about human presentable text not belonging here, why
>> do we add that to "Problem" right now?
>
> Please re-read what I wrote. Discussing them does not obviate what I said.

I still don't get it. You say "text for display to humans should not go
into a field" (right?), but then you co-author a spec which does just that?

>>> There are some cases where non-ASCII strings are needed in header fields; mostly, when you're presenting something to a human from the fields. Those cases are not as common. However, there's a catch to adding them: if full unicode strings were available in the protocol, many designers will understandably use them because it's been drilled into all our heads that unicode is what you use for strings.
>>>
>>> Hence, footgun.
>>
>> I would appreciate if you would explain why there is a problem we need
>> to prevent, and what exactly that problem is. Do you have an example?
>
> As you've pointed out, the scope for this bis document was tightly defined. The onus isn't on me to prove what shouldn't go into it...

I asked about what the footgun is. You replied that this is out of
scope. That's not helpful.

>>> By leaving full unicode support out of the spec and forcing designers to take positive steps to support it, the (relatively small) barrier to adoption makes them stop and think whether they need it. I think that's a good thing. I also know that will make some i18n folks unhappy, and I'm sorry for that; unfortunately we're working in an area where protocol artefacts intended for humans and machines are mixed, and so it gets difficult.
>>
>> I continue to disagree. By not supporting non-ASCII in the base
>> definition, we force people to come up with ad hoc definitions which in
>> general will be worse than a common extension we can define here.
>>
>>> All of that said, once the algorithms are stable (as Julian has pointed out, they contain some errors), I wouldn't object to including the %-encoding text as an appendix in sf-bis with appropriate warnings, if other folks are amenable.
>>
>> That would be a good step in the right direction. I still think we
>> need an on-the-wire signal that the encoding is in place, for the same
>> reasons why we're doing this revision in the first place (tooling
>> support for special-casing integers that happen to represent dates).
>
> I disagree, and you should have brought that up in the scoping discussion.

This suggestion was about a change you just proposed, and which isn't in
the charter either, right?

Anyway: there were several discussions and projects in recent weeks
that made it clear to me that layering things on top or "extending"
Structured Fields is problematic:

1. The discussion related to retrofit and defining a relaxed parser.

2. The discussion about the percent encoding in the "Problem field" spec
(which is very recent)

3. The discussion about what the introduction of sf-date means for
message-signatures.

4. And, in implementations, the question of how to configure the SF parser
(support for retrofit, support for sf-date: are these feature flags?
Can they be combined arbitrarily? Can the parser's default behavior
change without the caller making an explicit choice?).

Based on that, I believe we either need to work on these points now, or
have a credible strategy for updating this spec *and* the specs
depending on it later.

Best regards, Julian

PS: the discussion about non-ASCII goes back four years. There was
(IMHO) only rough consensus not to do it, and I was in the rough (not
alone, for that matter). The fact that HTTPAPI's problem field spec now
defines a workaround (and, FWIW, yet another way to address this issue)
IMHO is a sign that that decision for RFC 8941 was bad, and we should
revisit it.
Received on Saturday, 3 December 2022 08:37:53 UTC